1 A Deep Learning Based Automatic Defect Analysis ..., Schemes and Mind Maps of Machine Learning

1 Department of Materials Science and Engineering, University of Wisconsin-Madison,. 7. Madison, Wisconsin, 53706, USA. 8. 2 Oak Ridge National Laboratory, ...

Typology: Schemes and Mind Maps

2022/2023

Uploaded on 05/11/2023

christin
christin 🇺🇸

4.6

(18)

263 documents

1 / 41

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
A Deep Learning Based Automatic Defect Analysis Framework for
1
In-situ TEM Ion Irradiations
2
3
Mingren Shen1, Guanzhao Li1, Dongxia Wu3, Yudai Yaguchi5, Jack Haley4, Kevin G. Field2,6,
4
Dane Morgan1
5
6
1 Department of Materials Science and Engineering, University of Wisconsin-Madison,
7
Madison, Wisconsin, 53706, USA
8
2 Oak Ridge National Laboratory, Oak Ridge, Tennessee, 37830, USA
9
3 Department of Mathematics, University of Wisconsin-Madison, Madison, Wisconsin,
10
53706, USA
11
4 Department of Materials, University of Oxford, Oxford, OX1 3PH, UK
12
5 Department of Computer Sciences, University of WisconsinMadison, Madison,
13
Wisconsin, 53706, USA
14
6 Now at Nuclear Engineering and Radiological Sciences, University of Michigan - Ann
15
Arbor, Michigan, 48109 USA
16
17
18
19
20
21
22
23
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf28
pf29

Partial preview of the text

Download 1 A Deep Learning Based Automatic Defect Analysis ... and more Schemes and Mind Maps Machine Learning in PDF only on Docsity!

1 A Deep Learning Based Automatic Defect Analysis Framework for

2 In-situ TEM Ion Irradiations

Mingren Shen

1

, Guanzhao Li

1

, Dongxia Wu

3

, Yudai Yaguchi

5

, Jack Haley

4

, Kevin G. Field

2 ,

Dane Morgan

1

1

7 Department of Materials Science and Engineering, University of Wisconsin-Madison,

8 Madison, Wisconsin, 53706, USA

2

9 Oak Ridge National Laboratory, Oak Ridge, Tennessee, 37830, USA

3

10 Department of Mathematics, University of Wisconsin-Madison, Madison, Wisconsin,

11 53706, USA

4

12 Department of Materials, University of Oxford, Oxford, OX1 3PH, UK

5

13 Department of Computer Sciences, University of Wisconsin–Madison, Madison,

14 Wisconsin, 53706, USA

6

15 Now at Nuclear Engineering and Radiological Sciences, University of Michigan - Ann

16 Arbor, Michigan, 48109 USA

25 Abstract

26 Videos captured using Transmission Electron Microscopy (TEM) can encode details regarding

27 the morphological and temporal evolution of a material by taking snapshots of the microstructure

28 sequentially. However, manual analysis of such video is tedious, error-prone, unreliable, and

29 prohibitively time-consuming if one wishes to analyze a significant fraction of frames for even

30 videos of modest length. In this work, we developed an automated TEM video analysis system

31 for microstructural features based on the advanced object detection model called YOLO and

32 tested the system on an in-situ ion irradiation TEM video of dislocation loops formed in a

33 FeCrAl alloy. The system provides analysis of features observed in TEM including both static

34 and dynamic properties using the YOLO-based defect detection module coupled to a geometry

35 analysis module and a dynamic tracking module. Results show that the system can achieve

36 human comparable performance with an F1 score of 0.89 for fast, consistent, and scalable frame-

37 level defect analysis. This result is obtained on a real but exceptionally clean and stable data set

38 and more challenging data sets may not achieve this performance. The dynamic tracking also

39 enabled evaluation of individual defect evolution like per defect growth rate at a fidelity never

40 before achieved using common human analysis methods. Our work shows that automatically

41 detecting and tracking interesting microstructures and properties contained in TEM videos is

42 viable and opens new doors for evaluating materials dynamics.

47 Introduction

48 Transmission Electron Microscope (TEM) has widely been used to characterize a

49 material or material system since TEM provides resolution limits at or below common

50 microstructural features of interest. Recently, a surge in the use of in-situ TEM techniques has

51 occurred, partially due to the advent of digital capture devices. In-situ TEM experiments have a

52 distinct advantage over ex-situ experiments as they allow researchers to study materials’ intrinsic

53 properties and responses as external conditions are manipulated such as temperature, pressure,

and type of reaction cells

1

  1. In material science, in-situ TEM is frequently used to shed light on

55 challenging problems like elucidating mechanisms for catalysis, atomic behavior during material

reactions, and nanoscale property changes under loads

1,

57 The value extracted from an in-situ TEM experiment requires careful analysis of the

58 observed processes. For many of these experiments, this analysis includes dynamically detecting

59 features present in the microstructure and analyzing the microstructural evolution in each frame

60 of the experiment, typically captured in a digital video form. For decades, quantification of

61 defects in in-situ TEM data has been completed by humans, which is tedious, time-intensive,

62 error-prone, biased, and impractical to scale. For example, a typical manual workflow of

63 counting defects in TEM images requires an experienced researcher carefully going through

64 every frame for different types of defects and labeling objects in the images one by one. Such

65 manual analysis typically takes many minutes per frame (e.g., in this study, we found it takes

66 about 20 to 60 minutes to process 1 frame, depending on the complexity of the TEM images).

67 Typical in-situ TEM experiments can generate tens to hundreds of frames of video data per

68 minute, so a long video can rapidly become impractical to analyze. Moreover, the labeling

69 quality also depends on the attentiveness of researchers, which may be reduced after spending

70 hours on this repetitive work. Furthermore, other factors such as researchers’ proficiency and

71 personal preference when analyzing TEM images contribute to inaccurate or at least inconsistent

72 labeling. Human interpretations are often required for analyzing TEM images that the same

73 researcher may give different labeling results at different times. The above observations imply

74 that manual counting and analyzing methods are hard to scale and prone to human-based errors.

75 In the future, the demand for better TEM analysis methods will only grow, as recent advances in

76 TEM equipment, e.g., high-speed cameras, and fast microprocessors will keep accelerating the

rate of data acquisition

3

78 Automatic analysis of TEM/STEM data, especially identifying microstructural defects, is

79 a long-standing pursuit of both the academic community and the industrial sector. To

80 automatically analyze defects contained in TEM/STEM images, various methods have been

applied, such as matching key-points in different regions of interest

4

81 , applying different

thresholding values to segment different defects

5,

82 , representing the texture of various targeted

structures by the bag of visual words (BoW)

7

or synthesizing artificial image dataset

8

83 , and using

84 traditional machine learning methods, e.g., k-means clustering, to find defects contained in

TEM/STEM images

3

  1. To the best of our knowledge, these methods are only semi-automatic in

86 that they still require extensive human knowledge and time to apply to a given system, and each

87 new material system requires a significant new investment to find an effective approach.

88 Recently, modern deep learning methods have been applied to solve the defect identification

problem in static TEM/STEM image data

9 – 11

89 , suggesting this is a promising approach that could

90 be reliable and highly flexible across many materials and problems. However, deep learning

91 approaches have not yet been applied to the problem of automatic detection and analysis of

92 defects contained in in-situ TEM video. Imaging-related research around TEM/STEM video

work

9

, computer vision methods like Local Binary Patterns (LBP)

30

115 descriptor were used to

describe local pixel environment, machine learning methods like AdaBoost

31

116 were used to select

the most useful visual features, and shallow Convolutional Neural Networks (CNNs)

32

117 were used

118 to refine the final output to find the defects. This type of methods requires extensive hand-tuning

119 and integration of multiple stages which is unlikely to provide a practical general approach

120 compared to relying on just one deep learning model. The second main type of methods belongs

to the encoder-decoder type of deep learning methods

33

  1. For example, Ziatdinov et al. used a

122 weakly supervised encoder-decoder type neural network system to extract atomic locations and

123 defect types from atomically resolved scanning transmission electron microscopy images to

124 interpret complex atomic and defect transformation of silicon dopants in graphene as a function

of time

26

  1. One special type of encoder-decoder type of methods called U-Net is also popular and

126 widely used for analyzing material images like studying segmentation of nanoparticles in bright

field TEM images

34

and defects in STEM images of steels

11

  1. An encoder-decoder model extracts

128 the most relevant information and builds a useful inner state, which usually requires careful

training

33

  1. The third type of methods relies on using mature object detection methods like Faster

R-CNN

35

, YOLO

16

, Mask R-CNN

36

, and SSD

37

130 etc. For example, Mask R-CNN is used by Chen

et al. to study the microstructure of aluminum alloy

38

131 and Faster R-CNN is utilized by Anderson

et al. to investigate helium bubbles in X-750 alloy under irradiation

27

  1. These methods tend to be

133 quite accurate and fairly easy to train. Given their wide use and appealing properties, in this

134 study, we use one of the third type of methods (YOLO) to study geometrical and temporal

135 changes of loop type defects in the videos of In-situ TEM ion irradiations. YOLO is similar to

136 methods used by others for TEM but unique in its speed to analyze videos in real time.

137 One of the most successful applications of deep learning is computer vision, where the

138 ultimate goal is teaching a computer to do the image-related task(s) like finding which object

139 contained in an image (object detection) and which pixel belongs to different objects (image

segmentation)

39

  1. Since a breakthrough in the ImageNet Large Scale Visual Recognition

141 Challenge (ILSVRC) was made in 2012, the deep Convolutional Neural Network (CNN) based

approach has demonstrated its success in many image-related tasks

39

  1. For the object detection

143 problem (trying to find the location and category of all the objects contained in an image), which

144 is also the focus of this project, there are two general categories of methods: two-stage methods

and one-stage methods

39

. For a two-stage method like Faster R-CNN

35

145 , the object detector will

146 first propose some candidate bounding boxes containing the object location information and then

147 classify the category of those candidate bounding boxes. One-stage methods like YOLO will

output the object location and category at the same time

16

  1. Typically, two-stage methods are

149 slower but more accurate than one-stage methods. YOLO (which is an acronym for You Only

150 Look Once), is one of the most widely used one-stage methods and offers speed, accuracy, and

fast engineering application potentials

39

  1. The key ideas of YOLO are dividing the whole image

152 (or video frames) into grids and predicting the location and the category of the potential

bounding boxes with a set of pre-defined anchor boxes in each cell of the grid

16

  1. YOLO keeps

154 improving its design and implementation details over time and during the writing of this draft,

YOLOv

40

155 and YOLOv5 (https://github.com/ultralytics/yolov5) were developed. However, we

will use YOLOv3 as this is still the most widely used and recognized version

16,41– 43

157 In this paper, we focus on the specific task of adapting the deep learning based YOLOv

158 model into an automated framework for analyzing in-situ TEM video data. Specifically, we

159 focus on the problem of detection and analysis of radiation-induced dislocation loops generated

condition used where g =011 and the deviation parameter, s g

183 , close to zero, only a fraction of

184 a/2〈 111 〉 and a〈 100 〉 loops are visible in TEM. To enable direct comparison to the previous

human-based analysis where multiple g - vectors were used for detection and analysis

47,

185 , we

186 applied a fractional visibility constant 7/4 to make YOLO detection results comparable with

published results using other g conditions

44

  1. Defect size was estimated by assuming the defects

188 are elliptical and the defect size is the length of major axes of the ellipse. The video image size is

189 1344 pixel x 962 pixel with 2.6884 pixel/nm conversion factor which gives the physical size as

190 500.0 nm x 357.8 nm. The video consisted of 1175 frames which were linearly related to dpa and

191 time through Equation 1 which means each frame corresponds to 1.75 seconds.

8 × 10

− 4

[(𝐹𝑟𝑎𝑚𝑒 𝑁𝑢𝑚𝑏𝑒𝑟) × 1. 6466 ]

= 0. 8534 + (𝐹𝑟𝑎𝑚𝑒 𝑁𝑢𝑚𝑏𝑒𝑟) × 0. 00140

193 Equation 1

194 The video data was acquired via frame-by-frame image registration to eliminate sample

195 drift and relevant camera movement. Since there is no landmark frame or feature that can be

196 used to align the video across the whole irradiation dose range, the video was divided into

197 smaller batches for primary image registration, and the final sets from the previous batch were

used to carry over the alignment

15,

  1. The alignment was done by frame registration based on the

selected landmark frame with a template matching and slice alignment plugin

50

  1. The

200 dissemination of data and codes for this paper is described in the Data and Code Dissemination

201 section.

202 We opted not to use any previous data labeling and decided to label the data ourselves for

203 this project to establish the ground truth data. This choice was because we did not have the exact

pixel positions of each defect in the previous study by Haley, et al

44

  1. We followed the labeling

process that has been used in other studies

9

  1. The ground truth data was labeled by two trained

206 researchers and they checked each other’s labeling and explanations for 3 frames before labeling

the real data via an open-source software called ImageJ

51

  1. Their labeling will be treated as

208 ground truth in this study.

209 The YOLOv3 model was adopted from an open GitHub repository

210 (https://github.com/qqwweee/keras-yolo3). To train the model, we first converted the pretrained

211 darknet53 weight via COCO dataset into Keras format and then modified the final class number

212 to our defect number, which was one class in our case since we treated all a/2〈 111 〉 and a〈 100 〉

213 loops as the same type of defect. This single class approach was necessary as Burgers vector

214 determination is not possible using only the g =011 condition in the video. We then applied the

215 transfer learning technique to fine-tune the model by freezing the first 245 layers of YOLOv

and training the last 7 layers

52

  1. The in-situ ion irradiation TEM video data in this study was

217 composed of 1176 frames and 21 frames were selected and labeled. The sampling was done at

218 random except an effort was made to assure that the sampled frames were approximately

219 uniformly distributed throughout the full set. Among the sampled 21 frames, 15 frames were

220 used for training (trained on 12 frames and validated on 3 frames) and 6 frames were used for

221 testing, where these 6 frames were not seen by the YOLO model during training. The model was

222 trained on GeForce GTX 1080 for 18300 epoch and the learning rate of Adam optimizer was

switched between 10

  • 4

and 10

  • 5

223 with batch size equal to 4 and Non-Max Suppression (NMS) IoU

224 equals 0.45 to find the optimal weights. Real-time data augmentation operations, e.g., left-right

225 flip, changing hue, saturation, lightness, were applied for the training dataset to enrich the dataset

226 and enhance the performance of the CNN for variations in defect contrast, size, and morphology

242 Figure 1. Selected images from the test dataset for various damage doses (e.g. time scale). Subfigure (a), (b),

243 and (c) are the ground truth labeling developed by two researchers, while subfigure (d), (e), and (f) are

244 labeled by the automated machine learning program. Here, (a) and (d) are for frame number 120 , (b) and (e)

245 are for frame number 472 , and (c) and (d) are for frame number 824. 1 frame increment equates to about

246 0.00140 dpa, see Equation 1. F1 score compares the machine detection results with human labeling of each

247 column separately.

250 Figure 2. The visualized comparison of the human labeling results (blue boxes) to the YOLO detector results

251 (yellow boxes) for frame number 824.

253 The initial qualitative comparison was encouraging so additional quantitative analysis

254 was conducted. The statistics of the model performance were examined based on the metrics of

255 precision and recall and their harmonic mean which is also called the F1 score. The precision,

256 recall, and F1 score were generated using the six test images that were never used in the training

257 process. The test was iterated with different cut-off Intersection-over-Union (cut-off IoU) values

258 as shown in Figure 3. Here and elsewhere in the paper the IoU refers to the ratio of the area of

259 overlap (intersection) to the combined areas (union) of predicted and ground truth bounding

260 boxes. The cut-off IoU refers to the threshold above which a predicted bounding box is

261 considered as a candidate match for a ground truth bounding box. Predicted bounding box

262 matches are assigned by building a matrix of all IoU values between all predicted and ground

288 Figure 3 .The performance of the YOLO detector with different cut-off IoU thresholds.

290 The developed YOLO model was run on each frame of the in-situ TEM video to extract

291 geometry information of each visible defect for the duration of the experiment. After obtaining

292 the geometry and position of each defect per frame, we used this information to extract defect

293 properties, such as median size and number density. Such properties of materials are widely used

294 in the nuclear materials field from which the dataset originated and provide insights into the

295 interplay between the imparted damage and the change in microstructure. We picked four typical

296 frames and compared the machine learning prediction results with ground-truth labeling. Those 4

297 frames were not used in training and testing the machine learning model. We first compared

298 defect density. Defect density is important for many materials properties and for nuclear

299 materials as it is strongly correlated to mechanical properties, e.g., through the dispersed barrier

model

46,

  1. Loop density comparisons between machine learning results and our labeling results

301 are summarized in Figure 4. The densities of defects per frame were determined via machine

302 learning (ML) method and manual labeling for ground truth by dividing the total number of the

303 loops by the volume of the sample for each frame. The sample was treated as a rectangle bulk

with dimensions of 416.6 x 264 x 75 nm

3

304 and both results were corrected based on the loop

305 invisibility for the given imaging condition.

308 Figure 4. Loop number density from the whole TEM video. The plot compares the loop number density

309 obtained from the ground truth labeling done by experts in this study and the result obtained from the

310 machine learning detector. All the data shown on the plot uses the corrected proposed density, which is 7/4 of

335 background were found by applying a thresholding method, specifically Otsu's binarization and

336 Distance Transformations. To remove noise, we use a morphological opening operation with a

337 3x3 kernel. We followed the official tutorial from OpenCV, and more details can be found

there

58

  1. Watershed found boundaries of defects and backgrounds, but the boundaries were not

339 very smooth. OpenCV’s fitEllipse() function was called to fit the needed defects and the

340 major axis length of the fitted ellipse was defined as the defect size. Detailed fitting results with a

341 cropped region of interests are provided in the Data and Code Dissemination section. The

342 machine learning results of the defect size distributions were compared to ground truth labeling

343 in Figure 5. Although differences were observed in defect median size of each frame in Figure 5 ,

344 investigation indicated that these differences did not exceed 13 .0% difference in median size,

345 and the average difference is only 5.5% and the standard deviation of difference is 5.3% across

346 all doses investigated. The exact formula used to calculate these statistics is given in the

347 Supplemental Information (SI) Section 1. These results indicated that a well-trained machine

348 learning based model could be used for loop detection and analysis and achieve human-like

349 performance comparable or better than the large differences that can be expected by manual

labelling

9

  1. The boxplot comparison provided in Figure 5 showed the viability of the machine

351 learning results. At the same time, it needs to be emphasized again that the true strength of such a

352 technique lies in its ability to detect defect information for every frame quickly and accurately

353 instead of just focusing on a small subset of frames.

356 Figure 5. Box plot comparing the distribution of median size for two methods the ground truth labeling done

357 by experts in this study, and the result from machine learning detector. All distributions are separately

358 analyzed and compared by their irradiation condition, which is 1.0, 1.5, 2.0, and 2.5 dpa.

360 Figure 6 shows the size distribution for the entire duration of the in-situ ion irradiation

361 TEM experiment where, for each frame, the blue line represents the median of loop size, the top

362 of the gray boundary indicates the third quartile of loop size distribution and the bottom of the

363 gray boundary indicates the first quartile of loop size distribution. With the YOLO-based

364 machine learning detector, we could extract data generated in every frame and investigate the

365 material properties with hundreds of times more data than previously collected by hand for this

366 data set (the data collected by hand are the red points shown for four different typical dpa values

367 where these four points are not seen in training data, with red lines connecting them as a guide to

368 the eye). The large amount of analyzed data makes subtle trends easy to identify. For example,

369 although there are some noises, a clear trend can be seen in Figure 6 that the median size, Q1,