DEEP LEARNING TECHNIQUES FOR MAGNETIC FLUX LEAKAGE INSPECTION WITH UNCERTAINTY QUANTIFICATION

By

Zi Li

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Electrical Engineering - Master of Science

2019

ABSTRACT

DEEP LEARNING TECHNIQUES FOR MAGNETIC FLUX LEAKAGE INSPECTION WITH UNCERTAINTY QUANTIFICATION

By

Zi Li

Magnetic flux leakage (MFL), one of the most popular electromagnetic nondestructive evaluation (NDE) methods, is a crucial inspection technique for pipeline safety and the prevention of long-term failures. A central problem in MFL inspection is to detect and characterize defects in terms of shape and size. In industry, the amount of collected MFL data is quite large, and convolutional neural networks (CNNs), one of the main deep learning approaches to image classification problems, are well suited to performing the classification. In solving the inverse problem of characterizing metal-loss defects, the collected MFL signals are represented by three-axis measurements in the form of three matrices, which have the same structure as images. Therefore, this M.S. thesis proposes a novel CNN model, fed with simulated MFL signals, to estimate the size and shape of defects. Comparative results show that the proposed model is robust to distortion and variance in the input MFL signals and can be applied to other NDE classification problems with high accuracy. In addition, the prediction results are correlated with, and affected by, the systematic and random uncertainties in the MFL inspection process. The proposed CNN is therefore combined with a Bayesian inference method to analyze the final classification results and to estimate the uncertainty of defect identification in MFL inspection. The influences of data and model variation on aleatoric and epistemic uncertainties are addressed in this work. Further, the relationship between classification accuracy and the uncertainties is described, which provides useful hints for further research in MFL inspection.

ACKNOWLEDGMENTS

During my M.S. program, I met many people who helped and encouraged me. First, I would like to thank my advisor, Dr. Yiming Deng, who gave me a great opportunity to do research in his group as a master's student. I am very grateful for his encouragement, inspiration, and knowledge throughout my entire master's program. I would also like to thank my committee members, Dr. Mi Zhang and Dr. Lalita Udpa, for their constructive guidance and valuable feedback. I also appreciate all the members of our Nondestructive Evaluation Laboratory, who provided a great deal of technical support and many suggestions during the experiments. Finally, special thanks to my friends and my lovely family for their unconditional support and encouragement. Thank you!

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
Chapter 1: Introduction
    1.1 Introduction
    1.2 Motivation
    1.3 Contribution
Chapter 2: Theory
    2.1 Magnetic Flux Leakage Theory
        2.1.1 Principle of Magnetic Flux Leakage Detection
        2.1.2 Defect Inversion Methods from MFL Signals
    2.2 Machine Learning, Deep Learning, and Neural Network
        2.2.1 Machine Learning and Deep Learning
        2.2.2 Neural Network for Deep Learning
        2.2.3 Convolutional Neural Network
    2.3 Uncertainty Quantification
        2.3.1 Probabilistic Modelling and Variational Inference
        2.3.2 Dropout as Approximating Variational Inference
        2.3.3 Source of Uncertainties
Chapter 3: Magnetic Flux Leakage Simulation
    3.1 Finite Element Modeling
    3.2 Simulation Environment
    3.3 Simulation Parameter
Chapter 4: Convolutional Neural Network in NDE
    4.1 Proposed CNN Model
    4.2 Validation of the Proposed CNN in Other NDE Applications
        4.2.1 Concrete Crack Detection
        4.2.2 Surface Defect Detection
        4.2.3 Defect Detection on Eddy Current Testing
    4.3 CNN Classification Result in MFL
    4.4 Comparison with Other Machine Learning Methods
        4.4.1 Support Vector Machine
        4.4.2 Decision Tree
        4.4.3 Comparison Results
Chapter 5: Uncertainty Estimation in MFL NDE
    5.1 Aleatoric Uncertainty and Epistemic Uncertainty in CNN
    5.2 Uncertainty Estimation on MFL
        5.2.1 Uncertainty Estimation in the Proposed CNN on MFL
        5.2.2 Uncertainty Estimation Result on MFL
CONCLUSIONS
FUTURE WORK
BIBLIOGRAPHY

LIST OF TABLES

Table 3.1 MFL simulation defect parameters
Table 4.1 Comparison result in concrete crack data
Table 4.2 Classification accuracy for MFL signals
Table 4.3 Network comparison result in MFL
Table 5.1 Comparison of accuracy, averages of total aleatoric and epistemic uncertainties
Table 5.2 Comparison of aleatoric and epistemic uncertainties of each shape

LIST OF FIGURES

Figure 2.1 Surface plot of the amplitude for the magnetic flux density
Figure 2.2 The flow diagram of the entire NDE UQ system
Figure 2.3 The diagram of NDE uncertainties
Figure 3.1 3D model geometry of MFL inspection in ANSYS
Figure 3.2 3-D profiles of each shaped defect
Figure 3.3 Axial component of each shaped defect (L, W, D = 5 mm)
Figure 4.1 The proposed CNN architecture
Figure 4.2 Concrete image with crack (left) and without crack (right)
Figure 4.3 NEU surface defect sample image
Figure 4.4 Comparison model accuracy in reference work [117]
Figure 4.5 Model accuracy of the proposed network
Figure 4.6 One initial ECT sample image (left) and its sparse component with ROIs (right)
Figure 4.7 Comparison model accuracy in reference work [80]
Figure 4.8 Example model accuracy and model loss of the proposed network
Figure 4.9 Magnetic fields corresponding to defects at different locations
Figure 4.10 Noise influence on different defect classification tasks
Figure 4.11 Location influence on different defect classification tasks
Figure 5.1 Epistemic and aleatoric uncertainty in MFL size classification tasks
Figure 5.2 Epistemic and aleatoric uncertainty in MFL shape classification task
Figure 5.3 Aleatoric and epistemic uncertainty computed on the MFL signal with different percentages of noise
Figure 5.4 Aleatoric and epistemic uncertainty and average uncertainties computed on each shaped defect under different data sizes

Chapter 1: Introduction

1.1 Introduction

Nondestructive evaluation (NDE) methods are widely applied techniques for ensuring that structural and mechanical components function well in a safe and reliable manner. NDE techniques allow for a thorough evaluation of engineering components and structures without the need for deconstruction or damage [1]. Specifically, probing mechanisms are applied in NDE testing to identify material properties and to reveal anomalies based on variations in the physical properties of the material. Several electromagnetic NDE techniques have shown great advances in the evaluation of metallic components in the oil, gas, nuclear, energy, and petrochemical industries [2], including the magnetic flux leakage (MFL) method [3], the pulsed magnetic flux leakage (PMFL) method [4], the eddy current (EC) method [5], the pulsed eddy current (PEC) method [6], etc.

In modern industry, important materials such as petrochemicals, oil, and gas are transported through millions of miles of pipelines. Pipelines are the most economical and widely installed components in subsea and underground infrastructure. Inevitable attacks from external and internal corrosion, cracking, and manufacturing flaws affect transportation safety; it is therefore necessary to locate defects in a pipeline at regular intervals before they become a cause for concern.

The MFL technique has been one of the most popular electromagnetic NDE methods since the 1960s for detecting metal-loss defects caused by corrosion, fatigue, erosion, and abrasive wear in ferromagnetic oil and gas pipelines [7-9]. The capability and application of MFL have undergone tremendous improvement, and over 80% of pipeline inspection relies on the MFL technique [3], while the rest relies on ultrasonic inspection techniques [10], eddy current inspection techniques [11], and some combinational techniques [12, 13]. The MFL inspection tool consists of a permanent magnet to magnetize the pipe wall and a series of Hall sensors around the circumference of the probe to detect leakage flux where there is corrosion or material loss [14]. In MFL-based pipe inspection and NDE systems, a magnetic circuit is formed between the part and the probe to induce the magnetic field.
After the material is magnetically saturated, if there is no defect, most magnetic flux lines pass through the inside of the ferromagnetic material. Otherwise, some of the three-dimensional magnetic flux leaks out of the pipe wall: since the magnetic permeability of the defect area is much smaller than that of the ferromagnetic material itself, the magnetic resistance increases in the defect area and forms a distorted magnetic field region. The leaking signals are then acquired by the magnetic detector for further identification and characterization of damaged areas [15].

The central problem in MFL analysis is to reconstruct practical cracks from the measured signals. Traditionally, a defect is characterized by its primary parameters [16]: length, width, and percentage wall loss (%WL), which are obtained from the measured three-axis MFL signals in terms of flux intensity. Moreover, for defects with irregular and complex shapes, profiling is necessary for a good estimation of pipeline severity [17]. In general, accurate identification of defect shape and size in MFL inspection is of great importance to pipeline safety.

1.2 Motivation

Usually, the process of identifying the characteristics of metal-loss defects in transmission pipelines from MFL signals is referred to as the inversion problem. Solutions to the defect inversion problem are normally classified as either non-model-based direct methods or model-based iterative methods [18]. The model-based methods employ a physical model as the forward model to simulate the measured signals and continuously update the defect parameters during inversion. The numerical computation involved can provide higher-confidence defect profile reconstruction; however, it is computationally expensive. In contrast, a direct mapping method produces a rough approximation of the defect parameters by establishing a relationship between the signal and the geometry of the defect [19]. Such a model is fast and less complex. In pipeline inspection, thousands of groups of MFL signals are collected, and iterative methods would take a long time to optimize the model; direct mapping methods, such as neural networks, are therefore more suitable for processing massive amounts of data. Moreover, in this thesis work, defect identification is cast as a profile classification problem concerning defect shape and size based on MFL measurements. As a result, a direct mapping method with good performance on large-scale data classification is needed.

The convolutional neural network (CNN), a key element of modern deep learning, has shown great advances in extracting features from large amounts of data and has been successfully adopted in image and object classification tasks [20, 21]. A previous study applied CNNs to identify injurious and non-injurious defects from MFL images with high accuracy [22]. Although the inputs in my work are signals, they consist of three matrices corresponding to the three-axis components of the MFL measurements, which have a form identical to images. Therefore, it is quite promising to address the MFL defect classification problem with CNNs.

The uncertainties existing in the inspection process affect prediction capability; measurements of uncertainty are therefore critical to assess the reliability of the result.
Measurement errors can be systematic or random, and they reflect the effects of these factors on the uncertainty of the results [23]. The problems that uncertainty quantification (UQ) addresses derive from probability theory [24], dynamical systems [25], and numerical simulations [26], while the methods used usually rely on statistics, machine learning [27], and function approximation [28]. In NDE inspection, measurement results are quite sensitive to environmental conditions as well as to the signal processing methods [29]. Therefore, quantified uncertainty estimation in NDE is indispensable.

1.3 Contribution

This M.S. thesis work focuses on the problem of defect shape and size identification for ferromagnetic pipe inspection, and a novel CNN model is proposed to classify defects directly from simulated MFL signals. The well-trained network can efficiently and automatically learn defect features from the MFL signals, which could provide information on areas to undergo further inspection. The proposed model is further applied to other NDE-related classification problems, and its performance on simulated MFL signals is compared with the conventional machine learning methods support vector machine and decision tree. The comparison results show that the proposed method is robust to distortion and variance in the input MFL signals and is versatile in other classification tasks with high accuracy. Furthermore, a Bayesian inference method is incorporated into the proposed convolutional neural network to help analyze the final classification results with uncertainty estimates. The uncertainties in the physical model, as well as in the applied classification model, are clarified for this MFL defect identification task. The relationship between variation in the data and model and the resulting uncertainties is addressed in my work, and the classification accuracy is shown to be related to the uncertainty.

Chapter 2: Theory

2.1 Magnetic Flux Leakage Theory

The detection principle of MFL is as follows: when a ferromagnetic material is magnetized close to saturation under an applied magnetic field and there is a defect area in the material, the local magnetic permeability becomes smaller and the magnetic resistance increases; the magnetic field in that region is therefore distorted, and leakage flux arises. The flux lines that pass out of the ferromagnetic material are detected by magnetically sensitive sensors as electrical leakage signals [30-32]. Once the magnetic flux leakage is detected, it is easy to verify the occurrence of a defect. Moreover, MFL signals provide valuable information for determining the existence and characteristics of metal-loss defects.

2.1.1 Principle of Magnetic Flux Leakage Detection

When there is a defect in the pipeline, a defect leakage field is generated (Figure 2.1a). The information contained in the measurement of the magnetic flux density has been well studied, so that the status of the defect can be determined [15, 33, 34]. The leakage signals are split into three vector distributions, $B_x$, $B_y$, and $B_z$, which represent the axial, tangential (circumferential), and radial components of the magnetic flux density field, respectively (Figure 2.1b-d). The horizontal x- and y-axes represent the length and width of the defect; the vertical axis is the intensity of the magnetic induction.
The surface plot of the axial component has one positive peak and two negative peaks, while the tangential component always has two positive peaks and two negative peaks, divided along the defect width direction from the center. The surface plot of the radial component has one positive peak and one negative peak, and the midpoint of the peak-to-peak separation is at the defect center.

Figure 2.1 Surface plot of the amplitude for the magnetic flux density

The applied permanent-magnet excitation ensures that all the processes involved are static, so the magnetostatic Maxwell's equations describe this problem appropriately [35]:

$$\nabla \times \mathbf{H} = \mathbf{J}, \qquad \nabla \cdot \mathbf{B} = 0$$

where $\mathbf{H}$ is the magnetic field intensity vector and $\mathbf{B}$ represents the magnetic flux density. The relationship between the magnetic field intensity and the magnetic flux density is

$$\mathbf{B} = \mu \mathbf{H}$$

where $\mu$ is the magnetic permeability. In the current-free region around the defect, $\mathbf{H}$ can be expressed as the gradient of a magnetic scalar potential $U$:

$$\mathbf{H} = -\nabla U$$

Combining the equations above and assuming the region is homogeneous and isotropic yields Laplace's equation:

$$\nabla^2 U = 0$$

In practical calculation, a direct solution of the above electromagnetic model is quite difficult, so a numerical technique, the finite element method (FEM), is applied to compute the distribution of magnetic flux density for the system. FEM discretizes the computed region into a finite number of elements and solves the corresponding variational problem. In this thesis work, the MFL data are generated with the FEM simulation software ANSYS, which is introduced and discussed in Chapter 3.

2.1.2 Defect Inversion Methods from MFL Signals

As mentioned in Chapter 1, both model-based and non-model-based methods have been developed to solve the MFL inversion problem. Model-based methods make accurate inversions by applying a forward model to solve the well-behaved forward problem iteratively. They start with an initial estimate of the defect and use an additional iterative inverse algorithm to update the defect profile by minimizing the error between the predicted and original profiles [19]. The finite element method (FEM) [36, 37], analytical models [38], and neural networks [14, 39] are generally used as the forward models. In previous work, a novel iterative method was proposed that combines a parallel radial wavelet basis function network with a finite-element neural network to accomplish the forward and iterative backward algorithms, respectively [40]. Space mapping (SM) is another optimization method that can provide satisfactory results in an iterative manner following an FEM forward model [41]. This SM-based algorithm has shown good results in estimating crack parameters from FEM-simulated MFL signals [42]. The forward training and backpropagation parameter-updating scheme in a model-based network can provide higher-confidence defect profiles, but it is computationally expensive. Moreover, if there is no prior knowledge of the estimated shape, a large number of parameters must be determined. In contrast, non-model-based approaches, typically neural networks [43, 44], have shown great advances on this defect inversion problem. The procedure establishes a functional relationship between the signal and the geometry of the defect from a large amount of training data. Although only a rough approximation of the defect parameters can be obtained, these models are fast and the networks are highly efficient [19].
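To make the contrast concrete, the following is a minimal sketch of the model-based iterative scheme, assuming a toy analytic stand-in for the physics; the function forward_model, its (length, width, depth) parametrization, and the Nelder-Mead optimizer are illustrative choices, not the methods of the cited works.

```python
import numpy as np
from scipy.optimize import minimize

def forward_model(params):
    # Hypothetical stand-in for the physics forward model: maps defect
    # parameters (length, width, depth) to a simulated MFL signal. A real
    # implementation would call an FEM solver such as ANSYS.
    length, width, depth = params
    x = np.linspace(-1.0, 1.0, 81)
    return depth * np.exp(-(x / (0.1 * length)) ** 2) * (1.0 + 0.05 * width)

# A pretend field measurement produced by an unknown defect.
measured = forward_model([5.2, 4.8, 7.9])

def misfit(params):
    # Error between predicted and measured signals, minimized iteratively.
    return np.sum((forward_model(params) - measured) ** 2)

# Start from an initial defect estimate and update it iteratively.
result = minimize(misfit, x0=[4.0, 4.0, 5.0], method="Nelder-Mead")
print("recovered (L, W, D):", result.x)
```

Each iteration of the optimizer plays the role of one forward-simulation-plus-update cycle, which is exactly where the computational expense of model-based inversion comes from.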
Some novel function-approximation methods, such as the radial-basis function neural network (RBFNN), the wavelet-basis function neural network (WBFNN) [14, 45], the finite element neural network (FENN) [46], genetic algorithms [47], and the support vector machine (SVM) [48], have been applied to establish the mapping from the signal space to the defect space. Like other traditional neural networks, convolutional neural networks use several groups of learnable parameters and effectively extract input features for further classification and recognition. Although the MFL signal inputs are not actual images, CNNs can still extract the required effective features; after training, the relationship between the input MFL signals and the corresponding defect size and shape can be well established. The results are fully presented and discussed in Chapter 4.

2.2 Machine Learning, Deep Learning, and Neural Network

2.2.1 Machine Learning and Deep Learning

With the increasing amount of data available in high-performance computing and storage centers, machine learning (ML) [49] is the study of computer algorithms capable of learning to improve their performance on a task based on previous experience with massive data. Given sample data, ML algorithms use statistical methods to provide high-level information that aids decision-making processes without being programmed specifically. The field is closely related to pattern recognition and statistical inference. ML technology consists of supervised learning, unsupervised learning, and reinforcement learning [50]. Supervised learning is implemented in classification or regression tasks; in other words, it is a task-driven method. The model learns from labeled data that provide the features the model must learn. Therefore, supervised learning is best suited to problems with prior ground-truth knowledge or available reference points, and typical methods include maximum entropy [51], classification and regression trees [52], support vector machines [52, 53], and wavelet analysis [54]. Unsupervised learning is a data-driven task in which machine learning models learn from unlabeled data without any human intervention; it is used to reveal patterns in ecological data and includes self-organizing maps [55] and Hopfield neural networks [56]. Reinforcement learning refers to goal-oriented algorithms that learn a sequence of successful decisions by trial and error to find the best solution, such as on-policy SARSA [57] and off-policy Q-learning [58].

Deep learning (DL) is a specific technique for implementing machine learning based on artificial neural networks; DL can automatically discover the features to be used for classification, whereas ML requires these features to be provided manually. DL techniques model hierarchical representations of data using deep networks of supervised or unsupervised learning algorithms, and the multiple processing layers in the models can learn better abstract representations of the data [59]. DL works in a continuously iterative manner to adjust the model parameters until a stopping condition is met. In recent years, deep learning has performed excellently not only in academic communities, in areas such as image recognition and restoration [60, 61], speech recognition [62], and natural language processing [63, 64], but has also gained traction in industry products such as Google's translator and image search, Apple's Siri, and systems from companies such as Facebook and IBM [59].
2.2.2 Neural Network for Deep Learning

A neural network (NN) is a biologically inspired network of artificial neurons configured to perform specific tasks. Neurons in NNs apply mathematical functions to the given inputs and produce an output; the output of each neuron is computed by some non-linear function of the sum of its inputs. A collection of neurons is called a layer, and each layer produces a sequence of activations. There are three kinds of layers in a typical neural network: the input layer (fed with inputs), the output layer (producing the processed result), and the hidden layers (processing the data from preceding layers). The learning target is to find suitable weights and connections for the neurons that make the NN realize the desired behavior. This process requires long chains of computational stages, each of which transforms the aggregate activation of the network, often in a non-linear manner [65]. The successive layers in deep learning enable the network to accurately learn deeper intermediate feature representations of the input and thus provide a more reliable network. The iterative learning process gives the network robustness to noise in the data and superior classification ability on unseen inputs. The learning process is described as follows: the initial values of each neuron are multiplied by weights and summed with all other values entering the same neuron. The initial prediction results are then compared with the expected label values, and the loss between them is calculated. A backpropagation stage then propagates this loss to update every parameter, aiming to reduce the total loss of the neural network. The parameters are updated each time new inputs arrive, and the whole iterative process is repeated until all the cases have been fed into the network or a better model is obtained; a minimal sketch of one such training loop is given at the end of this section.

Neural networks perform well in a broad spectrum of data-intensive applications, such as target recognition, medical diagnosis, and voice recognition, which are accompanied by some typical neural network architectures implemented in deep learning techniques: feed-forward neural networks, the multi-layer perceptron (MLP), recurrent neural networks (RNNs), and convolutional neural networks (CNNs). In feed-forward neural networks, multiple layers of computational neurons are interconnected in a feed-forward way; they have been widely applied in chemistry, for example in modeling the secondary molecular structure of proteins and DNA [66]. The MLP is a class of feed-forward neural network with two or more trainable weight layers (consisting of perceptrons). Combined with several decision classifiers, MLPs have been applied to recognition and pose estimation of 3D objects as well as handwritten digit recognition [67]. In an RNN, each node in a given layer is connected with a directed connection to nodes in the next successive layer. Because a recurrent neural network considers previous inputs when predicting, it can retain information for a short period of time; it therefore shows great advances in speech recognition [68], time-series prediction [69], speech synthesis [70], and other language modeling areas. The CNN comprises several convolutional and subsampling layers, optionally followed by fully connected layers. Apart from image recognition, CNNs have been successful in identifying faces [71], objects [72], and traffic signs, and in powering vision in self-driving cars [73].
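The training loop sketched below illustrates the forward-loss-backpropagation cycle described above on a tiny fully connected network; the layer sizes, learning rate, and toy data are arbitrary choices made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 10))                          # 64 samples, 10 features
y = (X.sum(axis=1) > 0).astype(float).reshape(-1, 1)   # toy binary labels

W1, b1 = rng.normal(scale=0.1, size=(10, 16)), np.zeros(16)
W2, b2 = rng.normal(scale=0.1, size=(16, 1)), np.zeros(1)
lr = 0.1

for step in range(200):
    # Forward pass: weighted sums followed by non-linear activations.
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))           # sigmoid output
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

    # Backward pass: propagate the loss back to every parameter.
    dz2 = (p - y) / len(X)
    dW2, db2 = h.T @ dz2, dz2.sum(axis=0)
    dh = dz2 @ W2.T * (1 - h ** 2)
    dW1, db1 = X.T @ dh, dh.sum(axis=0)

    # Update each parameter to reduce the total loss.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final loss:", loss)
```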
2.2.3 Convolutional Neural Network

Huge computational cost is a typical drawback of traditional neural networks, owing to the matrix multiplication operations involving massive numbers of parameters. CNNs tackle this problem by introducing convolutions, which exploit the local spatial coherence of the input and extract relevant information at a lower computational cost. The typical operations in a CNN structure are introduced as follows (a minimal numerical sketch appears at the end of this section):

a) Convolutional step: a filter matrix with learnable kernels is convolved with the original input matrix to obtain the feature map. Convolution preserves the spatial relationship between pixels by learning image features over small squares of input data. By increasing the number of filters, more features can be extracted, so that the network performs better at recognizing patterns in, or classifying, unseen matrices or images.

b) Activation function: it determines the outputs of the neural network by mapping the resulting values into a certain range. As neural networks are used to implement complex functions, the sigmoid [74], hyperbolic tangent (Tanh) [75], and rectified linear unit (ReLU) [71] are the commonly used non-linear activation functions. Both sigmoid and Tanh are saturating non-linear functions whose output gradient decreases toward zero as the input grows. Unlike these two, ReLU is a non-saturating function: the output is 0 if the input is negative, and otherwise the input itself is returned. It can be written as $f(x) = \max(0, x)$. Previous works show that ReLU has become the default activation function for many types of neural networks, as it greatly shortens the convergence period and improves classification performance in deep neural network applications [76, 77].

c) Pooling: spatial pooling (also called subsampling or downsampling) summarizes feature responses across neighboring pixels. It reduces the feature dimension so that the computational load shrinks, which also controls the overfitting problem. Moreover, it helps to retain the most important information.

d) Dropout: some units are chosen randomly to be abandoned during a particular forward or backward pass. During training, interdependent learning among neurons occurs, which leads to overfitting. Dropout is a typical regularization method that addresses this by forcing the neural network to learn more robust features that remain useful in conjunction with many different random subsets of the other neurons.

Receptive fields, local connectivity, and shared weights are three structural characteristics of CNNs, which make the network robust to shift, scale, and distortion variance of the input data, as well as to noise. Normally, a CNN is applied as a standard neural network with some novel structures to tackle specific problems in various areas. H. Nam et al. used a pre-trained CNN to obtain generic target representations from a set of labeled videos and applied it to visual tracking tasks [78]. Later, an automatic brain tumor segmentation method was proposed that applied a novel two-pathway architecture to model both local details and global context, with two CNNs stacked to model local label dependencies; even though the label distribution is unbalanced, it effectively solves the medical segmentation problem [79]. In the NDE literature, P. Zhu et al. developed a CNN classification model with a weighted loss function for eddy current testing defect detection with high classification accuracy [80]. A novel CNN model with ReLU activation functions was employed to classify MFL response segments [81]. The CNN model applied in this work is presented and explained in Chapter 4.
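The four operations above can be demonstrated numerically in a few lines. The sketch below applies one 3x3 convolution, ReLU, 2x2 max-pooling, and 25% inverted dropout to a toy 6x6 input; the edge-detecting kernel and the input values are chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
image = rng.normal(size=(6, 6))        # toy single-channel input
kernel = np.array([[1., 0., -1.],      # a simple edge-detecting filter
                   [1., 0., -1.],
                   [1., 0., -1.]])

# Convolutional step: slide the 3x3 filter over the input (valid padding).
fmap = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        fmap[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)

# Activation: ReLU keeps positive responses and zeroes out the rest.
fmap = np.maximum(fmap, 0.0)

# Pooling: keep the largest response in each non-overlapping 2x2 window.
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))

# Dropout: randomly silence 25% of the units during training, rescaling
# the survivors (inverted dropout) so the expected activation is unchanged.
mask = rng.random(pooled.shape) > 0.25
dropped = pooled * mask / 0.75

print(pooled.shape, dropped)
```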
2.3 Uncertainty Quantification

Quantification of uncertainties in models and measurements requires identifying the sources of error that lead to the uncertainties. The forward problem involves the use of a calibrated model for probabilistic prediction, e.g., classification and segmentation, which has been widely used in the computer vision, medical imaging, and NDE fields [82-85]. This section reviews probabilistic modeling and variational inference, which are the foundations of the derivations in the Bayesian method, and clarifies the uncertainty sources in the NDE area.

2.3.1 Probabilistic Modelling and Variational Inference

Uncertainty quantification tries to determine how likely certain outcomes are when some aspects of the system are not exactly known. In Bayesian theory, the posterior distribution is usually used to describe this relationship. Let $\mathbf{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ be the training inputs and $\mathbf{Y} = \{\mathbf{y}_1, \ldots, \mathbf{y}_N\}$ the corresponding one-hot encoded categorical outputs, where $N$ is the sample size, $d$ denotes the input variable dimension, and $C$ is the number of categories. In Bayesian probabilistic modeling, the posterior is computed over the weights $\mathbf{W}$, the set of uncertain model parameter vectors, given the data. The corresponding posterior distribution can be expressed as:

$$p(\mathbf{W} \mid \mathbf{X}, \mathbf{Y}) = \frac{p(\mathbf{Y} \mid \mathbf{X}, \mathbf{W})\, p(\mathbf{W})}{p(\mathbf{Y} \mid \mathbf{X})}$$

This distribution describes the most likely function parameters given the available information. Afterwards, the predictive distribution for a new input point $\mathbf{x}^*$ and a new output point $\mathbf{y}^*$ can be derived as:

$$p(\mathbf{y}^* \mid \mathbf{x}^*, \mathbf{X}, \mathbf{Y}) = \int p(\mathbf{y}^* \mid \mathbf{x}^*, \mathbf{W})\, p(\mathbf{W} \mid \mathbf{X}, \mathbf{Y})\, d\mathbf{W}$$

As the posterior distribution is usually hard to evaluate analytically, Radford M. Neal investigated Hamiltonian Monte Carlo, a Markov chain Monte Carlo (MCMC) sampling approach using Hamiltonian dynamics, to approximate the posterior distribution of a Bayesian neural network [86]. The result is a set of posterior samples obtained without direct calculation, but the procedure is computationally complicated. Alternatively, the variational inference method transforms standard Bayesian learning from an integration problem into an optimization problem. A tractable approximating variational distribution $q_\theta(\mathbf{W})$, indexed by a variational parameter $\theta$, is fitted to the posterior distribution obtained from the original model [87, 88]. The closeness of the variational distribution to the posterior distribution is measured by the Kullback-Leibler (KL) divergence, defined as

$$\mathrm{KL}\big(q_\theta(\mathbf{W}) \,\|\, p(\mathbf{W} \mid \mathbf{X}, \mathbf{Y})\big) = \int q_\theta(\mathbf{W}) \log \frac{q_\theta(\mathbf{W})}{p(\mathbf{W} \mid \mathbf{X}, \mathbf{Y})}\, d\mathbf{W}$$

which is minimized to find the optimal parameters. Minimizing the Kullback-Leibler divergence between $q_\theta(\mathbf{W})$ and $p(\mathbf{W} \mid \mathbf{X}, \mathbf{Y})$ is equivalent to maximizing the log evidence lower bound with respect to the variational parameters $\theta$. This is known as variational inference, a general method applied in Bayesian modeling.
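Written out explicitly, the log evidence lower bound (ELBO) referred to above is

$$\mathcal{L}_{\mathrm{VI}}(\theta) = \int q_\theta(\mathbf{W})\, \log p(\mathbf{Y} \mid \mathbf{X}, \mathbf{W})\, d\mathbf{W} - \mathrm{KL}\big(q_\theta(\mathbf{W}) \,\|\, p(\mathbf{W})\big)$$

Since $\log p(\mathbf{Y} \mid \mathbf{X}) = \mathcal{L}_{\mathrm{VI}}(\theta) + \mathrm{KL}\big(q_\theta(\mathbf{W}) \,\|\, p(\mathbf{W} \mid \mathbf{X}, \mathbf{Y})\big)$ and the left-hand side does not depend on $\theta$, maximizing the bound is the same as minimizing the divergence to the true posterior.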
2.3.2 Dropout as Approximating Variational Inference

Uncertainty prediction in neural networks is accomplished by introducing Bayesian inference methods for training recurrent neural networks [89] and convolutional networks [90, 91]. Several studies have demonstrated that dropout [92] and Gaussian dropout [93], applied before the weight layers, can be used as approximating variational inference schemes in deep Gaussian processes marginalized over their covariance function parameters [94]. Gal and Ghahramani implemented dropout training in CNNs as a Bayesian approximation and developed approximate variational inference in Bayesian NNs using Bernoulli approximating variational distributions, relating this to dropout training [95]. In a neural network, the dropout objective is normally optimized together with an L2 regularization term (weight decay), resulting in the minimization objective:

$$\mathcal{L}_{\text{dropout}} = \frac{1}{N} \sum_{i=1}^{N} E(\mathbf{y}_i, \hat{\mathbf{y}}_i) + \lambda \sum_{i=1}^{L} \big( \|\mathbf{W}_i\|_2^2 + \|\mathbf{b}_i\|_2^2 \big)$$

where $\hat{\mathbf{y}}_i$ is the output of the network with $L$ layers and $E(\cdot,\cdot)$ represents the loss function, such as the softmax loss or the Euclidean (squared) loss, with weight matrices $\mathbf{W}_i$ and bias vectors $\mathbf{b}_i$. In a Bayesian neural network with dropout, the tractable approximating variational distribution for every layer $i$ is defined as:

$$\mathbf{W}_i = \mathbf{M}_i \cdot \mathrm{diag}\big([z_{i,j}]_{j=1}^{K_i}\big), \qquad z_{i,j} \sim \mathrm{Bernoulli}(p_i)$$

Here the $z_{i,j}$ are Bernoulli-distributed random variables with probabilities $p_i$, and the $\mathbf{M}_i$ are the variational parameters to be optimized. As the variational inference objective defined in the last section cannot be evaluated analytically for this approximating distribution, an unbiased Monte Carlo estimator is used instead:

$$\hat{\mathcal{L}}_{\mathrm{MC}} = -\frac{1}{N} \sum_{i=1}^{N} \log p(\mathbf{y}_i \mid \mathbf{x}_i, \hat{\mathbf{W}}_i) + \mathrm{KL}\big(q_\theta(\mathbf{W}) \,\|\, p(\mathbf{W})\big), \qquad \hat{\mathbf{W}}_i \sim q_\theta(\mathbf{W})$$

The softmax loss is applied to normalize the network predictions, which are interpreted as probabilities, and sampling from $q_\theta(\mathbf{W})$ is identical to performing the dropout operation in the network. The KL term above can be approximated as an L2 regularizer over the variational parameters with a prior length scale, as derived in previous work [96]. With Monte Carlo integration, the approximate predictive posterior distribution can be rewritten as

$$p(\mathbf{y}^* \mid \mathbf{x}^*, \mathbf{X}, \mathbf{Y}) \approx \frac{1}{T} \sum_{t=1}^{T} p\big(\mathbf{y}^* \mid \mathbf{x}^*, \hat{\mathbf{W}}_t\big), \qquad \hat{\mathbf{W}}_t \sim q_\theta(\mathbf{W})$$

Therefore, $q_\theta(\mathbf{W})$ has been proven to be an approximation to $p(\mathbf{W} \mid \mathbf{X}, \mathbf{Y})$, and dropout is consequently an approximating variational inference scheme in Bayesian NNs.
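In practice, this predictive distribution is estimated by keeping dropout active at test time and averaging T stochastic forward passes. A minimal Keras sketch follows; the toy two-layer classifier merely stands in for a real dropout-equipped network, and T = 50 is an arbitrary choice.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Any classifier containing Dropout layers works; a toy model stands in here.
model = tf.keras.Sequential([
    layers.Dense(32, activation="relu", input_shape=(10,)),
    layers.Dropout(0.25),
    layers.Dense(3, activation="softmax"),
])

def mc_dropout_predict(model, x, T=50):
    # training=True keeps dropout active at prediction time, so each forward
    # pass samples a different thinned network, i.e., a draw from q(W).
    probs = np.stack([model(x, training=True).numpy() for _ in range(T)])
    return probs.mean(axis=0), probs.var(axis=0)

x = np.random.normal(size=(5, 10)).astype("float32")
mean, var = mc_dropout_predict(model, x)
print(mean.shape, var.shape)   # (5, 3): predictive mean and variance
```

The mean over passes approximates the predictive posterior, and the spread across passes is what the later chapters read off as uncertainty.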
2.3.3 Source of Uncertainties

In general, the sources of uncertainty in the context of modeling can be categorized, based on the character of the uncertainties, as:

Aleatoric uncertainty: the intrinsic randomness of a phenomenon.

Epistemic uncertainty: the reducible uncertainty caused by lack of knowledge.

However, in most cases it is difficult to determine the uncertainty category in a general way; the categorization depends on the specific context and application [23]. In the Bayesian methods applied to neural networks, epistemic uncertainty is modeled by placing a prior distribution over the model weights given some data, while aleatoric uncertainty is modeled by placing a distribution over the output of the model. Epistemic uncertainty, often referred to as model uncertainty, accounts for uncertainty in the parameters of the machine learning model and can be reduced given enough data. Aleatoric uncertainty, on the other hand, captures noise inherent in the data, such as sensor noise or motion noise, and cannot be reduced even if more data were collected. Normally, aleatoric uncertainty can be categorized into homoscedastic uncertainty, which assumes identical observation noise for every input, and heteroscedastic uncertainty, where the extent of the noise differs for each input [97].

Generally, in the NDE area, the data are generated by the physics model based on the selected defect parameters, material properties, etc., and then passed through the machine learning model to obtain the output. The final predicted outputs are collected to estimate the uncertainties introduced by the data and by the machine learning model. The flow diagram of the whole process is shown in Figure 2.2.

Figure 2.2 The flow diagram of the entire NDE UQ system

In this thesis work, the uncertainty quantification for MFL is based on the Bayesian inference method. According to the previous definitions, the uncertainties can be divided into data-related and machine-learning-model-related uncertainties; the former can be understood as aleatoric uncertainty and the latter as epistemic uncertainty. In this case, the data are generated from the physical model, so the inherent noise in the data is considered to come from the physics model. In the Bayesian approach, only the uncertainties directly related to the data and the model are taken into consideration; uncertainty quantification specific to the physical model needs further investigation. To be specific, the sources of aleatoric uncertainty are the physics properties, the data-producing method, and noise:

Physics properties: piping material properties such as grain size, fracture toughness, chemical composition, yield strength, and ultimate tensile strength; loading/pressure, e.g., operating pressure; geometry such as outer diameter, thickness, and defect shape, size, and location.

Data-producing method: the data can be collected from real field testing/experiments or from simulation platforms such as ANSYS, ANSOFT, and COMSOL. Even with the same experiment settings, results from different software may vary. In this work, ANSYS is adopted to generate the MFL data.

Noise: experimental devices introduce various kinds of measurement noise that contaminate the measured signals, i.e., sensor lift-off variation noise, since the distance between the pipe wall and the detector sensors varies throughout the detection process due to surface discontinuity and vibration of the detector [98]; seamless pipe noise, which contributes a helical variation in the grain properties of seamless pipe [99]; and system noise, the inherent noise of the on-board electronics [100]. During the experiment, these noises can be modeled as additive white Gaussian noise in the data, which represents most of the high-frequency noise [101].

The sources of epistemic uncertainty are related to the model structure and the hyperparameters:

Model structure: different choices of NN bring uncertainties to the whole process. In this thesis work, only a CNN is implemented as the machine learning model.

Hyperparameters: uncertainties arise from the various parameters and functions in the model; in a CNN, for example, differences in the number of convolutional layers and in the kinds of activation and loss functions introduce uncertainty.

Figure 2.3 The diagram of NDE uncertainties

Figure 2.3 shows the diagram of the NDE uncertainties and their sources for this study. According to the analysis above, the uncertainties come from multiple sources. In this MFL defect classification work, variations in defect shape, size, location, noise, and data amount are considered the main sources, and their influences on defect detection are explicitly investigated in Chapter 5.

Chapter 3: Magnetic Flux Leakage Simulation

3.1 Finite Element Modeling

The three-dimensional (3-D) finite element method (FEM) is a widely adopted approach for analyzing and modeling accurate 3-D defects and detailed MFL signals [102]. 3-D FEM has been used as a general discretization technique for many physical problems in various engineering fields, such as structural analysis, heat transfer, fluid flow, and electromagnetic potential. In structural simulation, FEM helps to predict the deformation of a structure and provides stiffness and strength visualizations [103]. R. W. Lewis et al. applied adaptive finite element analysis (FEA) with an error estimation technique to heat conduction and obtained satisfactory results for non-linear transient heat diffusion problems and steady incompressible flow problems [104].
FEM also provides an effective solution to the 3-D electromagnetic forward-modeling problem in the frequency domain using vector and scalar potentials on unstructured grids [105]. The physical interpretation of FEM is the subdivision of the mathematical model into disjoint (non-overlapping) components of simple geometry called finite elements. Degrees of freedom are used to represent the response of each element and are characterized by the values of an unknown function, or functions, at a set of nodes [106]. In general, the number of degrees of freedom equals the product of the number of nodes and the number of field variables, and possibly their derivatives, that must be calculated at each node. The analytical solution on each element is converted into boundary value problems for differential-algebraic equations on the selected elements. All elements are then assembled to form a discrete model, a system of equations that approximates the original mathematical model. Variational methods are used to approximate the solution by minimizing an associated error function.

In 3-D finite element computation of MFL numerical models, the geometry is constructed with the specified material properties and boundary conditions. It is then discretized into small regions, which generate the equations to be solved. As discussed in Chapter 2, the electromagnetic phenomena in MFL with permanent-magnet excitation simplify to a magnetostatic problem based on

$$\mathbf{B} = \nabla \times \mathbf{A}$$

where $\mathbf{A}$ is the magnetic vector potential, $\mathbf{B}$ is the magnetic flux density vector, and $\mu$ is the spatial permeability [107]. In the finite element model, the equations can be expressed as [108]:

$$\mathbf{K}\mathbf{A} = \mathbf{S}$$

where $\mathbf{K}$ is the global stiffness matrix, $\mathbf{A}$ is the unknown column vector of magnetic vector potentials, and $\mathbf{S}$ is the column vector of the excitation source. With proper boundary conditions, the magnetic vector potential can be solved from this formula, and the distribution of magnetic flux density is then obtained (a one-dimensional illustration of the assembled system is given at the end of Section 3.2). As FEM is usually used in MFL signal analysis to correlate MFL signals with the defect geometry parameters, in this work the ANSYS finite element software is used to obtain the three-dimensional MFL signals.

3.2 Simulation Environment

The 3-D model defines the simulation geometry in ANSYS, shown in Figure 3.1. The defects are located in the center area of the specimen, which is 400 mm long, 200 mm wide, and 10 mm thick. The depth of the defect is set to 5 mm, 8 mm, or 10 mm; in other words, the flaw depth is 50%, 80%, or 100% of the sample thickness. The yoke, magnets, brushes, and specimen compose the whole MFL 3-D finite element simulation model. The permanent magnets provide the magnetomotive force that drives the magnetic circuit and also act as the load. The brushes act as transmitters of magnetic flux from the tool into the piping material, and strategically placed tri-axial Hall-effect sensor heads can accurately measure the 3-D MFL vector fields. In ANSYS, the chosen permanent magnets are translated into equivalent currents applied to every element and node of the model [102]. The material of the magnets is NdFe30, the yoke and brushes are nickel, and the specimen is made of iron. Huang et al. proved that the MFL peak-to-peak value is inversely proportional to the lift-off value [9]. In order to obtain a precise result in the experiment, a 20 mm x 20 mm measurement area is selected around the center of the specimen with 1 mm lift-off to receive the output MFL signals.

Figure 3.1 3D model geometry of MFL inspection in ANSYS
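As a concrete illustration of the assembled system $\mathbf{K}\mathbf{A} = \mathbf{S}$ from Section 3.1, the sketch below builds and solves a one-dimensional Laplace analogue with linear elements; the element count, unit domain, and Dirichlet boundary values are arbitrary, and a real MFL model assembles the same structure in 3-D inside ANSYS.

```python
import numpy as np

n = 10                      # number of linear elements on the unit domain
h = 1.0 / n                 # element length
K = np.zeros((n + 1, n + 1))
ke = (1.0 / h) * np.array([[1.0, -1.0],      # element stiffness matrix of a
                           [-1.0, 1.0]])     # 1-D Laplace-type problem

for e in range(n):          # assemble element contributions into global K
    K[e:e+2, e:e+2] += ke

S = np.zeros(n + 1)
# Dirichlet boundary conditions: potential fixed at both ends of the domain.
K[0, :], K[0, 0], S[0] = 0.0, 1.0, 1.0       # A(0) = 1
K[-1, :], K[-1, -1], S[-1] = 0.0, 1.0, 0.0   # A(1) = 0

A = np.linalg.solve(K, S)   # nodal solution: linear drop from 1 to 0
print(A)
```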
3.3 Simulation Parameter

There are many types of actual defects in pipelines, and their geometry may be arbitrary and complex; this thesis takes three typical defect types into the simulation: cylindrical (Cy), cubical (Cu), and a novel shape (C). Specifically, shape C is a half-cylinder, constructed by cutting a horizontal cylinder with an incision parallel to the side of the cylinder. The metal-loss volume has been shown to be strongly related to the MFL signal: the leakage flux increases as the volume of the defect increases [109]. As these three defect shapes are of similar volume, the corresponding MFL signals do not vary greatly, and it is therefore feasible to classify the defect shape. The defect volumes are calculated as follows:

$$V_{Cy} = \frac{\pi}{4} LWD, \qquad V_{Cu} = LWD, \qquad V_{C} = \frac{1}{2} V_{Cy} = \frac{\pi}{8} LWD$$

where $L$, $W$, and $D$ denote the length, width, and depth of the defect, respectively. Figure 3.2(a-c) shows the 3-D profiles of each defect, and the corresponding heat maps of the axial component are shown in Figure 3.3(a-c).

Figure 3.2 3-D profiles of each shaped defect

Figure 3.3 Axial component of each shaped defect (L, W, D = 5 mm)

For each defect shape, different lengths, widths, and depths are assigned and combined to enrich the dataset, which is applied in the CNN to classify the defect shape and size. The specific defect parameters are listed in Table 3.1: there are 27 sizes of defect for each shape, so in total 81 kinds of defects are simulated to achieve a balanced dataset. In one simulation, the acquired axial, tangential, and radial component signals are referred to as one group of MFL signals, and there are three groups of MFL signals for every defect. In total, 243 groups of MFL signals are simulated; 170 groups are used as training data, and the others are test data in the defect classification tasks.

Table 3.1 MFL simulation defect parameters

Shape  | Cy, Cu, C
Length | 5 mm, 8 mm, 10 mm
Width  | 5 mm, 8 mm, 10 mm
Depth  | 5 mm, 8 mm, 10 mm

Chapter 4: Convolutional Neural Network in NDE

4.1 Proposed CNN Model

In general, the convolutional neural network is considered a hierarchical feature extractor, which extracts features at different levels of abstraction and maps the input image or matrix into a feature vector through several fully connected layers. The overall architecture and detailed settings of the proposed network are illustrated in Figure 4.1. All convolutional filter kernel elements are trained from the data in a supervised fashion by learning from the labeled MFL data.

Figure 4.1 The proposed CNN architecture

In the proposed architecture, four convolutional layers are employed, each activated by ReLU, which enables the network to extract the important features of the input. The first two convolutional layers use 32 kernels to obtain the corresponding feature maps of the input matrix. The number of kernels is selected according to the principle of keeping the total number of activations (number of feature maps times number of pixel positions) non-decreasing from one layer to the next. The max-pooling layer takes the largest element from each feature map within a 2x2 window, which helps to reduce the dimensionality of the feature maps. These two connected convolutions with a pooling layer act as feature extractors on the input and, in this case, produce 32 feature maps. Then 25% of the neurons are dropped out, which increases the validation accuracy and decreases the loss before the trend starts to go down. The preceding layers are employed again to deepen the network, so that more detailed features of the input image or matrix can be extracted. As the output of the previous layers represents high-level features of the input, a fully connected layer is added to combine the features. Finally, the softmax activation function is applied to classify the inputs into the various classes. Softmax is widely applied in multiclass classification methods; it takes a vector of arbitrary real-valued scores and squashes it to a vector of values that sum to one [110, 111]. In this way, the output probability distribution over the predicted classes can be specified. All parameters are jointly optimized by minimizing the misclassification error over the training process. The entire experiment was implemented on the Google Cloud high-performance computing platform, with 8 virtual CPUs and an NVIDIA Tesla K80 as computing resources. The project is based on the Keras and TensorFlow neural network libraries; a sketch of the architecture in Keras is given below. The performance of the proposed CNN is validated on some other NDE applications and on the simulated MFL detection data, as shown in the following two sections.
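The sketch below reproduces the structure described above in Keras. The 3x3 kernel size, the "same" padding, the 64 filters in the second block, the 128-unit dense layer, and the 81x81x3 input (reading each 6561-point component matrix as an 81x81 map) are assumptions; the thesis fixes only the four ReLU convolutions with 32 kernels in the first two, 2x2 max-pooling, 25% dropout, a fully connected layer, and the softmax output.

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    # Block 1: two 32-kernel convolutions, pooling, and 25% dropout.
    layers.Conv2D(32, (3, 3), activation="relu", padding="same",
                  input_shape=(81, 81, 3)),
    layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),
    # Block 2: the same pattern repeated to deepen the network.
    layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
    layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),
    # Fully connected head with a softmax over the classes.
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(3, activation="softmax"),   # e.g., shapes Cu / Cy / C
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```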
4.2 Validation of the Proposed CNN in Other NDE Applications

In order to validate the generality and robustness of the proposed network, three different NDE-related datasets are tested with the proposed CNN, and some of the results are compared with previously published work. The comparison results show that the proposed CNN is effective in solving different defect detection and recognition problems in the NDE area.

4.2.1 Concrete Crack Detection

In transportation infrastructure maintenance, automatic detection of pavement cracks is an important task for assuring driving safety. The objective of the crack detection problem is to determine whether a specific pixel in a pavement image can be classified and grouped as a crack. Zhang et al. used a supervised deep convolutional neural network to detect cracks in each image patch and compared the classification performance with two conventional machine learning methods, the support vector machine and the boosting method; the results showed that, compared with CNNs, the SVM and boosting methods cannot correctly distinguish cracks from the background [112]. Inspired by this, my proposed CNN is trained to classify image patches from open-source concrete images: 458 high-resolution concrete surface images with various cracks were collected from various Middle East Technical University (METU) campus buildings [113]. Following the sampling method proposed in [112], 40,000 annotated RGB images of 227 x 227 pixels are generated and divided into two classes, negative and positive crack images. The numbers of crack and non-crack patches are set equal in this dataset. Figure 4.2 shows sample images with and without a crack.

Figure 4.2 Concrete image with crack (left) and without crack (right)

Note that because the number of background patches in an image is far larger than the number of crack patches, the accuracy calculated by the CNN may overestimate the probability of a crack. Therefore, the precision (P), recall (R), and F1 score are applied as performance criteria, defined in terms of true positives (TP), false positives (FP), and false negatives (FN) as follows (a computational sketch follows Table 4.1):

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F1 = \frac{2PR}{P + R}$$

Table 4.1 shows the performance of the proposed CNN on this concrete crack classification task. It can be seen that the CNN can learn deep features from the concrete crack images, and the cracks can be distinguished from the backgrounds with high accuracy.

Table 4.1 Comparison result in Concrete Crack Data

Method       | Precision | Recall | F1 Score
Proposed CNN |           |        |
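A minimal sketch of these criteria using scikit-learn, with toy labels standing in for the real patch predictions:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# y_true / y_pred: 1 = crack patch, 0 = background patch (toy values).
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]

print("P :", precision_score(y_true, y_pred))   # TP / (TP + FP)
print("R :", recall_score(y_true, y_pred))      # TP / (TP + FN)
print("F1:", f1_score(y_true, y_pred))          # 2PR / (P + R)
```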
4.2.2 Surface Defect Detection

Inspection of steel surfaces is an important research area [114, 115], and a surface defect database constructed by Northeastern University (NEU) has been applied to feature extraction methods for defect recognition [116-118]. This database collects six kinds of typical surface defects of hot-rolled steel strip, i.e., rolled-in scale (RS), patches (Pa), crazing (Cr), pitted surface (PS), inclusion (In), and scratches (Sc). The database includes 1,800 grayscale images, with 300 samples of each typical surface defect. Figure 4.3 shows sample images of the six kinds of typical surface defects; each image is 200 x 200 pixels. In this NEU surface database, both the intra-class defects of one type and the inter-class defects exhibit large differences in appearance. For instance, there are horizontal, vertical, and slanting scratches among the Sc surface defects, while the RS, Cr, and PS defect types are varied. In addition, changes in illumination and material influence the defect images.

Figure 4.3 NEU surface defect sample image

In surface inspection, a large training dataset is obtained, which makes feature extraction costly. Therefore, Ren et al. utilized a pre-trained deep learning network, Decaf [119], to extract patch features from input images; a multinomial logistic regression (MLR) classifier was chosen to generate a defect heat map based on the patch features, and the defect area was predicted by thresholding and segmenting the heat map [117]. Decaf was previously trained on the ImageNet challenge to predict 1,000 classes of objects, and its weights and model structure are reused as a feature extractor for small datasets in another domain [120-122]. Comparison results of the proposed model and other benchmark methods are shown in Figure 4.4.

Figure 4.4 Comparison model accuracy in reference work [117]

Figure 4.5 Model accuracy of the proposed network

The results in Figure 4.4 indicate that the Decaf model with the MLR classifier provided the highest accuracy, 99.27%, and in Figure 4.5 the classification accuracy of my proposed CNN reaches up to 99.30% within 500 epochs. Although training the CNN requires a large amount of time, it shows great performance in classifying the surface defects in this NEU dataset. In dealing with image classification problems, it is common to reuse a deep learning model pre-trained on a large and challenging image classification task, but the choice of the appropriate source data or source model is an open problem.
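A minimal sketch of this pre-trained-model reuse in Keras; VGG16 is an illustrative backbone choice rather than the Decaf network itself, and the grayscale NEU images would need to be replicated to three channels to match its input.

```python
import tensorflow as tf
from tensorflow.keras import layers

# A pre-trained ImageNet backbone reused as a frozen feature extractor,
# in the spirit of the Decaf pipeline described above.
base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=(200, 200, 3))
base.trainable = False                       # keep the ImageNet features fixed

model = tf.keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(6, activation="softmax"),   # six NEU defect classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```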
4.2.3 Defect Detection on Eddy Current Testing
Eddy Current Testing (ECT) is another typical electromagnetic testing method in NDE to detect and characterize defects in conductive materials. Electromagnetic induction from an alternating current excitation induces eddy currents in the specimen, and perturbations in the induced eddy currents indicate the presence of defects [80]. An eddy current testing dataset, obtained from EPRI (Electric Power Research Institute, USA), consists of multi-frequency ECT data from the inspection of 37 steam generator (SG) tubes using array probes at four frequencies, i.e., 70 kHz, 250 kHz, 450 kHz, and 650 kHz [123]. In previous work, robust principal component analysis (RPCA) was utilized to preprocess the initial data to detect and enhance the potential flaw regions, referred to as the regions of interest (ROIs), and to separate the background. Fig 4.6 shows an example of a segmented initial image sample and its sparse component with enhanced defect area (ROIs) and suppressed background. Subsampling is then performed to divide the individual raw images into 374 defect images and 374 non-defect images of fixed pixel size. Among the total 748 sample images, 648 are used for training the CNN model while the others are used for testing the performance [80].
Figure 4.6 One initial ECT sample image (left) and its sparse component with ROIs (right)
In the reference work [80], a classical CNN structure with a weighted loss function is adopted, applying a larger weight to errors resulting from defect samples to improve the performance of their CNN model. A five-fold validation technique is implemented to verify the network by setting different threshold values: five training datasets are used separately to train five CNN models, and the results, shown in Fig 4.7, indicate which setting yields higher accuracy. In the reference result plot, the x-axis represents the threshold involved in assigning the penalty in the proposed weighted loss function. When the threshold is 0.5, the defect and non-defect images carry the same penalty. The performance of the proposed CNN trained on this ECT dataset with the threshold set to 0.5, i.e., without the weighted loss function, is shown in Fig 4.8.
Figure 4.7 Comparison model accuracy in reference work [80]
Figure 4.8 Example model accuracy and model loss of the proposed network
The above graphs show that the accuracy of both networks is quite good when assigning no extra penalty to either the defect or the non-defect ECT images. My proposed CNN is a little unstable in convergence because the network is not specifically fine-tuned on this ECT defect classification task. However, its high classification accuracy proves that the defect areas in ECT images can be distinguished from the background; in other words, the versatility of the proposed CNN is demonstrated.
4.3 CNN Classification Result in MFL
The simulated MFL signals described in Chap. 3 are trained and tested in the proposed CNN, and each CNN model for the different MFL classification tasks converges within 150 epochs. In the different classification tasks, the MFL signals are assigned corresponding labels. To be specific, Cu, Cy, and C are marked in the defect shape classification task, while 5 mm, 8 mm, and 10 mm are marked in the defect size classification task. Each group of MFL signals consists of three equally sized matrices, representing the axial, tangential and radial components, respectively. Similar to how RGB images are processed, these three MFL matrices are stacked together to be passed through the CNN.
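A short sketch of this channel stacking, with placeholder arrays standing in for the simulated components and assuming an 81x81 grid per component:

```python
import numpy as np

# Placeholders for the three simulated field components on an 81x81 grid.
b_axial = np.random.rand(81, 81)
b_tangential = np.random.rand(81, 81)
b_radial = np.random.rand(81, 81)

# Shape (81, 81, 3): one "image" whose channels are the three components,
# directly analogous to the R, G and B channels of a colour photo.
sample = np.stack([b_axial, b_tangential, b_radial], axis=-1)
```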
After a series of convolution, pooling and dropout operations, the MFL defect features can be extracted and combined to make the classification; the results are discussed as follows.
Experiment 1: Defect shape and size classification tasks on MFL data with the proposed CNN model.

Table 4.2 Classification accuracy for MFL signals
Accuracy        Shape   Length   Width    Depth
Proposed CNN    100%    97.26%   95.89%   94.53%

From Table 4.2, it can be seen that the proposed network shows superior performance in the shape and size classification tasks for MFL signal data, especially in classifying the different defect shapes. The main characteristics of defect depth are represented by the axial peak and valley values of the magnetic leakage field, which are also affected by the length and width [15], which helps explain why the depth task is the most difficult. Although CNNs are most commonly applied to visual imagery, in this case the essential defect shape and size features can still be learned from the 3-D MFL signals with high classification accuracy.
Experiment 2: CNN performance on distorted MFL data.
During MFL defect detection, various measurement noises, such as mechanical vibrations, velocity effects, sensor lift-off variation, etc., greatly distort the MFL signals. To simulate this noise-degraded MFL data, Gaussian noise is generated to contaminate the MFL signals; its probability density function is

f(g) = (1 / (sigma * sqrt(2 * pi))) * exp(-(g - mu)^2 / (2 * sigma^2)),

where g represents the Gaussian noise and mu and sigma^2 represent the mean value and the variance. In this case, different percentages of signal points among each group of MFL matrices are randomly assigned additive Gaussian noise with a mean value of 0.005 and a variance of 0.001. Therefore, each new noisy MFL signal point can be expressed as

x'_i = x_i + g_i,   i in S_p,

where p represents the fraction of points randomly selected among the 3-D MFL matrices to add noise, S_p is the set of chosen positions (each matrix is composed of 6561 points), and x_i and g_i are the original MFL signal point and the Gaussian noise point, respectively. Three noisy MFL datasets are generated by setting p equal to 1%, 5%, and 10%, respectively, which are then put into the proposed CNN to figure out how noisy MFL data affect the defect shape and size classification accuracy.
Besides, during MFL testing, variation in the defect location changes the measured magnetic field as well. To simulate this variance, different amounts of the defects described in Chap. 3 are selected to be moved randomly away from their previous places (the center of the measured area). Their new locations are set within 5 mm of the previous spot, while the measurement area is fixed. Among the whole MFL data, 5%, 10%, 15%, and 20% of the defects are evenly chosen to be randomly relocated, respectively, and four new MFL defect datasets with location variation are thus generated. The relationship between the altered defect location and the defect classification performance of the proposed network can be found by putting these four MFL datasets into the CNN and following the same training and testing process as in the previous section. Fig 4.9 presents examples of two groups of magnetic fields affected by different defect positions: the defect in the first row is placed to the left of the original position, while the defect in the second row is to the lower right.
Figure 4.9 Magnetic fields corresponding to differently located defects
The variation in the data increases the difficulty of the applied classification technique but, from another perspective, demonstrates the robustness of this CNN in MFL inspection.
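A minimal sketch of the noise-contamination procedure above, assuming one sample is stored as an 81x81x3 array as in the earlier stacking sketch; the function name and seed handling are illustrative:

```python
import numpy as np

def add_noise(mfl, p, mean=0.005, var=0.001, seed=0):
    """Corrupt a fraction p of the points of one MFL sample with Gaussian noise."""
    rng = np.random.default_rng(seed)
    noisy = mfl.copy()
    flat = noisy.reshape(-1)                     # view over all 81*81*3 points
    k = int(round(p * flat.size))                # number of points to corrupt
    idx = rng.choice(flat.size, size=k, replace=False)
    flat[idx] += rng.normal(mean, np.sqrt(var), size=k)  # writes through the view
    return noisy

noisy_1pct = add_noise(sample, 0.01)   # reuses `sample` from the earlier sketch
```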
The comparative accuracy results are shown in Fig 4.10 and Fig 4.11.
Figure 4.10 Noise influence on different defect classification tasks
Figure 4.11 Location influence on different defect classification tasks
It can be seen that the noise distortion has a more negative influence on defect identification accuracy than the location variations. The more the MFL signals are contaminated by noise, the more distorted the MFL signals are, and therefore the lower the classification accuracy. The proposed CNN shows some resistance to noise, especially in the shape and length identification tasks: compared to the MFL data with no noise, the additive 10% noise reduces the accuracy to around 80% in length classification and 85% in shape classification, while in the other two tasks the classification accuracies fall almost by half. The variation in defect location affects performance as well, but its influence is quite small compared with noise: when 20% of the defects are relocated, the variation in accuracy is less than 20% compared to the result on the original MFL data. Therefore, the proposed defect identification network has good robustness to position variance and noise distortion.
4.4 Comparison with Other Machine Learning Methods
In previous works, feature-based techniques were proposed to accomplish defect identification in MFL measurements [22, 45, 47]. Support vector machines (SVM) and tree-based techniques, e.g., decision trees (DT), are also popular tools for building prediction models and solving classification and regression problems [124-127]. SVM is based on statistical learning theory, while a DT is built following a multistage or hierarchical decision scheme. Both have shown great advances in multiple areas: an alternative procedure to the Fisher kernel was used in an SVM and applied to a multimedia classification task [128]; Y. Bazi and F. Melgani designed an optimal SVM classification system for hyperspectral imagery [129]; and P. Ye applied the DT to visual expression identification, showing comprehensive and accurate recognition results [130]. In this section, the principles of these two methods are introduced, and they are used as direct inversion models to present comparison results with the proposed algorithm on the MFL simulation data.
4.4.1 Support Vector Machine
Similar to neural networks, the Support Vector Machine (SVM) is a powerful kernel-based learning algorithm that analyzes data for classification and regression. When the input data are not linearly separable, the non-linear SVM transforms the input space into a high-dimensional feature space in which a linear hypersurface can separate all samples into their classes [131]. This transformation is performed by a kernel function, which allows a more simplified representation of the data, such as the polynomial, sigmoidal, and Gaussian (RBF) kernels [132-134]. The various regularization algorithms and kernel functions enable the SVM to generalize well and to reduce the risk of over-fitting, based on rigorous statistical learning theory. In the MFL defect detection area, SVM has proven to be an effective technique for the reconstruction of defect shape features [48], while a least-squares SVM model was used to correlate physics-based geometric and feature parameters to realize a fast reconstruction of 3-D defect profiles [135].
The main idea behind SVM is to find a hyperplane that can correctly separate the sample data points. Given the input data points x_i and the corresponding class labels y_i in {-1, +1}, the classifying hyperplane is constructed as

w^T phi(x) + b = 0,

where phi(.) is a non-linear function that, similar to a neural network, maps the input space into a high (possibly infinite) dimensional feature space. The boundary condition should satisfy

y_i (w^T phi(x_i) + b) >= 1,   i = 1, ..., N.

The optimization problem is then transferred to choosing the weights w and the bias b, and selecting a proper kernel function. SVM performs well when dealing with unstructured and semi-structured data, with low overfitting risk and high generalization [136-138]. However, it is quite difficult to choose a perfect kernel function, and in multi-class classification problems it needs a long training period, so training on a large dataset is still a bottleneck.
4.4.2 Decision Tree
The Decision Tree (DT) is another widely used supervised machine learning technique that builds a classification or regression model in the form of a tree-like structure [139]. The final result is a tree with decision nodes and leaf nodes: decision nodes represent where the data are split, and each leaf node represents a class label or a decision. The complexity of the decision rules increases with the depth of the tree. A DT can learn from the data to approximate, for instance, a sinusoid function based on specific decision rules. Unlike black-box algorithms, e.g., SVM and NN, a DT interprets the data by following a strict logic. It is a non-parametric method without assumptions on the distribution of the data or the structure of the real model [140]. The steps involved in the DT construction process are splitting, pruning and selecting the tree:
a) Splitting: The decision tree is built by repeatedly dividing the training data into smaller subsets according to the predictor variables.
b) Pruning: To avoid extra calculation in the searching process, branches of the tree are shortened by converting some branch nodes to leaf nodes and deleting the leaf nodes under the original branch. It is an effective strategy to solve the overfitting problem.
c) Selecting the tree: This is the process of finding the smallest and most efficient tree that fits the data according to various decision rules. Normally, the lowest cross-validated error is set as the evaluation index.
In the NDT area, the DT approach is commonly applied as a comparison feature-based method to provide comparable results. D'Angelo and Rampone proposed a content-based image retrieval (CBIR) solution to classify aerospace structure defects detected by eddy current non-destructive testing and evaluated its performance in defect recognition [141]. Later, DT and other feature-based networks were used for comparison with the proposed neural networks in the MFL defect detection task [22]. In general, the Decision Tree is easy to understand and is considered the fastest way to identify the most significant variables and the relations between variables. However, in multi-class problems, the probability of overfitting is relatively high and the prediction accuracy is low.
4.4.3 Comparison Results
In the MFL defect detection task, the proposed CNN is compared with two feature-based machine learning models: SVM and Decision Tree, set up as sketched below. The SVM is trained with the sigmoid kernel, while in the DT the ID3 (Iterative Dichotomiser 3) algorithm is used to generate the tree.
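For reference, the two baselines can be configured in a few lines with scikit-learn; the arrays below are placeholders, and scikit-learn's tree is a CART whose 'entropy' criterion only approximates ID3's information-gain rule:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder design matrix: 243 flattened MFL samples, three shape labels.
X = np.random.rand(243, 81 * 81 * 3)
y = np.random.randint(0, 3, size=243)

svm = SVC(kernel='sigmoid').fit(X, y)
# scikit-learn grows CART trees; criterion='entropy' approximates ID3's
# information-gain splitting rule, since a literal ID3 is not provided.
tree = DecisionTreeClassifier(criterion='entropy').fit(X, y)
print(svm.score(X, y), tree.score(X, y))
```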
Table 4.3 Network comparison result in MFL
Accuracy        Shape    Length   Width    Depth
Proposed CNN    100%     97.26%   95.89%   94.53%
SVM             65.75%   71.23%   83.56%   89.04%
Decision Tree   90.41%   87.67%   93.15%   86.30%

The comparative results are presented in Table 4.3. The accuracy of the proposed CNN is much better than that of the other models. In this simulated MFL defect dataset, there exist data variations in each classification task. Take shape detection, for example: although there are 81 groups of MFL signals under each label, the corresponding defect sizes are not fixed. For SVM and DT, the extracted features are sensitive to variation in the data, especially for small defects; the CNN, however, can suppress the adverse interference of this variation and therefore outperforms them in pinpointing the distinguishing features of the MFL signals.
Chapter 5: Uncertainty Estimation in MFL NDE
Based on the discussion in section 2.3, this chapter explicitly explains how Bayesian variational inference is applied in a CNN to obtain the aleatoric and epistemic uncertainties, as proposed by the reference work [142]. This uncertainty estimation approach is then applied in my proposed CNN for MFL defect detection, which helps to explain the relationship between data and model variation and the aleatoric and epistemic uncertainties.
5.1 Aleatoric Uncertainty and Epistemic Uncertainty in CNN
In a neural network, as explained in section 2.3, the dropout result follows Gaussian distributions whose parameters are learned from the training dataset. Since most classification problems are discrete and finite, with Monte Carlo integration the approximate variational predictive posterior distribution can be constructed as

p(y* | x*, D) ~ (1/T) sum_{t=1..T} Softmax(f_{w_t}(x*)),   w_t ~ q_theta(w),

where theta is the variational parameter optimized to minimize eq. 8, T is the number of samples set to obtain the distribution, w_t are the realized weight vectors drawn from the variational distribution q_theta(w), and x* and y* represent the new input and the corresponding one-hot encoded output. As there is no explicit function between the categorical result and the Gaussian distributions, the variance of the variational predictive distribution allows us to evaluate how confident the model is in its prediction; that is to say, the uncertainty can be quantified. According to the definition, the variance is given by

Var_q(y* | x*) = E_q[y* y*^T] - E_q[y*] E_q[y*]^T.   (eq. 23)

Based on a variant of the law of total variance, eq. 23 can be decomposed as [142]

Var_q(y* | x*) = E_q[diag(p) - p p^T] + E_q[(p - p_bar)(p - p_bar)^T],   (eq. 24)

where p = Softmax(f_w(x*)), p_bar = E_q[p], and diag(p) is a diagonal matrix whose elements are those of the vector p. When a new output is made for a given input x*, different features are generated with the randomly drawn w_t, and each feature is weighted differently to produce the posterior distribution. The first term in eq. 24 is defined as the aleatoric uncertainty, as its expectation is over the output distribution and captures the inherent randomness of an output y*. The second term in eq. 24 is the epistemic uncertainty, as its expectation is only related to the network weight parameters, i.e., to the model only. To estimate uncertainties in a CNN classification task, based on the previous derivation, Kwon defined the predictive uncertainty estimators as

Aleatoric: (1/T) sum_{t=1..T} [diag(p_t) - p_t p_t^T],   (eq. 26)
Epistemic: (1/T) sum_{t=1..T} (p_t - p_bar)(p_t - p_bar)^T,   (eq. 27)

where p_t = Softmax(f_{w_t}(x*)) and p_bar = (1/T) sum_{t=1..T} p_t. With increasing T, the sum of these two terms converges in probability to eq. 24 [142]. Each output element of the softmax function is a probability, and together they form a vector; thus the variability of the predictive distribution can be obtained from T calculations.
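A small numpy sketch of these two estimators, assuming the T softmax outputs for one sample have been collected into a (T, K) array; scalar summaries (e.g., the diagonal entries or the trace) can then be taken from the returned matrices, which is an assumption about how the averages reported later are formed:

```python
import numpy as np

def kwon_uncertainties(p):
    """p: (T, K) array of T stochastic softmax outputs for one sample."""
    p_bar = p.mean(axis=0)
    # Eq. 26: mean over samples of diag(p_t) - p_t p_t^T (aleatoric part).
    aleatoric = np.mean([np.diag(pt) - np.outer(pt, pt) for pt in p], axis=0)
    # Eq. 27: mean over samples of (p_t - p_bar)(p_t - p_bar)^T (epistemic part).
    epistemic = np.mean([np.outer(pt - p_bar, pt - p_bar) for pt in p], axis=0)
    return aleatoric, epistemic
```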
To be more specific, in the aleatoric uncertainty estimator, the diagonal matrix of the expected output is subtracted by the outer product of the softmax-generated vector with itself. This operation is repeated T times and then averaged over T to make it tractable. The variability of the output is considered to come from the inherent noise in the data; therefore, the aleatoric uncertainty is considered to be related to the data variation. In the MFL inspection, this variance comes from the physical model which generated the simulated MFL signals. In the epistemic estimator, based on the variational distribution, the expected outcome is represented by the average of the softmax-generated vectors of the T samples. This average is subtracted from each softmax-generated output; then, for each sample, the difference is multiplied by its transpose, and the summation of the resulting matrices makes the process tractable. As the variability of the output comes from the model, referred to here as the proposed CNN model, the epistemic uncertainty is considered to capture the model variation and is not proportional to the validation accuracy. In this way, the underlying distribution of the outcome can describe the inherent variability of the data and the model, and it has numerical stability as well.
5.2 Uncertainty Estimation on MFL
5.2.1 Uncertainty estimation in the proposed CNN on MFL
The proposed uncertainty quantification method [142] has performed well in an ischemic stroke lesion segmentation task by providing additional assistance toward a more informed decision. Inspired by this, I adopted the predictive estimators (eq. 26 and eq. 27) and utilized the varying outputs of the dropout function to define a distribution in my convolutional network. In the next section, the epistemic and aleatoric uncertainties are estimated in my MFL classification tasks based on the predictive estimators, and a reasonable interpretation is given for each uncertainty. The principle is to apply the variability in modelling the last layer of the neural network in order to divide the uncertainty into its aleatoric and epistemic parts based on the predictive estimator formulas. In my proposed CNN, the softmax activation function is already assigned to produce the final output. Besides, the previous review in Chap. 2 has justified that dropout is an approximate inference process. To obtain the variability distribution of the output, during the prediction stage each group of testing data is passed through the network, with every dropout layer active, 100 times, and the outputs are normalized every 10 times (which is T in the predictive estimator formulas). Therefore, for each testing MFL sample, there are 10 aleatoric uncertainty results and 10 epistemic uncertainty results, which together provide a tractable distribution describing the two uncertainties.
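A sketch of this sampling scheme, reusing build_cnn, sample and kwon_uncertainties from the earlier sketches; calling a Keras model with training=True keeps the dropout layers active at prediction time, which is what makes the T forward passes stochastic:

```python
import numpy as np

def mc_dropout_softmax(model, x, T=10):
    # training=True applies dropout at inference, so each forward pass draws a
    # new set of dropped weights from the approximate posterior.
    return np.stack([model(x, training=True).numpy() for _ in range(T)])

model = build_cnn()                              # from the earlier sketch
x = sample[None, ...].astype('float32')          # one MFL sample, batch of 1
probs = mc_dropout_softmax(model, x)             # shape (T, 1, num_classes)
alea, epis = kwon_uncertainties(probs[:, 0, :])  # estimators sketched earlier
```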
5.2.2 Uncertainty Estimation Result on MFL
Two-thirds of the MFL data are used to train the network; in the prediction stage, the rest are used to test my proposed network and to evaluate the inherent uncertainties in the data and the model through their uncertainty distributions. The uncertainty maps are considered to provide extra information in addition to the MFL defect detection. Note that, in each uncertainty plot, the x-axis and y-axis represent the uncertainty values and the number of occurrences of the corresponding uncertainty, respectively.
Experiment 1: Uncertainties in MFL defect size classification tasks.
In this experiment, the uncertainty estimation is applied to evaluate the defect size classification results and to explore the influence of different sizes on the uncertainties. The uncertainty distributions are shown in Fig 5.1.
Figure 5.1 Epistemic and aleatoric uncertainty in MFL size classification tasks
It can be seen from the results that, in each size classification task of length, width, and depth, the size parameters bring no difference in the aleatoric or epistemic uncertainties. However, the different tasks do bring some variance in the uncertainties. Further, the average values of the aleatoric and epistemic uncertainties are computed for each classification task and compared with the corresponding classification accuracies. The results are described in Table 5.1.

Table 5.1 Comparison of accuracy, averages of total aleatoric and epistemic uncertainties
                        Length   Width    Depth
Accuracy                97.89%   95.89%   94.53%
Aleatoric Uncertainty   0.0142   0.0197   0.0547
Epistemic Uncertainty   0.0021   0.0048   0.0066

The results show that the classification accuracy is related to the uncertainty. The accuracy is negatively correlated with the aleatoric uncertainty: the better the classification performance, the smaller the aleatoric uncertainty. The epistemic uncertainty, however, is not proportional to this change.
Experiment 2: Uncertainties in MFL defect shape classification tasks.
The uncertainty estimation is then applied to evaluate the influence of different defect shapes on the aleatoric and epistemic uncertainties. The uncertainty distributions and their corresponding average values are presented in Fig 5.2 and Table 5.2.
Figure 5.2 Epistemic and aleatoric uncertainty in MFL shape classification task

Table 5.2 Comparison of aleatoric and epistemic uncertainties of each shape
                        Cu       Cy       C
Aleatoric Uncertainty   0.0386   0.0787   0.1323
Epistemic Uncertainty   0.0062   0.0127   0.0209

Unlike the size classification, variations in defect shape affect both the aleatoric and epistemic uncertainties, especially the aleatoric uncertainty. Based on the comparison results for both uncertainties, there exist at most around ten-fold numerical differences among the different defect shapes. Because C-shaped defects are the most irregular compared with the other two, their uncertainties are raised.
Experiment 3: Influence of different percentages of additive Gaussian noise and different MFL data sizes on the uncertainties.
Here, 0%, 1%, 5% and 10% Gaussian noise is added to the whole MFL dataset and applied in the shape and size classification tasks. In order to clearly and intuitively reflect the noise impacts on the uncertainties, the classification results under one label are chosen for uncertainty estimation: the cubically shaped defects, defects of length 5 mm, defects of width 5 mm, and defects of depth 5 mm.
Figure 5.3 Aleatoric and epistemic uncertainty computed on the MFL signal with different percentages of noise
It can be seen from Fig 5.3 that, in every classification task, noise in the data brings much more uncertainty to the aleatoric part than to the epistemic part: with the noise interference, the average values of the aleatoric uncertainties are almost twice as large as the previous average values. In general, the more noise added to the MFL data, the larger the aleatoric uncertainty, while the epistemic uncertainty barely changes. This result is consistent with the theory that aleatoric uncertainty captures the inherent variation of the data.
Besides, MFL datasets of different sizes are tested in the proposed CNN, and Fig 5.4 shows the corresponding uncertainty results. The original MFL data consist of 243 groups of MFL signals, marked as Data 1. Data 2 is generated by increasing the amount of original MFL data to 324 groups of MFL signals, while Data 3 consists of 405 groups. Notably, the added MFL data are the previous MFL data with location alteration; to some extent, the increased data size brings variation (noise) to the data as well.
Figure 5.4 Aleatoric, epistemic uncertainty and average uncertainties computed for each defect shape under different data sizes
From the uncertainty distributions and the trends in the average uncertainty values for each defect shape in Fig 5.4, it can be seen that, with the increased data size, the aleatoric and epistemic uncertainties barely change. Because the size of the dataset is not greatly increased in this case, it is difficult to directly explore the relationship between the epistemic uncertainty and the CNN model. However, it has been proved that the aleatoric uncertainty is related to the data, and in this experiment the aleatoric uncertainty does not change greatly; therefore, these added MFL data do not bring much data variance. Combined with the previous observation that the epistemic uncertainties are barely affected by the intrinsic randomness of the data, this in turn supports the interpretation that the epistemic uncertainty accounts for the model variation.
CONCLUSIONS
To address the problem of defect feature identification in MFL, this thesis work proposed a novel method based on CNNs. Although the characteristics of general CNNs make them well suited to image and object recognition and classification problems, the proposed CNN is applied to extract defect features directly from the simulated MFL signals and to classify the size and shape of defects. Further, in MFL inspection, uncertainty in either the data or the model affects prediction capabilities. Therefore, in order to assess the reliability of the classification results, a Bayesian inference method is incorporated into the proposed convolutional neural network to describe the aleatoric and epistemic uncertainties of the physical and machine learning models in defect identification for MFL inspection. The following conclusions are obtained:
1. Although CNNs are most commonly applied to visual imagery, the proposed CNN provided good performance in recognizing defect shape and size directly from 3-D MFL signals. Besides, the proposed CNN has good robustness to position variance and noise distortion in MFL inspection, especially compared with the traditional machine learning approaches.
2. The comparable performance of the proposed method with previous work on three different NDT datasets shows that the proposed CNN has great versatility for defect detection in NDE-related areas.
3. The proposed CNN is combined with a Bayesian inference method to analyze the final classification results and to make uncertainty estimates for the physical model as well as the applied classification model in defect identification for MFL inspection.
4. The intrinsic variances in the data are proven to be related to the aleatoric uncertainty, while the model variations are described through the epistemic uncertainty. In the size classification tasks, the different sizes bring identical uncertainties. Besides, the classification accuracy of the proposed CNN model is shown to be negatively correlated with the aleatoric uncertainty.
FUTURE WORK
To address the problem of defect feature identification in MFL, this thesis work proposed a novel method based on CNNs. According to previous work, the CNN model is a useful tool to detect and characterize defects in MFL inspection. However, in practical applications, the defect shape and size vary, and normally there is more than one defect in a measurement area, which makes the characterization a more complicated problem. Besides, in industry, there are large amounts of MFL signals collected in pipeline inspection, and CNNs are good at processing and classifying such practical MFL data. In addition, the uncertainty estimation approach applied in this thesis only focuses on the data and the model. As mentioned before, there exist different kinds of uncertainties in the physical model, and it is necessary to clarify how these uncertainties affect the produced data. If successful, a reliable MFL defect detection and characterization system could be established and could be further applied to other NDT techniques, such as ECT.
BIBLIOGRAPHY
1. Cartz, L., Nondestructive testing. 1995.
2. Okolo, C., Modelling and experimental investigation of magnetic flux leakage distribution for hairline crack detection and characterization. Cardiff University, 2018.
3. Rao, B., Magnetic flux leakage technique: basics. J Non Destr Test Eval 2012, 11 (3), 7-17.
4. Okolo, C. K.; Meydan, T., Pulsed magnetic flux leakage method for hairline crack detection and characterization. AIP Advances 2018, 8 (4), 047207.
5. Zeng, Z.; Udpa, L.; Udpa, S. S.; Chan, M. S. C., Reduced magnetic vector potential formulation in the finite element analysis of eddy current nondestructive testing. IEEE Transactions on Magnetics 2009, 45 (3), 964-967.
6. Piao, G.; Guo, J.; Hu, T.; Deng, Y.; Leung, H., A novel pulsed eddy current method for high-speed pipeline inline inspection. Sensors and Actuators A: Physical 2019.
7. Snarskii, A.; Zhenirovskyy, M.; Meinert, D.; Schulte, M., An integral equation model for the magnetic flux leakage method. NDT & E International 2010, 43 (4), 343-347.
8. Wang, Y.; Liu, X.; Wu, B.; Xiao, J.; Wu, D.; He, C., Dipole modeling of stress-dependent magnetic flux leakage. NDT & E International 2018, 95, 1-8.
9. Zuoying, H.; Peiwen, Q.; Liang, C., 3D FEM analysis in magnetic flux leakage method. NDT & E International 2006, 39 (1), 61-66.
10. Alobaidi, W. M.; Alkuam, E. A.; Al-Rizzo, H. M.; Sandgren, E., Applications of ultrasonic techniques in oil and gas pipeline industries: a review. American Journal of Operations Research 2015, 5 (04), 274.
11. Nestleroth, J. B.; Davis, R. J., Application of eddy currents induced by permanent magnets for pipeline inspection. NDT & E International 2007, 40 (1), 77-84.
12. Xi, G.; Tan, F.; Yan, L.; Huang, C.; Shang, T., Design of an oil pipeline nondestructive examination system based on ultrasonic testing and magnetic flux leakage. Revista de la Facultad de Ingeniería 2016, 31 (5), 132-140.
13. Buonsanti, M.; Cacciola, M.; Calcagno, S.; Morabito, F.; Versaci, M., In Ultrasonic pulse-echoes and eddy current testing for detection, recognition and characterisation of flaws detected in metallic plates, Proceedings of the 9th European Conference on Non-Destructive Testing, Citeseer: 2006.
14. Joshi, A.; Udpa, L.; Udpa, S.; Tamburrino, A., Adaptive wavelets for characterizing magnetic flux leakage signals from pipeline inspection. IEEE Transactions on Magnetics 2006, 42 (10), 3168-3170.
15. Shi, Y.; Zhang, C.; Li, R.; Cai, M.; Jia, G., Theory and application of magnetic flux leakage pipeline detection. Sensors 2015, 15 (12), 31036-31055.
16. Mukhopadhyay, S.; Srivastava, G., Characterisation of metal loss defects from magnetic flux leakage signals with discrete wavelet transform. NDT & E International 2000, 33 (1), 57-65.
17. Mukherjee, D.; Saha, S.; Mukhopadhyay, S., Inverse mapping of magnetic flux leakage signal for defect characterization. NDT & E International 2013, 54, 198-208.
18. Chen, Z.; Yusa, N.; Miya, K., Some advances in numerical analysis techniques for quantitative electromagnetic nondestructive evaluation. Nondestructive Testing and Evaluation 2009, 24 (1-2), 69-102.
19. Ravan, M.; Amineh, R. K.; Koziel, S.; Nikolova, N. K.; Reilly, J. P., Sizing of 3-D arbitrary defects using magnetic flux leakage measurements. IEEE Transactions on Magnetics 2009, 46 (4), 1024-1033.
20. Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P., Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Transactions on Geoscience and Remote Sensing 2016, 54 (10), 6232-6251.
21. He, K.; Zhang, X.; Ren, S.; Sun, J., Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2015, 37 (9), 1904-1916.
22. Feng, J.; Li, F.; Lu, S.; Liu, J.; Ma, D., Injurious or noninjurious defect identification from MFL images in pipeline inspection using convolutional neural network. IEEE Transactions on Instrumentation and Measurement 2017, 66 (7), 1883-1892.
23. Der Kiureghian, A.; Ditlevsen, O., Aleatory or epistemic? Does it matter? Structural Safety 2009, 31 (2), 105-112.
24. Hansen, E., Measure Theory. 4th ed.; 2006.
25. Kundu, P. K.; Cohen, I. M.; Dowling, D. R., Fluid Mechanics. Elsevier Inc.: 2012.
26. Cook, R. D., Concepts and applications of finite element analysis. John Wiley & Sons: 2007.
27. Bishop, C. M., Pattern recognition and machine learning. Springer: 2006.
28. Segura, J., Orthogonal Polynomials: Computation and Approximation. JSTOR: 2006.
29. Jäggi, S. B.; Elsener, B., Macrocell corrosion of steel in concrete: experiments and numerical modelling. In Corrosion of reinforcement in concrete: mechanisms, monitoring, inhibitors and rehabilitation techniques, CRC Press: 2007; Vol. 38, pp 75-104.
30. Zhang, Y.; Ye, Z.; Wang, C., A fast method for rectangular crack sizes reconstruction in magnetic flux leakage testing. NDT & E International 2009, 42 (5), 369-375.
31. Keshwani, R. T., Analysis of magnetic flux leakage signals of instrumented pipeline inspection gauge using finite element method. IETE Journal of Research 2009, 55 (2), 73-82.
32. Pechenkov, A.; Shcherbinin, V.; Smorodinskiy, J., Analytical model of a pipe magnetization by two parallel linear currents. NDT & E International 2011, 44 (8), 718-720.
33. Silvester, P. P.; Ferrari, R. L., Finite elements for electrical engineers. Cambridge University Press: 1996.
34. Zhou, P.-b., Numerical analysis of electromagnetic fields. Springer Science & Business Media: 2012.
35. Lord, W.; Udpa, L., Imaging of electromagnetic NDT phenomena. 1986.
36. Schifini, R.; Bruno, A., Experimental verification of a finite element model used in a magnetic flux leakage inverse problem. Journal of Physics D: Applied Physics 2005, 38 (12), 1875.
37. Chen, Z.; Preda, G.; Mihalache, O.; Miya, K., Reconstruction of crack shapes from the MFLT signals by using a rapid forward solver and an optimization approach. IEEE Transactions on Magnetics 2002, 38 (2), 1025-1028.
38. Mandache, C.; Clapham, L., A model for magnetic flux leakage signal predictions. Journal of Physics D: Applied Physics 2003, 36 (20), 2427.
39. Ramuhalli, P.; Udpa, L.; Udpa, S. S., Electromagnetic NDE signal inversion by function-approximation neural networks. IEEE Transactions on Magnetics 2002, 38 (6), 3633-3642.
40. Xu, C.; Wang, C.; Ji, F.; Yuan, X., Finite-element neural network-based solving 3-D differential equations in MFL. IEEE Transactions on Magnetics 2012, 48 (12), 4747-4756.
41. Echeverría, D.; Hemker, P. W., Space mapping and defect correction. Computational Methods in Applied Mathematics 2005, 5 (2), 107-136.
42. Amineh, R. K.; Koziel, S.; Nikolova, N. K.; Bandler, J. W.; Reilly, J. P., A space mapping methodology for defect characterization from magnetic flux leakage measurements. IEEE Transactions on Magnetics 2008, 44 (8), 2058-2065.
43. Hoole, S. R. H., Artificial neural networks in the solution of inverse electromagnetic field problems. IEEE Transactions on Magnetics 1993, 29 (2), 1931-1934.
44. Ramuhalli, P.; Udpa, L.; Udpa, S., Neural network algorithm for electromagnetic NDE signal inversion. In Electromagnetic Nondestructive Evaluation (V), IOS: 2001; pp 121-128.
45. Joshi, A., Wavelet transform and neural network based 3D defect characterization using magnetic flux leakage. International Journal of Applied Electromagnetics and Mechanics 2008, 28 (1-2), 149-153.
46. Priewald, R. H.; Magele, C.; Ledger, P. D.; Pearson, N. R.; Mason, J. S., Fast magnetic flux leakage signal inversion for the reconstruction of arbitrary defect profiles in steel using finite elements. IEEE Transactions on Magnetics 2012, 49 (1), 506-516.
47. Hari, K.; Nabi, M.; Kulkarni, S., Improved FEM model for defect-shape construction from MFL signal by using genetic algorithm. IET Science, Measurement & Technology 2007, 1 (4), 196-200.
48. Lijian, Y.; Gang, L.; Guoguang, Z.; Songwei, G., In Oil-gas pipeline magnetic flux leakage testing defect reconstruction based on support vector machine, 2009 Second International Conference on Intelligent Computation Technology and Automation, IEEE: 2009; pp 395-398.
49. Mitchell, T. M., Machine Learning. McGraw-Hill, New York, NY: 1997.
50. Olden, J. D.; Lawler, J. J.; Poff, N. L., Machine learning methods without tears: a primer for ecologists. The Quarterly Review of Biology 2008, 83 (2), 171-193.
51. Phillips, S. J.; Anderson, R. P.; Schapire, R. E., Maximum entropy modeling of species geographic distributions. Ecological Modelling 2006, 190 (3-4), 231-259.
52. De'ath, G.; Fabricius, K. E., Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 2000, 81 (11), 3178-3192.
53. Drake, J. M.; Randin, C.; Guisan, A., Modelling ecological niches with support vector machines. Journal of Applied Ecology 2006, 43 (3), 424-432.
54. Cho, E.; Chon, T.-S., Application of wavelet analysis to ecological data. Ecological Informatics 2006, 1 (3), 229-233.
55. Bação, F.; Lobo, V.; Painho, M., In Self-organizing maps as substitutes for k-means clustering, International Conference on Computational Science, Springer: 2005; pp 476-483.
56. Hopfield, J. J., Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences 1982, 79 (8), 2554-2558.
57. Grounds, M.; Kudenko, D., Parallel reinforcement learning with linear function approximation. In Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning, Springer: 2005; pp 60-74.
58. Tsitsiklis, J. N., Asynchronous stochastic approximation and Q-learning. Machine Learning 1994, 16 (3), 185-202.
59. Mousavi, S. S.; Schukat, M.; Howley, E., In Deep reinforcement learning: an overview, Proceedings of SAI Intelligent Systems Conference, Springer: 2016; pp 426-440.
60. Cireşan, D.; Meier, U.; Masci, J.; Schmidhuber, J., Multi-column deep neural network for traffic sign classification. Neural Networks 2012, 32, 333-338.
61. Schmidt, U.; Roth, S., In Shrinkage fields for effective image restoration, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014; pp 2774-2781.
62. Graves, A.; Eck, D.; Beringer, N.; Schmidhuber, J., In Biologically plausible speech recognition with LSTM neural nets, International Workshop on Biologically Inspired Approaches to Advanced Information Technology, Springer: 2004; pp 127-136.
63. Gers, F. A.; Schmidhuber, E., LSTM recurrent networks learn simple context-free and context-sensitive languages. IEEE Transactions on Neural Networks 2001, 12 (6), 1333-1340.
64. Tkachenko, Y., Autonomous CRM control via CLV approximation with deep reinforcement learning in discrete and continuous action space. arXiv preprint arXiv:1504.01840 2015.
65. Schmidhuber, J., Deep learning in neural networks: An overview. Neural Networks 2015, 61, 85-117.
66. Zupan, J.; Gasteiger, J., Neural networks for chemists: an introduction. John Wiley & Sons, Inc.: 1993.
67. Khotanzad, A.; Chung, C., Application of multi-layer perceptron neural networks to vision problems. Neural Computing & Applications 1998, 7 (3), 249-259.
68. Graves, A.; Schmidhuber, J., Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks 2005, 18 (5-6), 602-610.
69. Schmidhuber, J.; Wierstra, D.; Gomez, F. J., In Evolino: Hybrid neuroevolution/optimal linear search for sequence prediction, Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI), 2005.
70. Anumanchipalli, G. K.; Chartier, J.; Chang, E. F., Speech synthesis from neural decoding of spoken sentences. Nature 2019, 568 (7753), 493.
71. Lawrence, S.; Giles, C. L.; Tsoi, A. C.; Back, A. D., Face recognition: A convolutional neural-network approach. IEEE Transactions on Neural Networks 1997, 8 (1), 98-113.
72. Ren, S.; He, K.; Girshick, R.; Sun, J., In Faster R-CNN: Towards real-time object detection with region proposal networks, Advances in Neural Information Processing Systems, 2015; pp 91-99.
73. Siam, M.; Elkerdawy, S.; Jagersand, M.; Yogamani, S., In Deep semantic segmentation for automated driving: Taxonomy, roadmap and challenges, 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), IEEE: 2017; pp 1-8.
74. Ito, Y., Representation of functions by superpositions of a step or sigmoid function and their applications to neural network theory. Neural Networks 1991, 4 (3), 385-394.
75. Maas, A. L.; Hannun, A. Y.; Ng, A. Y., In Rectifier nonlinearities improve neural network acoustic models, Proc. ICML, 2013; p 3.
76. Glorot, X.; Bordes, A.; Bengio, Y., In Deep sparse rectifier neural networks, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011; pp 315-323.
77. Ramachandran, P.; Zoph, B.; Le, Q. V., Searching for activation functions. arXiv preprint arXiv:1710.05941 2017.
78. Nam, H.; Han, B., In Learning multi-domain convolutional neural networks for visual tracking, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016; pp 4293-4302.
79. Havaei, M.; Davy, A.; Warde-Farley, D.; Biard, A.; Courville, A.; Bengio, Y.; Pal, C.; Jodoin, P.-M.; Larochelle, H., Brain tumor segmentation with deep neural networks. Medical Image Analysis 2017, 35, 18-31.
80. Zhu, P.; Cheng, Y.; Banerjee, P.; Tamburrino, A.; Deng, Y., A novel machine learning model for eddy current testing with uncertainty. NDT & E International 2019, 101, 104-112.
81. Li, F.; Feng, J.; Lu, S.; Liu, J.; Yao, Y., In Convolution neural network for classification of magnetic flux leakage response segments, 2017 6th Data Driven Control and Learning Systems (DDCLS), IEEE: 2017; pp 152-155.
82. Urbina, A., Uncertainty quantification and decision making in hierarchical development of computational models. 2009; Vol. 73.
83. Ma, Y.; Wang, L.; Zhang, J.; Xiang, Y.; Peng, T.; Liu, Y., Hybrid uncertainty quantification for probabilistic corrosion damage prediction for aging RC bridges. Journal of Materials in Civil Engineering 2014, 27 (4), 04014152.
84. Hemez, F. M.; Roberson, A.; Rutherford, A. C., In Uncertainty quantification and model validation for damage prognosis, Proceedings of the 4th International Workshop on Structural Health Monitoring, Stanford University, Stanford, California, 2003.
85. Alberts, E.; Rempfler, M.; Alber, G.; Huber, T.; Kirschke, J.; Zimmer, C.; Menze, B. H., In Uncertainty quantification in brain tumor segmentation using CRFs and random perturbation models, 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), IEEE: 2016; pp 428-431.
86. Neal, R. M., In Bayesian learning via stochastic dynamics, Advances in Neural Information Processing Systems, 1993; pp 475-482.
87. Graves, A., In Practical variational inference for neural networks, Advances in Neural Information Processing Systems, 2011; pp 2348-2356.
88. Blundell, C.; Cornebise, J.; Kavukcuoglu, K.; Wierstra, D., Weight uncertainty in neural networks. arXiv preprint arXiv:1505.05424 2015.
89. Fortunato, M.; Blundell, C.; Vinyals, O., Bayesian recurrent neural networks. arXiv preprint arXiv:1704.02798 2017.
90. Shridhar, K.; Laumann, F.; Liwicki, M., A comprehensive guide to Bayesian convolutional neural network with variational inference. arXiv preprint arXiv:1901.02731 2019.
91. Neklyudov, K.; Molchanov, D.; Ashukha, A.; Vetrov, D., Variance networks: When expectation does not meet your expectations. arXiv preprint arXiv:1803.03764 2018.
92. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R., Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 2014, 15 (1), 1929-1958.
93. Wang, S.; Manning, C., In Fast dropout training, International Conference on Machine Learning, 2013; pp 118-126.
94. Kingma, D. P.; Salimans, T.; Welling, M., In Variational dropout and the local reparameterization trick, Advances in Neural Information Processing Systems, 2015; pp 2575-2583.
95. Gal, Y.; Ghahramani, Z., Bayesian convolutional neural networks with Bernoulli approximate variational inference. arXiv preprint arXiv:1506.02158 2015.
96. Gal, Y.; Ghahramani, Z., In Dropout as a Bayesian approximation: Insights and applications, Deep Learning Workshop, ICML, 2015; p 2.
97. Kendall, A.; Gal, Y., In What uncertainties do we need in Bayesian deep learning for computer vision?, Advances in Neural Information Processing Systems, 2017; pp 5574-5584.
98. Feng, J.; Lu, S.; Liu, J.; Li, F., A sensor liftoff modification method of magnetic flux leakage signal for defect profile estimation. IEEE Transactions on Magnetics 2017, 53 (7), 1-13.
99. Lu, S.; Feng, J.; Li, F.; Liu, J.; Zhang, H., In Extracting defect signal from the MFL signal of seamless pipeline, 2017 29th Chinese Control And Decision Conference (CCDC), IEEE: 2017; pp 5209-5212.
100. Afzal, M.; Polikar, R.; Udpa, L.; Udpa, S., In Adaptive noise cancellation schemes for magnetic flux leakage signals obtained from gas pipeline inspection, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 01CH37221), IEEE: 2001; pp 3389-3392.
101. Chen, L.; Li, X.; Qin, G.; Lu, Q., Signal processing of magnetic flux leakage surface flaw inspect in pipeline steel. Russian Journal of Nondestructive Testing 2008, 44 (12), 859-867.
102. Ji, F.; Wang, C.; Sun, S.; Wang, W., Application of 3-D FEM in the simulation analysis for MFL signals. Insight - Non-Destructive Testing and Condition Monitoring 2009, 51 (1), 32-35.
103. Langton, C.; Pisharody, S.; Keyak, J., Comparison of 3D finite element analysis derived stiffness and BMD to determine the failure load of the excised proximal femur. Medical Engineering & Physics 2009, 31 (6), 668-672.
104. Lewis, R.; Huang, H.; Usmani, A.; Cross, J., Finite element analysis of heat transfer and flow problems using adaptive remeshing including application to solidification problems. International Journal for Numerical Methods in Engineering 1991, 32 (4), 767-781.
105. Ansari, S.; Farquharson, C. G., 3D finite-element forward modeling of electromagnetic data using vector and scalar potentials and unstructured grids. Geophysics 2014, 79 (4), E149-E165.
106. Martin, H. C.; Carey, G. F., Introduction to finite element analysis: theory and application. McGraw-Hill College: 1973.
107. Yang, S., Finite element modeling of current perturbation method of nondestructive evaluation application. 2000.
108. Jianming, J., The Finite Element Method of the Electromagnetic [M]. Xidian.
109. Gupta, A.; Chandrasekaran, K., Finite element modeling of magnetic flux leakage from metal loss defects in steel pipeline. Journal of Failure Analysis and Prevention 2016, 16 (2), 316-323.
110. Wolfe, J.; Jin, X.; Bahr, T.; Holzer, N., Application of softmax regression and its validation for spectral-based land cover mapping. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 2017, 42, 455.
111. Martins, A.; Astudillo, R., In From softmax to sparsemax: A sparse model of attention and multi-label classification, International Conference on Machine Learning, 2016; pp 1614-1623.
112. Zhang, L.; Yang, F.; Zhang, Y. D.; Zhu, Y. J., In Road crack detection using deep convolutional neural network, 2016 IEEE International Conference on Image Processing (ICIP), IEEE: 2016; pp 3708-3712.
113. Özgenel, Ç. F., Concrete Crack Images for Classification. 2018.
114. Li, W.-b.; Lu, C.-h.; Zhang, J.-c., A lower envelope Weber contrast detection algorithm for steel bar surface pit defects. Optics & Laser Technology 2013, 45, 654-659.
115. Martin, D.; Guinea, D. M.; García-Alegre, M. C.; Villanueva, E.; Guinea, D., Multi-modal defect detection of residual oxide scale on a cold stainless steel strip. Machine Vision and Applications 2010, 21 (5), 653-666.
116. Song, K.; Hu, S.; Yan, Y., Automatic recognition of surface defects on hot-rolled steel strip using scattering convolution network. Journal of Computational Information Systems 2014, 10 (7), 3049-3055.
117. Ren, R.; Hung, T.; Tan, K. C., A generic deep-learning-based approach for automated surface inspection. IEEE Transactions on Cybernetics 2017, 48 (3), 929-940.
118. He, Y.; Song, K.; Meng, Q.; Yan, Y., An end-to-end steel surface defect detection approach via fusing multiple hierarchical features. IEEE Transactions on Instrumentation and Measurement 2019.
119. Donahue, J.; Jia, Y.; Vinyals, O.; Hoffman, J.; Zhang, N.; Tzeng, E.; Darrell, T., In Decaf: A deep convolutional activation feature for generic visual recognition, International Conference on Machine Learning, 2014; pp 647-655.
120. Cimpoi, M.; Maji, S.; Kokkinos, I.; Mohamed, S.; Vedaldi, A., In Describing textures in the wild, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014; pp 3606-3613.
121. Yosinski, J.; Clune, J.; Bengio, Y.; Lipson, H., In How transferable are features in deep neural networks?, Advances in Neural Information Processing Systems, 2014; pp 3320-3328.
122. Sharif Razavian, A.; Azizpour, H.; Sullivan, J.; Carlsson, S., In CNN features off-the-shelf: an astounding baseline for recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014; pp 806-813.
123. Virkkunen, I.; Koskinen, T.; Jessen-Juhler, O.; Rinta-Aho, J., Augmented ultrasonic data for machine learning. arXiv preprint arXiv:1903.11399 2019.
124. Pal, M.; Mather, P. M., An assessment of the effectiveness of decision tree methods for land cover classification. Remote Sensing of Environment 2003, 86 (4), 554-565.
125. Srivastava, A.; Han, E.-H.; Kumar, V.; Singh, V., Parallel formulations of decision-tree classification algorithms. In High Performance Data Mining, Springer: 1999; pp 237-261.
126. Mathur, A.; Foody, G. M., Multiclass and binary SVM classification: Implications for training and classification users. IEEE Geoscience and Remote Sensing Letters 2008, 5 (2), 241-245.
127. Foody, G. M.; Mathur, A., Toward intelligent training of supervised image classifications: directing training data acquisition for SVM classification. Remote Sensing of Environment 2004, 93 (1-2), 107-117.
128. Moreno, P. J.; Ho, P. P.; Vasconcelos, N., In A Kullback-Leibler divergence based kernel for SVM classification in multimedia applications, Advances in Neural Information Processing Systems, 2004; pp 1385-1392.
129. Bazi, Y.; Melgani, F., Toward an optimal SVM classification system for hyperspectral remote sensing images. IEEE Transactions on Geoscience and Remote Sensing 2006, 44 (11), 3374-3385.
130. Ye, P., In The decision tree classification and its application research in personnel management, Proceedings of 2011 International Conference on Electronics and Optoelectronics, IEEE: 2011; pp V1-372-V1-375.
131. Nazari, Z.; Kang, D., Density based support vector machines for classification. International Journal of Advanced Research in Artificial Intelligence (IJARAI) 2015, 4 (4).
132. Kecman, V., Learning and soft computing: support vector machines, neural networks, and fuzzy logic models. MIT Press: 2001.
133. Abe, S., Support vector machines for pattern classification. Springer: 2005; Vol. 2.
134. Vapnik, V., Statistical learning theory. Wiley, New York: 1998; pp 156-160.
135. Piao, G.; Guo, J.; Hu, T.; Leung, H.; Deng, Y., Fast reconstruction of 3-D defect profile from MFL signals using key physics-based parameters and SVM. NDT & E International 2019, 103, 26-38.
136. Pradhan, S. S.; Ward, W. H.; Hacioglu, K.; Martin, J. H.; Jurafsky, D., In Shallow semantic parsing using support vector machines, Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004, 2004; pp 233-240.
137. Barghout, L., Spatial-taxon information granules as used in iterative fuzzy-decision-making for image segmentation. In Granular Computing and Decision-Making, Springer: 2015; pp 285-318.
138. Statnikov, A.; Hardin, D.; Aliferis, C., Using SVM weight-based methods to identify causally relevant and non-causally relevant variables. Sign 2006, 1 (4).
139. Sharma, P.; Kaur, M., Classification in pattern recognition: A review. International Journal of Advanced Research in Computer Science and Software Engineering 2013, 3 (4).
140. Kamavisdar, P.; Saluja, S.; Agrawal, S., A survey on image classification approaches and techniques. International Journal of Advanced Research in Computer and Communication Engineering 2013, 2 (1), 1005-1009.
141. D'Angelo, G.; Rampone, S., In Shape-based defect classification for non destructive testing, 2015 IEEE Metrology for Aerospace (MetroAeroSpace), IEEE: 2015; pp 406-410.
142. Kwon, Y.; Won, J.-H.; Kim, B. J.; Paik, M. C., Uncertainty quantification using Bayesian neural networks in classification: Application to ischemic stroke lesion segmentation. 2018.