UNCERTAINTY QUANTIFICATION FRAMEWORK WITH INTERDEPENDENT DYNAMICS OF DATA, MODELING, AND LEARNING IN NONDESTRUCTIVE EVALUATION

By Zi Li

A DISSERTATION
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Electrical and Computer Engineering—Doctor of Philosophy

2023

ABSTRACT

Even after extensive efforts to enhance our understanding of materials, modeling, and system processes, uncertainty continues to be an inevitable factor that impacts system behavior, especially at the operational limits. The evaluation of uncertainty is now a common practice in engineering and scientific fields, encompassing the analysis of experimental data as well as numerous computational models and process simulations. Non-destructive evaluation (NDE) techniques are widely utilized across a range of industries and applications to guarantee the safety, quality, and dependability of components, systems, and structures. However, NDE processes are often challenged by uncertainties stemming from factors such as material variations, environmental conditions, and measurement limitations, which can introduce complexities into the assessment process. Therefore, there is a need to quantify uncertainties in NDE, which can enhance our comprehension of the constraints and potential inaccuracies linked to NDE inspections and aid in making NDE assessments more robust and reliable. In this thesis, a comprehensive uncertainty quantification (UQ) framework, the Three-Legged Stool (TLS), is proposed to provide systematic guidance in uncertainty analysis for NDE applications. A Magnetic Flux Leakage (MFL) based defect characterization algorithm is proposed to classify defects and handle uncertainties for pipeline inspection. The research compares Convolutional Neural Network (CNN) and Deep Ensemble (DE) methods for handling input uncertainties from MFL response data, while also employing an autoencoder for data augmentation to address limited experimental data. The study evaluates prediction accuracy and explores uncertainty analysis, emphasizing the importance of reliability assessment in MFL-based NDE decision-making. To estimate the fatigue life of martensitic-grade stainless-steel turbine blades, a magnetic Barkhausen noise (MBN) technique is applied. This work involves the extraction of time- and frequency-domain features, followed by the application of techniques such as Principal Component Analysis (PCA) and probabilistic neural networks (PNN) for classifying and estimating the remaining fatigue life. An IMU-assisted robotic structured light (SL) sensing system was developed for pipeline inspection. This system improves registration and defect estimation through a RANSAC-assisted cylindrical fitting algorithm, integrates inertial and odometry measurements for precise 3D profiling, and employs customized defect sizing techniques to offer a reliable 3D defect reconstruction solution for various defect shapes and depths. The proposed TLS-based UQ framework highlights the interdependent dynamics among data, models, and learning when addressing uncertainties in NDE processes. Some advanced and commonly used techniques have been introduced to illustrate how uncertainties in the inputs or parameters of an NDE system, model, or measurement are propagated to the outputs or predictions. Uncertainty propagation is considered in terms of the forward modeling and inverse learning processes separately.
In order to demonstrate the efficiency and applicability of the proposed framework for NDE applications, the uncertainties in the previously mentioned NDE cases are investigated and quantified using the techniques outlined in the TLS model. In summary, the proposed UQ framework is able to provide guidance in dealing with uncertainties in NDE inspection with efficient and reliable solutions. It holds great promise and opens up avenues for further research and advancement within the industry.

Copyright by ZI LI 2023

I would like to dedicate this work to my parents, my aunt's family, my friend, and my entire family for their unconditional support throughout my life's journey, as well as their encouragement to confront and overcome all challenges, which have contributed to shaping the person I am today.

ACKNOWLEDGEMENTS

I want to express my sincere appreciation to all those who have been by my side and provided assistance throughout my doctoral journey. I extend my thanks to my advisor and mentor, Dr. Yiming Deng, for his exceptional support, guidance, and motivation during my doctoral studies. He consistently believed in me and guided not only my research but also my career and personal life. It is a great fortune for me to work with such a knowledgeable and caring advisor. I am grateful to my doctoral committee: Dr. Lalita Udpa, Dr. Xiaobo Tan, Dr. Ming Han, and Dr. Chih-Li Sung, for their valuable time and suggestions. I'd also like to express my gratitude to my fellow colleagues at NDEL for their help, support, and the insightful discussions we've had. Lastly, I'd like to express my deep gratitude to my family and friends for their support and encouragement.

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
1.1 Uncertainty Quantification
1.2 Uncertainty Analysis in NDE
1.3 Objective and Scope

CHAPTER 2 UNCERTAINTY IN THREE-LEGGED STOOL FRAMEWORK
2.1 Uncertainty Sources in TLS UQ Framework
2.2 Uncertainty Propagation in Three-Legged Stool (TLS) UQ Framework

CHAPTER 3 FORWARD MODELING BASED UNCERTAINTY PROPAGATION
3.1 Methods of Uncertainty Propagation in Forward Modeling
3.2 TLS-based Forward Uncertainty Modeling Application
3.3 Conclusion

CHAPTER 4 INVERSE LEARNING BASED UNCERTAINTY PROPAGATION
4.1 Probability Theory for Inverse Uncertainty Quantification
4.2 Methods of Uncertainty Propagation in Physics-informed Learning
4.3 Methods of Uncertainty Propagation in Data-driven Learning
4.4 Methods of Uncertainty Propagation in Hybrid Learning
4.5 TLS-Based Inverse Uncertainty Learning Application
4.6 Conclusion

CHAPTER 5 RELIABILITY EVALUATION TO NDE PROCESS WITH UQ
5.1 Probability of Detection
5.2 GUM-based Measurement Uncertainty Evaluation
5.3 Magnetic Barkhausen Noise-based Material Fatigue Detection
5.4 Structured Light Sensing based Defect Reconstruction
5.5 Conclusion

CHAPTER 6 CONCLUSION AND FUTURE WORK
6.1 Conclusion
6.2 Future Work

BIBLIOGRAPHY

CHAPTER 1
INTRODUCTION

1.1 Uncertainty Quantification

Despite years of efforts to enhance knowledge of the material, modeling, and systems processes, uncertainty remains an unavoidable element affecting the behavior of systems, and more so with respect to their limits of operation. Uncertainty assessment is becoming increasingly common in fields of engineering and science and includes almost all experimental data processing as well as computational modeling and simulation processes. Therefore, uncertainty analysis is a key component of model-based risk analysis and decision-making, as it provides risk assessors and decision-makers with information about the reliability of model outputs [1]. Uncertainty quantification (UQ) involves the quantitative assessment of uncertainty from various sources (input variables, parameters, or equations) throughout the process and derives the uncertainty distribution for each output variable [2]. The purpose of this analysis is to determine how uncertainty in certain components (e.g., inputs, parameters, equations) translates into uncertainty in the final output. For example, it can be applied to compute the probability of an output variable of interest exceeding a certain threshold. The methodology and extent of uncertainty estimation, and result interpretation, vary widely depending on the nature and context of each assessment and the degree of uncertainty that exists. In the domain of prognostics and health management, uncertainties in mathematical and computational models of complex processes and data are analyzed following the steps of representation, quantification, and management [3–6]. Specifically, inspired by [7], the process for modeling and quantifying uncertainty can be divided into three main activities: Uncertainty Categorization, Uncertainty Handling, and Uncertainty Characterization and Estimation. Uncertainty quantification involves identifying and categorizing various sources of uncertainty that may affect prognostics and sensitivity analysis. It is an important step to incorporate these sources of uncertainty into models and simulations as accurately as possible. The goal of this step is to resolve each of these uncertainties separately and quantify them using probabilistic/statistical methods. The most commonly used forms of uncertainty sources are classified as aleatory (statistical) uncertainty and epistemic (systematic) uncertainty [8–10]. The former refers to the notion of variability and inherent uncertainty arising from the natural variability of the physical system. As opposed to aleatoric uncertainty, epistemic uncertainty can, in theory, be reduced on the basis of additional information, such as quality control [11], structural testing [12], non-destructive inspection [13], etc. There are several theories that guide uncertainty handling based on the interpretation of the input uncertainties and the detailed application. The most widely used and classical uncertainty theory is probability theory [14].
Under the probability principles, the input parameters are random variables represented using probability distributions. Specifically, within the context of the probability framework, the Bayesian approach provides a more suitable uncertainty interpretation based on prior knowledge and provides a way to update the probability of an event (posterior probability) given new or additional evidence [15]. Besides the empirical probability theory, there are works that shed new light on various interpretations of fuzzy sets and clarify their links with probability theory; conversely, Zadeh's logical point of view on fuzzy sets suggests a set-theoretic perspective on uncertainty measures that brings together numerical quantification and logic [16]. Evidence theory, also called Dempster–Shafer theory, is proposed as an alternative to classical probability theory to handle limited and imprecise data situations [17, 18]. Besides, some researchers extend the notion from risk to uncertainty by invoking the principle of bounded subadditivity: an event has greater impact when it turns impossibility into possibility, or possibility into certainty, than when it merely makes a possibility more or less likely. A series of studies provides support for this possibility theory in decisions under both risk and uncertainty and shows that people are less sensitive to uncertainty than to risk [19, 20]. Further, various theories can be combined to take advantage of the ability of each theory. Elishakoff et al. illustrate that the probabilistic analysis and the non-probabilistic, interval analyses are compatible with each other in an applied mechanics problem [21]. Chutia et al. combined imprecise probability and fuzzy knowledge and applied it to atmospheric dispersion with a case study [22]. Uncertainty characterization is then considered to provide a statistical description of the identified uncertainty sources based on the categorized uncertainty theory. Under the probability framework, if the probability distribution can be determined based on prior knowledge, it is considered a parametric approach in probabilistic handling; otherwise, for nonparametric methods, the distribution comes only from the data and does not follow a specific distribution [23, 24]. In the parametric approach, identifying the distribution fitting functions is the first step, which could be performed based on expert knowledge or a model selection method such as the Bayesian method [25, 26], the Bayesian information criterion [27], and the Akaike information criterion [28]. Next, some state-of-the-art tests, such as chi-squared [29] and Anderson–Darling [30], are applied to test the fit of the estimated distribution function. Note that in most scenarios, it is hard to have accurate knowledge of the input probability distribution. Also, if the random variable follows a nonparametric distribution or the given amount of data is insufficient, the nonparametric method is preferred to provide a more reliable prior probability distribution [31]. However, one distinct limitation of the nonparametric study is that its quality is highly correlated with the quality of the given data; therefore, one usually applies some hybrid technique, such as interval analysis, to overcome this problem [32]. Note that uncertainty characterization is the first stage of UQ; further, the input uncertainty will be propagated through the analytical or mathematical model to obtain random responses or output.
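To make the parametric characterization step above concrete, the following is a minimal sketch (not taken from this thesis) of fitting candidate distributions to a set of repeated inspection readings, comparing them with the Akaike information criterion, and checking the preferred fit with an Anderson–Darling test. Python with SciPy is assumed, and the data set and all numerical values are hypothetical.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    # Hypothetical repeated inspection readings (e.g., wall-thickness measurements in mm)
    data = rng.normal(loc=9.8, scale=0.15, size=60)

    # Parametric characterization: fit candidate distributions and compare by AIC
    candidates = {"normal": stats.norm, "lognormal": stats.lognorm}
    for name, dist in candidates.items():
        params = dist.fit(data)                       # maximum-likelihood parameter estimates
        loglik = np.sum(dist.logpdf(data, *params))
        aic = 2 * len(params) - 2 * loglik            # Akaike information criterion
        print(f"{name}: parameters = {np.round(params, 3)}, AIC = {aic:.1f}")

    # Goodness-of-fit check against the normal hypothesis (Anderson-Darling)
    ad = stats.anderson(data, dist="norm")
    print("A-D statistic:", round(ad.statistic, 3),
          "| 5% critical value:", ad.critical_values[2])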
The corresponding uncertainty analyses of the input, model, and output construct a full uncertainty quantification procedure [33]. In all state-of-the-art UQ-based techniques, the selection of uncertainty modeling and analysis will vary depending on the application and the problem that needs to be solved.

1.2 Uncertainty Analysis in NDE

Non-destructive evaluation (NDE) is an important aspect of any facility maintenance program, ensuring a product's integrity, reliability, safety, and long-term productivity by implementing a qualified testing and inspection program. NDE with UQ represents an advanced and holistic approach to the inspection and assessment of materials, components, and structures [34]. It goes beyond traditional NDE by incorporating the rigorous analysis and management of uncertainties that can influence the outcomes of inspections and measurements. By quantifying and addressing uncertainties, the limitations and potential errors in NDE results can be understood; therefore, more reliable and informed decisions can be made. Besides, it helps in the early detection of defects or potential failures, allowing for timely maintenance or replacement. Identifying uncertainties in UQ also contributes to cost savings and efficient resource utilization, and unnecessary maintenance or replacement can be avoided by distinguishing between false alarms and genuine issues. A successful uncertainty analysis of an inspection system requires a deep comprehension of the system itself. Normally, a quantitative NDE system can be described through the forward and inverse modeling approach. A full NDE system is usually analyzed through forward and inverse approaches, which are the fundamental components of the assessment and analysis of materials, components, and structures [35]. In the forward modeling process, established knowledge, models, and methods are used to predict or simulate the expected outcomes of NDE inspections or measurements and make predictions based on known factors such as material properties, inspection parameters, and equipment characteristics. The inverse learning process, on the other hand, involves the analysis of actual NDE data or measurement results to extract meaningful information to infer the material properties, defects, or characteristics of the component being inspected. Some optimization techniques or statistical methods are usually involved to match the observed data with a model, allowing for the identification of defects or property variations [36]. The approach and scope of uncertainty analysis, as well as the interpretation of results, exhibit considerable variability. This variation is contingent on the specific characteristics and context of each NDE and diagnostic procedure, as well as the extent of inherent uncertainty. In the realm of prognostics and health management, the analysis of uncertainties in intricate mathematical and computational models and data follows a structured process encompassing representation, quantification, and management, as highlighted in previous works [3–6]. Additionally, the choice of the NDE method employed is a critical consideration for researchers in proposing effective UQ solutions based on the specific application's requirements. For example, a comprehensive examination of the factors that affect measurement uncertainty in the context of Ultrasonic Testing (UT) has been proposed in [37].
They considered the influence of test material parameters and the choice of operating parameters, while also referencing relevant standards. In [38], the performance of the Guide to the Expression of Uncertainty in Measurement (GUM) and Monte Carlo (MC) methods is compared for the estimation of measurement uncertainty in Electromagnetic (EM) compatibility testing. A hybrid random/fuzzy approach for uncertainty quantification was proposed in electromagnetic modeling, which combines probability and possibility theory to properly account for both aleatory and epistemic uncertainty, respectively [39]. Janousek et al. reduced the uncertainty in depth estimation of partially conductive cracks with eddy current testing [40]. Beyond its traditional applications in material characterization and pipeline inspection, UQ-based NDE has found extensive usage in diverse fields such as medical imaging, geophysics, computer vision, and various other industries [41–43].

1.3 Objective and Scope

As demonstrated in the last section, the context of uncertainty interpretation varies for different applications, and there are multiple ways to describe the uncertainty propagation in different NDE techniques. In this dissertation, I propose a new framework to redefine UQ in the NDE scope by dividing the sources of uncertainty into data, model, and learning, which is called the UQ Three-Legged Stool (TLS) Framework. Related NDE applications will be discussed within this scope to validate the feasibility of the proposed structure, which provides great potential for further NDE-related uncertainty analysis. Chapter 2 demonstrates the proposed TLS UQ framework by explicitly explaining the uncertainty sources, which are called the "Legs" in this thesis work. Besides, based on the characteristics of the NDE process, a brief introduction to the propagation of uncertainty is presented. Chapter 3 describes the forward "Modeling" based propagation of uncertainty within the proposed uncertainty sources under the TLS framework. State-of-the-art uncertainty propagation techniques are applied to provide a comprehensive introduction to this process. Besides, a capacitive sensing-based experimental work is illustrated to validate the feasibility of the proposed TLS UQ framework. The relationship between the "Data" and the output signal is described through a "Learning" process based on a surrogate model, which provides a basis for studying the uncertainty propagation from the "Data" to the "Modeling". Chapter 4 addresses the inverse "Learning" based propagation of uncertainty within the proposed uncertainty sources under the TLS framework. Bayesian theory, as the most popular foundation for uncertainty analysis, is introduced, which provides a good basis for the further uncertainty propagation discussion. Effective methods are introduced to understand and analyze the involved uncertainty depending on the characteristics of different scenarios. Further, a magnetic flux leakage-based defect classification work is used as a practical example to quantify the uncertainty between "Data" and "Learning". The impact of uncertainties in learning and data is investigated through the Bayesian Convolutional Neural Network (BCNN) and the Deep Ensemble-based (DE) technique, together with data augmentation techniques. Chapter 5 introduces two typical approaches that have been extensively applied in NDE inspections. Probability of Detection (POD) is introduced to evaluate the effectiveness and reliability of defect detection under uncertainties.
Additionally, the discussion extends to the topic of measurement uncertainty analysis, which aims to assess and quantify the overall uncertainties associated with NDE measurements. Two NDE applications regarding measurement uncertainty analysis are presented as supporting examples. The first application is material fatigue detection and prediction using the Magnetic Barkhausen Noise (MBN) technique. In this work, the feasibility of the MBN technique is investigated in detecting early-stage fatigue, which is associated with plastic deformation in ferromagnetic metallic structures. The other is to quantify the measurement uncertainty in structured light sensing-based defect characterization. The uncertainties in this sensing system are investigated, and then the total reconstruction uncertainty and the measurement uncertainty are studied to illustrate the reliability of the proposed system.

CHAPTER 2
UNCERTAINTY IN THREE-LEGGED STOOL FRAMEWORK

In this chapter, possible sources and forms of uncertainty in the proposed Three-Legged Stool (TLS) framework are discussed to provide a thorough understanding of the influential parameters in the system. Uncertainties coming from Data, Modeling, and Learning lay a strong foundation for understanding how the uncertainties are propagated within the inspection process, which is presented in Fig. 2.1. In addition, definitions and supporting discussions of uncertainty-related examples will be presented in this chapter.

Figure 2.1 Illustration of UQ for NDE in TLS framework.

2.1 Uncertainty Sources in TLS UQ Framework

2.1.1 Data Uncertainty

Under the NDE context, "Data" is divided into two parts: input and inspection output. Inspection output, usually referred to as inspection data, is generated from simulation or experiment and is induced by the material under test or defects (if present). The uncertainties are considered mainly from three aspects: the inspected material, defect geometry, and sensing/experimental measurement. The research on material uncertainties is an important part of material integrity applications. In the NDE scope, the structural design needs to consider overstress, fracture, and fatigue failure mechanisms in the context of material degradation, which is a random process in which a crack may initiate. Uncertainties are highly related to incomplete knowledge of material property changes and should be investigated to ensure the reliability of the inspected material. Based on the literature review, the quantified data on material properties can be represented in the form of statistical ranges, in probabilistic form based on stochasticity, or in possibilistic form with fuzzy variables to schematically distinguish one material from another while significant variability was observed in the respective material properties [44]. For example, Taddei et al. investigated the effects of material uncertainty for low-to-high frequency vibration analyses of thin plates utilizing a statistical moment-based probabilistic approach [45]. Silva et al. quantified the variability in material properties of sisal fibers using a Weibull distribution, which is a form of probabilistic distribution, to correlate sisal microstructures with tensile strength [46]. In [47], finite element-based probabilistic and possibilistic methods are discussed and compared in the modulus of elasticity analysis.
Generally, the material properties are divided into microstructural properties, such as phase content and grain size, and mechanical properties, such as hardness, strength, residual stress, etc. For both microstructural and mechanical properties, ultrasonic testing (UT) is a powerful NDE technique to provide parameters that are correlated with grain size, hardness, yield strength, residual stress, etc. [48]; electromagnetic techniques are typical NDE methods to characterize material using the EM induction process. The interaction between the electromagnetic field and microstructural properties, such as grain boundaries and precipitations, as well as mechanical properties, is a good indicator of the material state [49]. Besides, some hybrid NDE methods, the combination of several NDE testing methods, perform well in characterizing material residual stresses, hardness, yield strength, etc. Those hybrid techniques are applied with respect to specific problems while considering the material's physical information and the characteristics of the measurement methods [50]. In the process of NDE-based defect identification, the forward problem is a direct method to obtain the response from the defect through experimental measurement or numerical simulation, while the solution of the inverse problem involves identifying the studied defect from the response. Therefore, uncertainties in the defect geometry size, shape, or orientation result in uncertainties in the associated inspected response, which further affect the overall inspection uncertainty. Various mathematical and analytical procedures are used to reduce uncertainty in estimating parameters of defect geometry. Houssem et al. addressed the effect of defect depth on response signals and then optimized 3D defects with eddy current measurements [51]. In [52], the variation of the defect volume, presented in different shapes, is directly related to the probability of failure and affects the system reliability for eddy current control. In an X-ray Computed Tomography (XCT) based flaw detection case, the effects of flaw locations and orientations in the phantoms are incorporated and investigated with a multi-level Bayesian model [53]. In some cases, defects manifest as changes in the physical properties of conductive materials. Therefore, conductivity is considered as a random variable expanded in a series of Hermite polynomials as one of the sources of uncertainty [54, 55]. During experiments, the inspection reliability depends not only on defect variability and material heterogeneity but also on the measurement quality, which is related to the noise resistance of the sensor and experimental setup, as well as the superimposed measurement noise from the operation of inspectors. The noise resistance during the simulation or experiment is a very important criterion in NDE inspection estimation [56]. Lift-off, which is the distance between the sensor and the inspected material, is one of the most common sources of uncertainty, brought about by cases like uneven surfaces, non-conducting coatings, or irregular paintings on the material [57]. Besides, there are other artifacts that affect the measured NDE data, such as the probe resolution and orientation, fluctuations in sampling rates, ambient electromagnetic disturbances, etc. [58–60]. The noise resistance from the aforementioned uncertainties will affect the sensitivity to capture changes in the material under test, which will further give rise to detection errors [61].
On the other hand, the operator-influenced measurement uncertainty can be divided into systematic and random components. Systematic measurement error is a consistent uncertainty related to incomplete knowledge of the inspection or setup, such as changes in frictional force between the probe and material surfaces as well as misalignment of the probe and the coordinates of the material under test. Random errors are occasional incorrect measurements that rarely happen. The Guide to the Expression of Uncertainty in Measurement (GUM) [62] is the most commonly used method to analyze measurement uncertainty and evaluate the measurement quality, which will be discussed in a later chapter.

2.1.2 Modeling Uncertainty

In the TLS framework, "Modeling" denotes describing the forward procedure from the input (inspected material) to the obtained NDE inspection data. NDE inspection methods primarily rely on scientific principles of physics to model the sensing problems, which range from capillary action (dye/fluorescent penetrant methods) to wave propagation (ultrasound, microwave, and terahertz methods) to high-energy interactions of elementary particles (radiography) [63]. The process of physics-based mathematical modeling is to decompose and refine the complex system and then attempt to reveal the basic principle behind a phenomenon. There are two types of uncertainties affecting the confidence of the given modeling procedure: parametric uncertainty and structural uncertainty. Parametric uncertainty arises because of insufficient knowledge of the predefined model parameters such as empirical quantities, initial conditions, boundary conditions, etc. [64]. The effective definition of the initial condition and the selection of the empirical quantity, which is considered an initial system property, is an important uncertainty factor for "Modeling" design [65]. Some researchers have proved that boundary conditions are a significant source of uncertainty in structural dynamics modeling, where even small changes in boundary conditions can cause significant changes in the model predictions, such as estimated model parameters, dynamic time-domain response, and frequency response functions [66–68]. Theoretically, parametric uncertainty can be identified and reduced by parameter refinement during the modeling, which is usually realized through a parametric study, such as the perturbation method [69], hierarchy method [70], and Neumann expansion method [71]. However, in most cases, without sufficient theoretical support coupled with complex physical systems or engineering problems, it is hard to access and restructure mathematical models directly to describe the NDE modeling procedure. In this case, some coarse or approximate equations are applied to simplify the true underlying physics. Since the underlying physics is hard to fully understand, the lack of complete knowledge will make the outputs of the simplified modeling deviate strongly from the true results; therefore, other than the parametric uncertainty, structural uncertainty will exist in the system as well. Based on the definition, the structural uncertainty is epistemic and can be reduced only by improving parameterized schemes, refining model dynamics, or implementing state-of-the-art numerical methods [64]. In the physics-based mathematical modeling procedure, the input and the material parameters are assumed known, and forward models can be developed to predict or estimate the output response function.
In contrast, there are cases where the measured data are obtained from undefined input functions or material properties, where the corresponding procedure can be described as empirical modeling. The uncertainty of empirical modeling cannot be identified and resolved directly; therefore, learning is applied as the inverse procedure for uncertainty investigation, which will be described in the next section.

2.1.3 Learning Uncertainty

Inversion techniques are used to obtain quantitative estimates of the size, shape, and natural properties of defects in materials based on either measured or computed experimental data. Usually, the inversion is hard to solve directly. Therefore, the approximate functional mapping between the input and the desired output can be described through a simplified model by fitting an optimal statistical distribution to data, which can be named a surrogate model. In the TLS framework, the process of constructing an optimized surrogate model is described as "Learning". Two kinds of learning techniques are defined for the surrogate model, depending on the availability of physical prior information for the measured or inspected data during the "Modeling" stage. Measured data from physics-based mathematical modeling provide good physical theoretical support in the inverse problem, whose process is identified as physics-informed learning. Otherwise, the inversion process of empirical modeling's output is called data-driven learning. Artificial intelligence (AI) based approaches (such as neural network clusters [72, 73], fuzzy logic [74, 75], etc.) and statistical approaches (such as regression-related methods [76, 77], the hidden Markov model [78], etc.) are commonly applied as data-driven learning. Meanwhile, Bayesian inference plays an important role in integrating the available physics model with the selection of estimation algorithms, such as the Bayesian method [79] and filtering-related methods [80, 81], in the scope of physics-informed learning [82]. The benefit of the physics-informed learning model arises from its use of physics parameters to assist in describing the damage behavior. In this case, the integration of the physical model and measured data is able to assist in identifying model parameters and predicting future behavior. Therefore, even with limited data, the detection and characterization quality can be ensured. However, when constrained by the knowledge available to describe the problem, information from previously collected data is needed to train the data-driven learning process to identify the characteristics of the currently measured damage state and to predict the future trend. It is a fitting process for constructing the surrogate model by optimizing the learning without incorporating complex analytical theory. Compared with the physics-based learning model, this is computationally cheaper to evaluate and compute, but it requires large efforts in training and optimization to find the optimal solution [83]. Besides, some works combine the concepts from both approaches as hybrid approaches to improve the performance [84, 85]. The aforementioned methods all have different properties that contribute to the preference of researchers' choice. As "Learning" is identified as the optimization process, the optimal model and parameters are usually determined through an iterative process to reach a reliable solution. Therefore, similar to uncertainties in "Modeling", structural and parametric uncertainty exist in "Learning".
Structural uncertainty is related to the selection of a learning model, and in physics-informed cases, an appropriate estimation model can ensure the optimal posterior distribution is learned and updated from the prior knowledge or information of the unknown parameters. When it comes to data-driven learning, the structural uncertainty can be highly reduced by applying an efficient model to ensure the relationship between the training input and output can be learned and accurately described. Regarding the parametric uncertainty, updating and refining the learning model's parameters are essential for uncertainty reduction during optimization. For example, the determination of the number of nodes, weight parameters, and initial values in NN-related data-driven learning [86], and the identification of and correlation to parameters of the likelihood function and mechanism [87] in the physics-informed case, are all important in ensuring the completeness and reliability of learning. Less uncertainty in learning gives better insight into uncertainties from the data itself.

2.2 Uncertainty Propagation in Three-Legged Stool (TLS) UQ Framework

Generally, an effective uncertainty propagation analysis of an NDE problem requires a thorough understanding of the system and detailed knowledge of all the influential parameters and their effects. The uncertainties during the NDE inspection process can be described through the forward "Modeling" and the inverse "Learning":

1. Forward Modeling: The output signal from NDE sensors conveys the uncertainties from the geometric and material parameters of the problem. The aleatoric uncertainty during this stage refers to the variations in material properties and geometry [88]. Uncertainties in material properties, such as hardness [89], strength [90], and micro-structure [91], are usually analyzed in material characterization applications. Besides, in the application of damage detection, aleatoric uncertainties come not only from the material itself but also from the geometric variance in defect design [92]. On the other hand, examples of epistemic uncertainty are related to the processing parameters for simulation as well as for experimental testing, for example, the mesh parameters in FEM-based simulation solutions [93], and the variability associated with setup procedures [94] and environmental noise [95] in the experimental inspection. The addressed variability will be incorporated in further modeling and analysis, which results in uncertainty propagation throughout the whole inspection system.

2. Inverse Learning: In this procedure, a mathematical or analytical framework is applied to obtain the predicted parameters that describe the system from the observed measurement or the simulated output of the forward procedure [96]. The output often provides information about physical parameters that we cannot directly observe or infer. The input uncertainty is not only propagated from the previous forward stage; there will also be additional variability associated with measurement procedures [97]. Besides, the application of the inversion model will bring epistemic uncertainty to this process, which is related to the learning model and its parameters.

All sources of uncertainty have an impact on system response, and the way these errors propagate and interact can affect the overall accuracy of NDE detection.
Therefore, the optimal design of NDE projects should quantify the uncertainty in system output performance propagated from uncertain inputs, which is described as Uncertainty Propagation (UP). A major problem when performing uncertainty analysis in many models is the large dimensionality of the uncertain parameters, which manifest themselves as multiple parameters in complex models, or as stochastic input fields. It is of great importance to decompose the uncertainty into unique, shared, and synergistic contributions originating from the different "Legs". The proposed TLS UQ framework is presented in Fig. 2.2. Specifically, forward UP modeling can be categorized into two primary problems, which depend on the presence or absence of a mathematical model. On the other hand, the assessment of the inverse learning process can be approached from the perspectives of physics-informed, data-driven, and hybrid scenarios. Detailed UP discussions for "Forward Modeling" and "Inverse Learning" will be presented in the following chapters.

Figure 2.2 Illustration of the proposed TLS framework.

CHAPTER 3
FORWARD MODELING BASED UNCERTAINTY PROPAGATION

3.1 Methods of Uncertainty Propagation in Forward Modeling

Considering the uncertainty in the input data, this section will describe how the uncertainty from the inputs is propagated through the model to affect the overall system response in the forward NDE process. For general UP problems, forward modeling can be classified into two main problems depending on the availability of a mathematical model: intrusive and non-intrusive. Theoretically, intrusive methods involve incorporating input uncertainties in the explicit model equations or algorithms; while non-intrusive methods treat the models as 'black boxes', which could introduce additional approximations during the uncertainty propagation process [98]. Moreover, if the input uncertainty could be described in terms of probability distributions or random variables within the model, the uncertainty propagation analysis is categorized as probabilistic. In contrast, there are situations where uncertainty inherently exhibits a fuzzy nature, and these are categorized as non-probabilistic approaches [99]. Popular UP methods have been categorized into different groups theoretically, such as simulation-based methods, local expansion-based methods, most probable point-based methods, functional expansion-based methods, numerical integration-based methods, and evidence theory [99]. Those methods have been widely investigated and compared based on their advantages and applications [100–106]. Based on the characteristics of different forward UP approaches, several popular UP methods are discussed and compared for different NDE applications. The Monte Carlo simulation (MCS) method, also known as random event simulation technology, is a popular simulation-based mathematical method in UP [107–109]. MCS applies random sampling from a specific probability distribution of the random variable to predict the output of a model. With a known simulation or computational model derived from physics, the combined output is then collectively analyzed to understand the statistical variability of the random system [110]. As the focus of MCS and its variant techniques in UP is to assess how variations in input parameters impact the model's output in terms of relative relation, they do not have a strict requirement for given physical models.
Even with an unknown true probability distribution of the stochastic input parameter, given repeated sampling from the input quantities following an assumed probability distribution, the numerical distribution of the outputs is still able to provide useful insights into uncertainty [111]. For NDE inspections, based on the characteristics of MCS, MCS and its variants have been widely used for investigating forward parametric-based uncertainty propagation. For example, MCS was applied in investigating uncertainties from material properties in Resonant Ultrasound Spectroscopy (RUS) inspection. Uncertainty bounds from MC analysis provide effective insights into how the uncertainty from modulus, density, Poisson's ratio, length, and diameter affects the measured results [112]. Matthias et al. integrated MC simulation, repeated thousands of times, into a ray tracing simulation system for flywheel rotors to understand how the uncertainties from each influencing parameter affect the final decision-making with a given distribution [113]. Theoretically, MCS is able to deal with high-dimensional uncertainty inputs or complex mathematical models with high accuracy, but it will be computationally intensive, which makes it impractical for real-time applications or for large-scale problems. There is an error in the statistics due to the error in each sample and due to the sampling error. Alternatively, we can use a small number of evaluations to approximate the response surface to realize UP, with methods such as Polynomial Chaos Expansion (PCE) and Stochastic Collocation (SC). PCE is a deterministic method that uses polynomials to represent the uncertain input variables and then employs these polynomial representations as inputs to solve the model or system [54]. The polynomial basis functions are chosen to be orthogonal with respect to the joint probability distribution of these random variables. Common choices of polynomials include Hermite, Legendre, or Laguerre polynomials, depending on the distribution of the inputs, such as normal, uniform, or exponential distributions, respectively. The model's response can be expanded as a series with expansion coefficients, which provide insight into the global sensitivity of the response with respect to the expansion variables, resulting in a set of deterministic equations. When there is probabilistic uncertainty in the system parameters, the Polynomial Chaos Expansion is able to propagate uncertainty through the model, while the uncertainty can be quantified by calculating the moments (mean, variance, etc.) of the expansion. Generally, methods for calculating PCE coefficients of model outputs are classified into intrusive and nonintrusive [114, 115]. In the intrusive approach, PCE needs to modify the specific mathematical model or simulation code by projecting the model on the polynomial basis, while for nonintrusive PCE, similar to MCS, the input uncertainty is approximated by the sampling strategy. Intrusive PCE offers more control and efficiency in UP but is difficult to apply to real-time problems with many parametric uncertainties. Comparatively, non-intrusive PCE is modeled with reduced-order polynomials, which provides more flexibility in real applications. Note that non-intrusive PCE is computationally expensive when many model evaluations are required; therefore, many variant approaches have been developed to reduce the number of collocation points in the PCE process and thus the computational costs [116, 117].
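As an illustration of the two ideas just described, the sketch below propagates a single standard-normal input through a toy model, first by plain Monte Carlo sampling and then by a regression-based (non-intrusive) PCE whose mean and variance follow directly from the expansion coefficients. Python with NumPy and the toy model are assumptions for illustration only, not the implementations used in this thesis.

    import numpy as np
    from math import factorial
    from numpy.polynomial import hermite_e as H

    rng = np.random.default_rng(2)

    # Toy forward model with one standard-normal input (hypothetical)
    def model(xi):
        return np.exp(0.3 * xi) + 0.1 * xi**2

    # Plain Monte Carlo propagation
    y_mc = model(rng.standard_normal(100_000))
    print("MC   mean:", y_mc.mean(), "std:", y_mc.std())

    # Non-intrusive PCE: regression on probabilists' Hermite polynomials He_0..He_5
    deg = 5
    xi = rng.standard_normal(200)                    # sampled experimental design
    Psi = H.hermevander(xi, deg)                     # evaluated basis functions
    coef, *_ = np.linalg.lstsq(Psi, model(xi), rcond=None)

    # Moments follow from the coefficients: E[He_k^2] = k! under the standard normal
    mean = coef[0]
    var = sum(coef[k] ** 2 * factorial(k) for k in range(1, deg + 1))
    print("PCE  mean:", mean, "std:", np.sqrt(var))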
Non-intrusive PCE and its variants are efficient UP methods, which have been applied in material and defect characterization within NDE-related inspection, such as UT, EC, and EM [118–121]. Moreover, PCE has been widely applied for meta-modeling (surrogate or approximate models) using experimental or simulation data when the underlying mathematics is hard to compute or unavailable [122]. Meta-modeling replaces a complex, time-consuming computational model with a statistically equivalent, computationally inexpensive model by learning from a relatively small set of inputs [123]. Other common meta-models in UP include Kriging [124], Polynomial chaos Kriging [33], and Canonical low-rank approximations (LRA) [125], which have been widely applied and compared in NDE applications, such as NDE simulation modeling, sensitivity analysis, and defect detection [111, 126, 127]. Within the UP discussion, Stochastic Collocation (SC) is another stochastic expansion technique comparable to PCE. It is also a non-intrusive method to propagate uncertainties, where the collocation is evaluated at a fixed set of realizations, which are further used to approximate quantities. Different from PCE, where the polynomial coefficients are estimated for known orthogonal polynomial basis functions, SC relies on Lagrange interpolation functions to derive the expansion polynomials. The polynomial approximation from SC allows for interpolation and extrapolation of model responses between and beyond the collocation points, which is particularly useful for estimating responses at untested parameter values and dealing with high-dimensional NDE problems [128]. If there is a lack of physics information due to ignorance or at the early stage of product development, the uncertainty in data is inherently fuzzy. When the probabilistic nature of uncertainty is not well understood or justified, there are several useful non-probability-based methods to deal with non-probabilistic uncertainty. Common methods, such as fuzzy logic, allow for the representation of uncertainty using fuzzy sets and linguistic variables. Fuzzy sets can be used to describe the imprecise or qualitative nature of uncertainty. Fuzzy logic is a nonlinear mapping of an input feature vector into a scalar output, which can be expressed as a linear combination of fuzzy basis functions [129]. By integrating descriptive knowledge and numerical data into a fuzzy model, approximate models can be applied to describe the relationships between the fuzzy sets of input parameters and the fuzzy sets of output parameters for UP analysis. Fuzzy logic-based UP has been widely applied in NDE-based structural health monitoring applications, usually realized with common model approximation techniques (e.g., generalized polynomial chaos, Monte Carlo simulation, etc.) [130–132]. Besides, interval analysis and evidence theory are also commonly used nonprobabilistic UP methods in NDE inspection [96, 133, 134]. Overall, only some of the popular NDE-based forward uncertainty propagation techniques are discussed here, and there is no single correct way to deal with specific problems.
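As a brief contrast to the probabilistic methods above, the following is a minimal sketch of the simplest form of the interval analysis just mentioned: when only bounds on the inputs are known and the model is monotone in each input, the output bounds follow from evaluating the model at the interval corners. The model and the numerical bounds are hypothetical and are not drawn from the thesis.

    # Interval (non-probabilistic) propagation through a simple monotone model
    def forward(thickness, conductivity):
        # Hypothetical monotone response: larger conductivity or thinner wall -> larger signal
        return conductivity / thickness

    thickness_bounds = (9.5, 10.5)          # mm, interval on wall thickness
    conductivity_bounds = (5.5e7, 6.0e7)    # S/m, interval on conductivity

    # For a model monotone in each input, corner evaluations bound the output exactly
    corners = [forward(t, c) for t in thickness_bounds for c in conductivity_bounds]
    print("output interval:", (min(corners), max(corners)))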
Depending on the nature and amount of available data, the selection of appropriate UP approaches can be determined by considering the following aspects:

• Input data characteristic: probabilistic or non-probabilistic; low-dimension or high-dimension;

• Uncertainty characteristic: parametric or non-parametric;

• Model availability: intrusive or non-intrusive; physics-informed or evidence-related.

Usually, to provide a more reliable UP analysis, more than one method is integrated when dealing with different applications. Overall, forward modeling uncertainty propagation can assist in identifying the most influential sources and parameters that affect the uncertainty, evaluating the level of confidence and risk, and prioritizing them for further investigation or system design improvement. In the next section, a Monte Carlo-based "Forward Modeling" uncertainty propagation is illustrated on a capacitive sensing system.

3.2 TLS-based Forward Uncertainty Modeling Application

3.2.1 Introduction

Microwave imaging systems, such as capacitive imaging systems, have been and still are routinely used to characterize and image defects in composite materials [135–137]. In order to understand how uncertainty in data affects modeling performance, the process of error propagation is formulated and evaluated through a microwave-sensing-based experiment. The shortwave capacitive probe's low frequency enables it to penetrate deeply into the layers of carbon fiber, revealing defects at greater depths. One example is the parallel plate capacitor probe, which excels at detecting variations in dielectric permittivity. The sensing system under investigation employs capacitive sensors that can identify changes in the dielectric coefficient. This capability enables the measurement of high-resolution contrast images for detecting and identifying defects in carbon fiber-reinforced polymer composites (CFRP). A widely used dielectric probe incorporates an inductance-capacitance (LC) circuit, forming a resonant tank. In this design, the inductance and capacitance are considered constant, while the resonant frequency shift relies entirely on an unfixed variable. The LC tank probe used in this study is detailed in [138]. It includes an I/Q demodulator and a local oscillator (LO) to capture the reflected power, from which the phase can be calculated using the total impedance Z. The LC tank probe architecture is known for its significantly higher Q factor, which results in increased sensitivity for detecting changes in dielectric permittivity (ε) and inferring the defect's location and size. In this Uncertainty Quantification (UQ) study, our focus is solely on the process of inferring impedance. The probe's impedance is defined as follows:

Z = \sqrt{R^2 + (X_L - X_C)^2}    (3.1)

where

X_L = 2\pi f_0 L    (3.2)

and

X_C = \frac{1}{2\pi f_0 C}    (3.3)

whereas the Q factor for an LC tank is simplified as [138]:

Q = \frac{2\pi f_0 L}{R}    (3.4)

where f_0 is the resonant frequency of the capacitive probe, at which, with the highest Q factor, the reflection coefficient of the probe is close to a maximum, allowing total reflection. The resistance R is a negligible constant, which is related to the sensor itself. The introduction of the inspected material changes the resonant frequency according to the Q value and produces a response to the dielectric value of common defects. The designed ring separation d of the applied LC tank resonance probe is 2 mm.
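To make Eqs. (3.1)–(3.4) concrete, the following is a minimal numerical sketch of the LC tank response, using the probe design values quoted in the next paragraph (L = 75 µH, f_0 = 5 MHz) and a small hypothetical series resistance R; it only illustrates how a capacitance change, for example one induced by a liftoff change, shifts the total impedance away from its resonant minimum, and it is not the measurement code used in this work.

    import numpy as np

    f0 = 5e6      # resonant frequency, Hz (probe design value)
    L = 75e-6     # inductance, H (probe design value)
    R = 0.5       # series resistance, ohm (hypothetical small constant)

    def impedance(f, C):
        """Total impedance of the LC tank, Eq. (3.1)."""
        XL = 2 * np.pi * f * L                 # Eq. (3.2)
        XC = 1.0 / (2 * np.pi * f * C)         # Eq. (3.3)
        return np.sqrt(R**2 + (XL - XC) ** 2)

    # Capacitance that places the tank exactly on resonance (XL = XC at f0)
    C0 = 1.0 / ((2 * np.pi * f0) ** 2 * L)
    Q = 2 * np.pi * f0 * L / R                 # Eq. (3.4)
    print(f"C0 = {C0 * 1e12:.1f} pF, Q = {Q:.0f}, Z(f0, C0) = {impedance(f0, C0):.2f} ohm")

    # A liftoff change perturbs the effective permittivity and hence C (see Eqs. 3.5-3.6 below),
    # pushing the impedance away from the resonant minimum.
    for scale in (0.9, 1.0, 1.1):              # hypothetical +/-10 % capacitance change
        print(f"C = {scale:.1f} C0 -> Z = {impedance(f0, scale * C0):.1f} ohm")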
The probe is designed to have a fixed inductance value L of 75 μH, and the resonant frequency is determined at 5 MHz to realize the total reflection. Therefore, according to Eq. 3.1, any change in total impedance is related to the capacitance C. Based on the work from [], the effective permittivity ε_r of the sensing field of view is proven to be a function of the sensor's spatial location. Since the design of the parallel plates has a fixed area, the variance in spatial location is referred to as the variation in sensor liftoff (lo). Specifically, with a different sensor liftoff, the permittivity component under the inspected region will change accordingly, which is described in Fig. 3.1.

Figure 3.1 The relation between the liftoff and effective permittivity.

Theoretically, the sensor capacitance is positively related to the relative permittivity ε_r, which can be described as:

\epsilon_r(\vec{r}) \propto \frac{1}{lo}    (3.5)

C(\vec{r}) \propto \frac{A \epsilon_0 \epsilon_r(\vec{r})}{d}    (3.6)

where ε_0 is the permittivity of vacuum; A is the electrode surface area; and d is the separation distance between the two electrode rings, which is determined by the sensor design and configuration. Therefore, combining Eq. 3.5 and Eq. 3.6, there exists a negative correlation between the sensor capacitance C(\vec{r}) and the sensor liftoff lo, which is defined as:

C(\vec{r}) = f(lo)    (3.7)

The function f(.) is unknown in this case, so in the next section, several meta-modeling methods are applied to construct a reliable mathematical model to describe this process based on the experimental data in a simplified manner, for further forward uncertainty propagation investigation.

3.2.2 Meta-Modeling based "Empirical Modeling" in Capacitive Sensing System

3.2.2.1 Experimental Setup

To build up a true model describing capacitance versus liftoff, 39 groups of measurements are recorded with different sensor liftoffs. The experimental setup is shown in Fig. 3.2: the sensor is attached to an AGS1000 Direct-Drive Gantry for accurately controlling the liftoff changes, while an impedance analyzer is connected to obtain the real-time capacitive reactance reading. The liftoff change ranges from 0.1 mm to 1.3 mm at 0.1 mm intervals and is repeated three times.

Figure 3.2 Experimental setup.

The relation between the capacitance and sensor liftoff is presented in Fig. 3.3 with example readings, where a 2nd-order polynomial fitting curve with a 95% confidence curve is applied for a preliminary illustration.

Figure 3.3 Example of experimental data for capacitance vs. liftoff.

For building a more robust computational model to describe this correlation, several functional approximations are discussed in the next section.

3.2.2.2 Surrogate Modeling Methods

Meta-modeling (surrogate modeling) is a process for reducing the associated computational costs by substituting an expensive computational model with inexpensive-to-compute surrogate models. It is an efficient modeling process that learns from a relatively small set of inputs and corresponding model responses to generate a high-confidence mathematics-based approximation, which has been applied in a wide variety of engineering contexts [138–140]. The meta-model M^{Meta}(x) can be considered a statistical approximation of the original finite-variance computational model Y = M(x). In this study, three popular surrogate modeling methods are applied to describe the experimental data: Kriging, Polynomial Chaos Expansion (PCE), and Canonical low-rank approximations (LRA).
Kriging is a stochastic interpolation algorithm that considers the computational model as a realization of a Gaussian process, indexed by the parameters in the input space x. It can be described as [141, 142]:

y = M(x) \approx M^{K}(x) = \beta^{T} f(x) + \sigma^{2} Z(x, w), \quad x \subset R^{M}    (3.8)

where \beta^{T} f(x) is the mean value of the Gaussian process, which is usually named the trend; \sigma^{2} is the variance of the Gaussian process; and Z(x, w) is a zero-mean, unit-variance, stationary Gaussian process which is characterized by a correlation function R and its hyperparameters \theta. The selection of the functional basis of the Kriging trend is an important part of building up the Kriging model. In this work, the linear trend, one of the most commonly used polynomial basis trends, is applied, where the mean value in Eq. 3.8 can be expressed as:

\beta^{T} f(x) = \beta_{0} + \sum_{i=1}^{M} \beta_{i} x_{i}    (3.9)

Besides, the process for estimating the unknown hyperparameters from the available data is considered an optimization problem, which is important for calculating the other unknown Kriging parameters (e.g., \beta) [142]. In this work, the Covariance matrix adaptation–evolution strategy (CMA-ES) [143] is applied for solving the optimization problem. It is a derandomized stochastic search algorithm, in which the covariance matrix of a normal distribution is adapted so that search steps that improved the objective function in recent past iterations are more likely to be sampled again. Polynomial Chaos Expansion (PCE) has proven to be a powerful tool for developing meta-models in a wide range of applications, such as structural systems [144], computational dosimetry [145], sensitivity analysis [146], and so on. PCE is achieved by expanding the model response onto a basis consisting of multivariate polynomials obtained as tensor products [147]. Consider a random input vector X with independent components expressed by the joint PDF f_X; the finite-variance computational model can be expressed with the polynomial chaos expansion:

y = M(x) \approx M^{PCE}(x) = \sum_{\alpha \in A} y_{\alpha} \Psi_{\alpha}(x)    (3.10)

where \Psi_{\alpha}(x) are multivariate polynomials orthonormal with respect to f_X, A \subset N^{M} is the set of selected multi-indices of multivariate polynomials, and y_{\alpha} are the corresponding coefficients. Canonical low-rank approximation (LRA), also known as separated representations, has recently been applied as a promising tool for effectively dealing with high-dimensional model input. The key idea is to approximate the response quantity of interest by the sum of a small number of appropriate rank-one tensors that are products of univariate functions. Under the scope of meta-modeling, LRA is built with polynomial functions. Similar to PCE, the model response is expanded over an orthogonal multivariate polynomial basis obtained as the tensor product of the univariate polynomials in the input parameters. By employing the definition of canonical rank and expanding the univariate functions onto a polynomial basis, the rank-R approximation of y = M(x) can be written as:

y = M(x) \approx M^{LRA}(x) = \sum_{l=1}^{R} b_{l} \left( \prod_{i=1}^{M} \left( \sum_{k=0}^{p_{i}} z_{k,l}^{(i)} \phi_{k}^{(i)}(x_{i}) \right) \right)    (3.11)

where \phi_{k}^{(i)}(x_{i}) is the kth-degree univariate polynomial in the i-th input variable, p_{i} is the corresponding maximum degree, z_{k,l}^{(i)} is the coefficient of the univariate polynomial in the l-th rank-one component, and b_{l}, l = 1, ..., R, are scalars that can be viewed as weighting factors. Comparing Eq. 3.10 and Eq. 3.11, they use the same univariate polynomial family as the basis, while the main difference is that LRA retains the tensor-product form and PCE uses the spectral expanded form.
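As a small illustration of the Kriging idea described above, the sketch below fits a Gaussian-process regression model (a common Kriging implementation) with a squared-exponential correlation to synthetic liftoff-capacitance data and returns predictions with their standard deviations at new liftoff values; the response normalization stands in for an explicit trend. scikit-learn, the synthetic data, and all numerical values are assumptions for illustration only; the surrogate models of this work were built from the measured data described in the previous section.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import ConstantKernel, RBF

    rng = np.random.default_rng(3)

    # Synthetic liftoff (mm) vs. capacitance (pF) data standing in for the measurements
    lo = np.linspace(0.1, 1.3, 13).reshape(-1, 1)
    cap = 14.0 / (lo.ravel() + 1.0) + rng.normal(0.0, 0.05, lo.shape[0])

    # Gaussian-process (Kriging-type) surrogate with a squared-exponential correlation
    kernel = ConstantKernel(1.0) * RBF(length_scale=0.3)
    gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-3, normalize_y=True)
    gp.fit(lo, cap)

    # Predict capacitance (with uncertainty) at new, untested liftoff values
    lo_new = np.array([[0.45], [0.75]])
    mean, std = gp.predict(lo_new, return_std=True)
    for x, m, s in zip(lo_new.ravel(), mean, std):
        print(f"liftoff {x:.2f} mm -> C ~ {m:.3f} pF +/- {2 * s:.3f} (2 sigma)")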
For building the above meta-models, 70% of the available experimental data is split off as the training set, while the rest is used for validation to evaluate the models. After creating the meta-models $M^{Meta} = [M^{K}, M^{PCE}, M^{LRA}]$, it is of interest to predict the model $M(x)$ at a new point $x$, given the (observed) experimental design $x = \{lo^{(1)}, ..., lo^{(N)}\}$ and the corresponding model responses $y = \{X_C^{(1)} = M(lo^{(1)}), ..., X_C^{(N)} = M(lo^{(N)})\}$. After all three meta-models are set up, the comparative modeling results for a new set of inputs are shown in Fig. 3.4.

Figure 3.4 Comparison of meta-modeling results for one example validation data set.

Results show that all three models can reveal the correlation between $lo$ and $C$. The variation between $Y_{True}$ and the modeling results is very small. To further evaluate the performance of all meta-models in a statistical way, the relative generalization error on a new set of independent data is considered through the validation error for the same validation data set. With the input pairs defined as $(x^{(i)}_{val}, y^{(i)}_{val}), i = 1:N_{val}$, the validation error can be written as:

err_{val} = \frac{N_{val}-1}{N_{val}} \left[ \frac{\sum_{i=1}^{N_{val}} \left( M(x^{(i)}_{val}) - M^{meta}(x^{(i)}_{val}) \right)^{2}}{\sum_{i=1}^{N_{val}} \left( M(x^{(i)}_{val}) - \hat{\mu}_{Y_{val}} \right)^{2}} \right] \quad (3.12)

where $\hat{\mu}_{Y_{val}} = \frac{1}{N_{val}} \sum_{i=1}^{N_{val}} M(x^{(i)}_{val})$ is the sample mean of the validation set responses. In this work, the prediction is repeated five times with different liftoff-capacitance combinations for validating each meta-model, and the final validation error is obtained by averaging over all five repetitions. The mean validation error is presented in Table 3.1.

Table 3.1 Mean Validation Error for Three Meta-Models

Meta-model   Kriging   PCE      LRA
Mean Error   0.0005    0.0009   0.0015

According to the calculated mean errors, all three methods show high fidelity in learning from the input pairs and making predictions. The meta-models, especially the Kriging model, provide a reliable substitute model describing the relationship between the liftoff and the capacitance, which is a good basis for the uncertainty propagation in the next section.

3.2.3 Uncertainty Propagation from Data to "Modeling"

In real experiments, because of the roughness of the material under test, the sensor liftoff is not strictly constant, which is considered as uncertainty imposed on the data. As mentioned before, the liftoff variance affects the permittivity, which further changes the resultant capacitance and total impedance. Therefore, in this section, we investigate how the liftoff-induced data uncertainty propagates to the modeling results. Based on the previous discussion, the meta-model $M^{Meta}$ is applied as an approximate mathematical model substituting for the complex unknown true model $M(x)$ to realize the uncertainty propagation.

The Monte Carlo (MC) method is a sampling-based approach that has been widely used for quantification and propagation of uncertainties [107-109]. As a classical UQ analysis method, MC is non-intrusive, robust, and simple to implement [111]. The process of "Modeling" is considered as a black box that is evaluated at random samples of the input probability distribution. The outputs at these realizations are then used to approximate quantities such as the expectation or variance.
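The generic Monte Carlo propagation workflow just described can be sketched as follows; the capacitance surrogate and all numbers are hypothetical placeholders (the real case, with the liftoff distribution and impedance expression used in this work, follows below), so the printed statistics will not reproduce the dissertation's results.

import numpy as np

rng = np.random.default_rng(0)

L = 75e-6                      # probe inductance, 75 uH
f = 5e6                        # operating frequency, 5 MHz
omega = 2 * np.pi * f

def capacitance_surrogate(lo_mm):
    # Placeholder monotone-decreasing C(lo) in farads; in practice this would be
    # the fitted Kriging/PCE/LRA meta-model M^Meta(lo).
    return 13.5e-12 / (0.8 + lo_mm)

# Assumed Gaussian input uncertainty on the liftoff, sampled many times.
lo_samples = rng.normal(loc=0.5, scale=0.15, size=10_000)

C = capacitance_surrogate(lo_samples)
Z = omega * L - 1.0 / (omega * C)          # total impedance, neglecting resistance R

print(f"mean(Z) = {Z.mean():.3f} ohm, var(Z) = {Z.var(ddof=1):.3f}")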
When examining real CFRP pipes, we can assume that the standard liftoff is 0.5 mm. However, irregular bumps on the pipe surface introduce variations to the liftoff. We can characterize this variance as following a normal distribution with a known mean of 0.5 mm and a standard deviation of 0.15 mm. Consequently, the input probability distribution can be modeled as a Gaussian distribution. To analyze the system, we sample the inputs 10,000 times, which enables us to derive the distributions and statistics of the final output, represented by the total impedance $Z$. Utilizing Eq. 3.1 to Eq. 3.3 and disregarding the resistance $R$ of the sensor itself, the output can be described as:

Z = X_{L} - X_{C} = \omega L - \frac{1}{\omega M^{Meta}(lo)} \quad (3.13)

Figure 3.5 Illustration of the uncertainty propagation process: a) probability distribution of the input (liftoff); b) probability distribution of the output from PCE meta-modeling (total impedance).

The process of uncertainty propagation is depicted in Figure 3.5. This process reveals how uncertainty stemming from the liftoff is carried over to the "Modeling" output. The resultant output distributions from the three proposed meta-models are displayed in Table 3.2. All of the meta-models yield similar mean and variance values. By averaging their results, it is determined that the probability distribution of the total impedance $Z$ follows a normal distribution with parameters $N(721.187, 63.3013)$.

Table 3.2 Output probability distribution of the three meta-models

Meta-model   Mean      Variance
Kriging      722.599   63.7365
PCE          720.706   63.0021
LRA          721.187   63.1652

3.3 Conclusion

This chapter delves into the critical aspect of uncertainty propagation in the context of forward "Modeling" within the proposed TLS UQ framework. Recognizing the diversity of forward UP approaches, several widely employed UP methods are presented with a detailed comparison across various NDE applications. These methods encompass Monte Carlo Simulation (MCS), Polynomial Chaos Expansion (PCE), and Fuzzy Logic, each tailored to suit different scenarios and requirements. To exemplify the potency of the forward "Modeling" scheme in action, we focus on a practical case study centered on capacitive sensing, an NDE method with significant real-world applications. The case study outlines the application of MCS, emphasizing its role in unraveling the ramifications of liftoff uncertainty on the critical output parameter, the total impedance $Z$. This in-depth analysis serves as a concrete illustration of the uncertainties introduced and their impact within this specific NDE context.

CHAPTER 4 INVERSE LEARNING BASED UNCERTAINTY PROPAGATION

In this chapter, UP is addressed in the inverse Learning process, which is an essential part of the proposed TLS UQ framework. Fundamental probability theory is first introduced to establish the groundwork for the subsequent UP discussions and application examples.

4.1 Probability Theory for Inverse Uncertainty Quantification

The dominant view of dealing with uncertainty assumes that expectations are based on statistical analyses of past data and market signals that provide information on objective probabilities. From another perspective, probability is a mathematical concept that allows predictions to be made in the face of uncertainty. The goal of statistical inference, especially for the UQ problem, is to draw conclusions about the distribution of a random variable ($X$) based on a particular statistical index ($\theta$). This process can be described by a statistical model $f(X; \theta), \theta \in \Theta$, where $f(x; \theta)$ denotes a probability mass function (pmf) or a probability density function (pdf) and $\Theta$ is the parameter space.
Full knowledge of the true value of $\theta$ is equivalent to a complete understanding of the distribution of interest. However, due to incomplete knowledge or numerical approximation errors in the underlying physics, $\theta$ often deviates from the true value. Consequently, there is a need to infer $\theta$ as part of the inference problem. There are two most popular paths, commonly known as the frequentist and the Bayesian approach.

Frequentist models are objective, where probabilities are defined as the frequency with which an event occurs if the experiment is repeated a large number of times. In the frequentist approach, the parameter $\theta$ is an unknown but fixed quantity, and only the information coming from the sampling data is relevant for inference; the prior belief on the parameter is not taken into account [148]. Specifically, random data samples are collected from a consistent and repeatable process to find the $\theta$ that maximizes the likelihood $f(x|\theta)$ and thus the optimal solution. To statistically build confidence intervals in the estimation process, one constructs estimators such as OLS estimators or maximum likelihood estimators [149].

While frequentist approaches rely on ensembles of models to approximate the posterior distribution empirically, Bayesian methods can directly estimate the posterior distribution over the model's parameters. In contrast to frequentist theory, Bayesian theory treats probabilities as a distribution of subjective values. Priors in Bayesian modeling are critical for providing information about past experience in a sample space, where the prior distribution $\pi(\theta)$ is updated on the basis of the likelihood function through Bayes' theorem. Parameters $\theta$ are viewed as random variables with associated densities, which are quantified based on the observations $x = [x_1, x_2, ..., x_n]$, and the solution to the parameter estimation problem is the posterior probability density, which summarizes the information in both the prior distribution and the data. Let $f(x|\theta)$ denote the conditional distribution of the sample given $\theta$, which is equivalent to the likelihood function. We can apply the classical Bayes' theorem to derive the posterior distribution, denoted as $p(\theta|x)$. This posterior distribution represents the updated knowledge about $\theta$ and is determined based on the observed sample $x$:

p(\theta|x) = \frac{f(x|\theta)\,\pi(\theta)}{m(x)} \quad (4.1)

where $m(x)$ is the marginal probability density function (PDF) of the data, acting as a normalization factor, which can be written as:

m(x) = \int_{-\infty}^{\infty} f(x|\theta)\,\pi(\theta)\, d\theta \quad (4.2)

The mean, which is often used to summarize the posterior distribution of the parameter, is a Bayesian point estimator of $\theta$. Sometimes, the mode or the median of the posterior distribution is used instead. Even when starting from different prior distributions, continued observations eventually force the conclusions to converge toward the same region of the parameter space $\Theta$, which demonstrates how prior beliefs should be revised by observing data [150]. Bayesian models have considerable flexibility in incorporating and modeling multiple sources of uncertainty. Standard inclusions are parameter uncertainty, residual process error, and observation uncertainty [149]. As it provides a density that can be propagated through the model, the Bayesian approach can provide greater value for model uncertainty quantification, which will be discussed in the next chapter.
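A minimal numerical sketch of Bayes' theorem (Eqs. 4.1-4.2) on a parameter grid is given below; the toy setting (a detection probability $\theta$ with a Beta-shaped prior and 7 detections in 10 hypothetical inspections) is purely illustrative and not drawn from any data in this work.

import numpy as np
from scipy.stats import binom, beta

theta = np.linspace(1e-3, 1 - 1e-3, 999)      # grid over the parameter space Theta
dtheta = theta[1] - theta[0]

prior = beta.pdf(theta, a=2, b=2)             # prior pi(theta), assumed
likelihood = binom.pmf(7, n=10, p=theta)      # f(x | theta) for 7 detections in 10 trials

unnorm = likelihood * prior
posterior = unnorm / (unnorm.sum() * dtheta)  # divide by m(x), the normalization of Eq. 4.2

posterior_mean = (theta * posterior).sum() * dtheta   # Bayesian point estimator of theta
print(f"posterior mean of theta ~= {posterior_mean:.3f}")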
Both Bayesian and frequentist methods offer substantial flexibility when it comes to incorporating and modeling multiple sources of uncertainty in the inverse NDE process. Typically, these sources include parameter uncertainty, residual process error, and observation uncertainty [149, 151], often represented as $\delta(x)$. These methods are advantageous because they provide a density that can be propagated throughout the model. Consequently, statistical-inference-based approaches prove to be highly valuable for conducting comprehensive uncertainty propagation analyses from 'Learning' to 'Data'. The physics basis is important for improving the reliability of the inverse analysis, especially for ill-posed NDE problems. As mentioned in the previous section, the 'Learning'-based UP process can be categorized based on the availability of physics information. Organizing all kinds of UP methods into a uniformly structured taxonomy is challenging, since there are numerous potential approaches to consider. The following sections introduce several classical and popular uncertainty propagation methods from measurements or observation data to determine the variations in the posterior distribution during 'Learning' in NDE applications. They are classified with regard to the type of distribution hypothesis for NDE problems, as presented in Fig. 4.1.

Figure 4.1 Overview of Learning-based uncertainty propagation methods in NDE.

4.2 Methods of Uncertainty Propagation in Physics-informed Learning

If the physics-based governing equations and essential conditions describing the inverse NDE process could be fully understood, it would be considered a direct inverse NDE solution. Typically, Kubo et al. summarized the information indispensable for constructing a solid direct inverse process: the domain boundary, the governing equation of the physical quantity, the force or source term, and the material properties [152]. With a full understanding of that information, the desired output could be obtained with traditional analytical or numerical techniques such as the finite element method, the boundary element method, and the finite difference method. In this case, the structural uncertainty can be omitted, while the parametric uncertainty from measurement data can be investigated. W. Lord investigated three partial differential equation types to illustrate the use of numerical methods in developing efficient inverse NDE applications, including magnetic flux leakage based on the elliptic Poisson equation, eddy current testing with the parabolic diffusion equation, and ultrasonic inspection with the hyperbolic wave equation. The paper highlighted that a comprehensive understanding of the forward problem is essential for the development of effective inverse algorithms [153]. For example, ultrasonic testing needs known properties and geometry to detect defects [154]; X-ray radiography requires knowledge of the X-ray source parameters and material properties to obtain an expected image or radiograph of an object [155]; and eddy current testing requires known coil properties and defect geometry to determine the electromagnetic field distribution in a conductive material [156]. A typical way of investigating uncertainty propagation in direct NDE inverse problems is to compare the differences between the measured and calculated signals by considering the variations in the given physics information. For example, K.
Grondzak applied the optimization idea to investigate how the convergence criteria affect the uncertainty interval of ECT-based defect property determination [157]. Generally, NDE methods are applied to find the location, shape, and structure of inclusions, defects, and sources (of heat, oscillations, stress, and pollution). Within such a wide range of engineering applications of NDE techniques, it is hard to have full knowledge of the conditions of existence or stability of the solution under small variations of the problem data. Therefore, most NDE problems are ill-posed, and thus data-driven 'Learning' is gaining popularity in UP analysis.

4.3 Methods of Uncertainty Propagation in Data-driven Learning

Data-driven 'Learning' is useful when there is no robust mathematical support from sufficient prior physics models. The inverse UP is realized based on long-run collected data without knowledge of the prior distribution $\pi(\theta)$. Therefore, frequentist statistics is applied to make inferences about the unknown but fixed model parameters $\theta$, relying on optimization theory and the available sample of data/observations [158]. For example, in NDE-based fatigue life assessment approaches, long-period monitoring data provide a good basis for frequentist statistical analysis for uncertainty quantification [159].

The concept of likelihood is a fundamental aspect of statistical inference, and maximum likelihood estimation (MLE) is a classical frequentist method applied in uncertainty propagation. MLE aims to find the parameter values $\theta$ within the selected model that maximize the likelihood of observing the given data $X$. It is basically an optimization algorithm (e.g., gradient descent) that adjusts the model parameters iteratively until the maximum likelihood is achieved [160]. The maximizer of the likelihood function can be described as follows:

U_{MLE} = CI[\theta, \hat{\theta}_{MLE}], \quad \hat{\theta}_{MLE} = \arg\max_{\theta} f(X|\theta)

Therefore, the variability and uncertainty of the parameter $\theta$ can be represented by the random variables of a statistical distribution from the given measured data $f(X; \theta)$, which are then evaluated using confidence intervals (CI). As addressed in [160], MLE-based inverse UP can infer the 'Learning' model's uncertainty (structural uncertainty) through the shape of the resulting likelihood function: when the likelihood function is peaked, the inferred model parameter $\theta$ has less uncertainty, while high uncertainty can be inferred when a flat likelihood function is obtained.
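The following minimal sketch illustrates MLE with an asymptotic confidence interval for a toy Gaussian measurement model; the readings are hypothetical and the Gaussian assumption is mine, introduced only to make the CI calculation concrete.

import numpy as np
from scipy import stats

readings = np.array([2.31, 2.45, 2.39, 2.52, 2.28, 2.41, 2.36])  # hypothetical repeated NDE readings X

# MLE of the mean and standard deviation under a Gaussian model f(X; theta), theta = (mu, sigma)
mu_hat, sigma_hat = stats.norm.fit(readings)

# Approximate 95% confidence interval for mu from the asymptotic normality of the MLE
n = readings.size
half_width = 1.96 * sigma_hat / np.sqrt(n)
print(f"mu_MLE = {mu_hat:.3f}, 95% CI = [{mu_hat - half_width:.3f}, {mu_hat + half_width:.3f}]")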
Other than MLE, other common estimators such as ordinary least squares (OLS) are able to statistically build confidence intervals in the estimation process [149]. The key in this uncertainty propagation is to calculate the variance of the estimator based on the probability distribution from independent observations. Generally, it is hard to obtain a completely independent sample set, which requires a large number of repeated measurements. Therefore, a more popular UP analysis can be realized through the bootstrap, which is an ensembling-based frequentist method [161]. The bootstrap empirically assesses the uncertainty associated with input (measurement) quantities in situations where modeling techniques and analytical solutions are not readily applicable [162, 163]. It emulates the frequentist concept of obtaining the probability distribution of observations from repeated similar experiments, using resampling to approximate the posterior empirically.

The selected bootstrap samples are considered 'almost-independent' and are able to approximate the variance of the estimator. This statistical technique consists of generating samples of size $B$ (called bootstrap samples) from an initial dataset of size $N$ by randomly drawing $B$ observations with replacement. The selection of $N$ needs to ensure that each subsample approximates drawing samples from the real distribution, while the dataset size $N$ should be large enough to minimize correlations between samples. As the bootstrap method follows standard probability estimation, it is mostly concerned with parametric uncertainty [160]. Kass et al. proposed and derived the initial usage and formula for bootstrap-based uncertainty propagation [164]. For NDE applications, the bootstrap has been integrated with other techniques to provide reliable confidence intervals for uncertainty evaluation. For example, Felix H. Kim et al. constructed the response probability of detection analysis in an X-ray Computed Tomography (XCT) system with additive manufacturing defects, where the bootstrap technique is applied to quantify the uncertainty of the POD curve [165]. Regarding remaining useful life estimation of fatigue cracks, a bootstrap method was developed for calculating the lower confidence limit for parameter estimation from expectation maximization (EM) and stochastic EM algorithms [159].
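A minimal sketch of bootstrap-based uncertainty estimation is shown below; the measurement values are hypothetical, and the percentile interval on the sample mean is used only to illustrate the resampling-with-replacement idea described above.

import numpy as np

rng = np.random.default_rng(42)
measurements = np.array([2.31, 2.45, 2.39, 2.52, 2.28, 2.41, 2.36, 2.48])  # hypothetical data

B = 2000                                   # number of bootstrap resamples
boot_means = np.empty(B)
for b in range(B):
    resample = rng.choice(measurements, size=measurements.size, replace=True)  # draw with replacement
    boot_means[b] = resample.mean()

# Percentile bootstrap 95% confidence interval for the mean estimator
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"bootstrap 95% CI for the mean: [{lo:.3f}, {hi:.3f}]")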
4.4 Methods of Uncertainty Propagation in Hybrid Learning

While data-driven approaches approximate the probability distribution of observations empirically, Bayesian approaches are able to incorporate the prior information from the available physical model, which is considered 'Hybrid Learning' in this framework. Prior physics information provides built-in regularization for the 'Learning' process with objective physical constraints. Even with limited measurements and ill-posed NDE problems, the stabilization of the inversion process can be improved. In the Bayesian approach, in addition to the likelihood function, information regarding the prior physics model parameters is updated to find the posterior distributions based on a comparison of the inverse model and the observation data [158].

In the context of deep learning, the Bayesian neural network (BNN) is a basic method for investigating uncertainty propagation from Learning to data. The uncertainty captured by a BNN is realized through the posterior weight distribution $P(\theta|x)$ by specifying a prior distribution over the weight parameters [166]. As mentioned before, the marginal data distribution $\pi(x)$ is hard to obtain analytically, so the posterior must be approximated using proportionality. The posterior distribution and model can be used to obtain the probability distribution of the prediction. Under Bayesian inference, the uncertainties are obtained from the variance of the predictive posterior probability distribution $p(\theta|x)$ when approximations are made to fit the true posterior distribution.

In Bayesian inference, a typical sampling-based approach to derive the posterior distribution of parameters is the Markov chain Monte Carlo (MCMC) method. MCMC reduces the calculation effort by sampling from complex probability distributions, particularly when analytical solutions are difficult to obtain [158]. Specifically, MCMC generates a Markov chain of samples, and as the chain progresses, it converges toward the target posterior distribution of interest. These samples can be used to estimate the statistical variation of the model parameters and predictions in terms of summary statistics such as means, variances, and credible intervals. Moreover, there exist many MCMC variants for uncertainty analysis, depending on the applicability, convergence time, or model formulation: for instance, Hamiltonian Monte Carlo, which incorporates Hamiltonian dynamics to accelerate the convergence of the sample distribution, and transdimensional MCMC, which uses reversible jumps for model selection. A more comprehensive treatment of Monte Carlo-based UP methods is addressed with examples in [111]. MCMC-based uncertainty propagation methods have been applied to NDE-based defect detection [167], characterization [168], and image reconstruction [169]. Although MCMC can provide high-fidelity results by drawing samples from the posterior, its computational cost is high, as thousands of samples are needed, which highly restricts its UP-related application, especially when dealing with highly multimodal scenarios.

Variational inference (VI) is a widely applied posterior approximation method under the Bayesian framework, which approximates the posterior over the model weights with a simpler variational distribution. In this process, a variational distribution $q_w(\theta)$ with variational parameter $w$ is used to approximate the true posterior $p(\theta|x)$ [170]. The distance between them is reduced by minimizing the Kullback-Leibler divergence with regard to $w$, which is equivalent to maximizing the log evidence lower bound [171], which can be expressed as:

L = \int q_w(\theta) \log p(x|\theta)\, d\theta - KL\left( q_w(\theta)\,\|\,\pi(\theta) \right) \quad (4.3)

As this VI-based approximation is constructed over the 'Learning' model's parameters, it is able to capture the structural uncertainty during the process. There are various ways to realize the VI approximation in the Bayesian framework, such as Gaussian distributions with diagonal covariance matrices [172], stochastic gradient VI [173], empirical Bayes (EB) [174], etc.

A popular and basic example of VI in deep learning applications is Monte Carlo Dropout-based variational inference. It establishes a relation between (variational) Bayesian inference and using Dropout as a learning technique for neural networks [175]. MC Dropout can be applied before the weighted layers in the neural network and is used as an approximating variational inference scheme in the deep Gaussian process, whose covariance function parameters are marginalized [176]. Dropout layers are usually used as a regularization technique during training, where neurons are randomly dropped at a certain probability during a forward pass through the network to create variation in the model's outputs. Specifically, in the context of a neural network, an $L2$ regularization term with weight decay $\lambda$ is employed alongside the dropout procedure. The minimization of this objective function is proven to be a good approximation to variational inference [177]. The output predictive distribution can be further approximated by sampling model weights from the estimated posterior distributions, and the variance of this distribution can be quantified as a measure of total uncertainty [160].
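A minimal sketch of MC Dropout at prediction time is given below, assuming a hypothetical PyTorch classifier: keeping the dropout layers active during inference and repeating the forward pass T times produces a predictive distribution whose spread approximates the variational Bayesian uncertainty discussed above.

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(16, 32), nn.ReLU(),
    nn.Dropout(p=0.5),            # resampled at every forward pass
    nn.Linear(32, 3),             # three hypothetical defect classes
)

def mc_dropout_predict(model, x, T=50):
    model.train()                 # keep dropout active (MC sampling), no weight updates here
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(T)])
    return probs.mean(dim=0), probs.var(dim=0)   # predictive mean and variance over T passes

x = torch.randn(1, 16)            # a hypothetical input feature vector
mean_p, var_p = mc_dropout_predict(model, x)
print(mean_p, var_p)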
The total uncertainty can be further decomposed to obtain a meaningful interpretation of the uncertainty in terms of the randomness of the model (epistemic) and the variability of the given data (aleatoric) [178], which correspond to the structural and parametric uncertainty in this work. The total prediction uncertainty can be expressed as follows:

Var(q_w(\theta)) = Aleatoric + Epistemic = Parametric + Structural \quad (4.4)

With increasing prediction repetitions, the uncertainties from both aspects converge in probability to the system variance. Many studies have used the dropout-based method for UP analysis in the NDE area. For vision-based applications, including crack detection, local damage identification, and bridge component detection, the uncertainty for each application is quantified with MC dropout in terms of variations in softmax probability and entropy [179]. Li et al. applied a Dropout-assisted Convolutional Neural Network model to a magnetic flux leakage-based crack classification case, which is able to quantify the addressed aleatoric and epistemic uncertainties during the inspection [180].

BNN-based methods are effective but computationally expensive when dealing with large-scale networks. In recent years, deep ensembles (DE) have been widely applied as a good alternative to traditional Bayesian NNs, with ready parallelization and less hyperparameter tuning in the deep learning area [181]. DE is a more straightforward procedure that combines multiple base models, randomly initializing their weights and training them with the same dataset, to deliver dependable predictive uncertainty estimates. Model ensembling can also be considered a sampling process in Bayesian inference, and the variation of this process is considered an approximation of the uncertainty from the prior model parameters. The idea of model aggregation is to reduce the bias and/or variance of weak learners to create a strong ensemble model that achieves better performance in uncertainty analysis. Generally, there are two typical ensemble learning techniques, Bagging and Boosting, which encourage diversity through different importance sampling schemes [182]. Bagging (Bootstrap Aggregating) involves training multiple instances of the same model in parallel on different subsets of the training data, typically using resampling with replacement (Bootstrap). The final prediction is often a combination of the predictions from each model, such as a majority vote (for classification) or an average (for regression). The random forest approach is a typical bagging method that combines bagging with an additional layer of randomness by considering only a random subset of features at each split in each tree. Sampling over features ensures each tree does not rely on identical information to make decisions, thus reducing the correlation between their respective outputs. Thus, the concepts of bagging and random feature subspace selection in random forests are beneficial for reducing overfitting and enhancing generalization [182].
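The sketch below illustrates bagging-style ensemble uncertainty with a random forest on synthetic data; the features, labels, and tree count are placeholders, and the per-tree disagreement is used only as a simple stand-in for the ensemble-variance idea described above.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Stand-in data: hypothetical feature vectors with three defect classes.
X, y = make_classification(n_samples=300, n_features=20, n_classes=3,
                           n_informative=6, random_state=0)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

x_new = X[:1]
# Per-tree class probabilities: each bootstrapped tree is one ensemble member.
per_tree = np.stack([tree.predict_proba(x_new)[0] for tree in forest.estimators_])
mean_prob = per_tree.mean(axis=0)          # ensemble (bagged) prediction
ens_var = per_tree.var(axis=0)             # disagreement among ensemble members
print(mean_prob, ens_var)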
Different from bagging, boosting is an iterative ensemble scheme that focuses on training multiple models sequentially. Each consecutive model gives more weight to the data that the previous models predicted erroneously; the ultimate ensemble therefore comprises all the weak learners, each assigned an appropriate weight based on its performance. Examples of boosting include AdaBoost and gradient boosting. AdaBoost, also known as Adaptive Boosting, uses weak learners as base models, such as decision stumps (shallow decision trees with a single split). Instead of trying to find all the coefficients and weak learners that give the best overall additive model, it focuses on reweighting data points to correct misclassifications from each new weak learner. During the process, more weight is added to misclassified points to focus on the samples that are difficult to classify correctly. Different from AdaBoost, gradient boosting employs decision trees as base learners while focusing on minimizing the residuals or errors made by the previous models. It adjusts the target values at each iteration to correct the errors made by the previous model. Gradient boosting typically includes a learning rate, often referred to as the shrinkage parameter, which scales the contribution of each base learner [183, 184]. In summary, adaptive boosting tries to address a specific 'local' optimization problem during each iteration, whereas gradient boosting employs a gradient descent approach and is more adaptable to a wide range of loss functions. Therefore, gradient boosting can be seen as an extension of AdaBoost and has great potential for handling various differentiable loss functions. There are more techniques applied within ensemble learning for improved accuracy and robustness, such as stacking [185], stochastic gradient boosting [186], and XGBoost [187]. The choice of ensemble method depends on the specific problem and dataset.

Specifically, deep ensembles have been used for estimating uncertainty in deep learning-based NDE applications, such as ultrasonic tomography [188], magnetic flux leakage testing [94], and guided wave array imaging [189]. Li et al. investigated the impact of liftoff uncertainty in measurement data in a magnetic flux leakage-based defect depth classification. Prediction accuracy and uncertainties were estimated and compared between the Bayesian Neural Network and Deep Ensemble methods, which demonstrated the efficacy and feasibility of DE in uncertainty propagation [94]. In an ultrasonic tomography-based speed of sound reconstruction application, the ensemble approach is able to provide robust uncertainty estimates in terms of estimation variance and model variance [188].

Generally, when there exists a set of candidate models to describe the inverse process, it is hard to obtain the optimal posterior model probability among them. The Bayesian solution simplifies the model selection by taking the weighted average of all candidate models, weighing each by its marginal posterior probability [190, 191]. This strategy is called Bayesian model averaging (BMA), which can be considered a transition method between the Bayesian NN and Deep Ensemble methods. Generally, to construct BMA, the prior distributions over all parameters in all models and the prior probabilities of all models must be specified. As the priors on the model parameters need to be specified for determining the posterior model weights, any insufficient or uncertain knowledge of that prior model information will lead to uncertainty in the final predictive probability distribution. Therefore, BMA is a useful method for investigating the structural uncertainty from the 'Learning' model. Theo S. et al. investigated how different model prior distributions affect economic growth determination results [192]. In an NDE application of strength estimation for aging materials, BMA is applied to understand the model uncertainty and can also provide a reliable approximation of the actual underlying model for predicting the bulk mechanical properties [193].
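A minimal sketch of the BMA combination step is shown below; the candidate-model predictions and the posterior model probabilities are illustrative assumptions, since in practice the weights would be computed from the marginal likelihoods of the candidate models given the data.

import numpy as np

# Predictive class probabilities from three hypothetical candidate models for one input.
pred_m1 = np.array([0.70, 0.20, 0.10])
pred_m2 = np.array([0.55, 0.35, 0.10])
pred_m3 = np.array([0.60, 0.25, 0.15])

# Assumed posterior model probabilities p(M_k | data); they must sum to one.
post_model = np.array([0.5, 0.3, 0.2])

# BMA prediction: posterior-probability-weighted average of the candidate predictions.
bma_prediction = post_model @ np.stack([pred_m1, pred_m2, pred_m3])
print(bma_prediction)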
There are other popular uncertainty propagation methods for inverse learning, including Bayesian Active Learning (BAL), deep Gaussian processes, Bayes by Backprop (BBB), etc. Besides, researchers often combine more than one method to enhance the estimation performance by combining the advantages of multiple methods, such as cyclical stochastic gradient MCMC (CsgMCMC) [194] or BNN with MC dropout [195], which are applied to handle different scenarios when investigating uncertainty propagation from Learning to Data. Only a selection of well-known NDE-based inverse learning uncertainty propagation techniques is covered here, and there is no one-size-fits-all approach for addressing specific problems. Conclusively, the selection of inverse UP approaches can take the following criteria into account:

• Measurement data: data amount; prior distribution;

• Learning model: direct or indirect; model parameter distribution; bias and variance; computational efficiency;

• Estimated uncertainty type: structural, parametric, or total uncertainty.

4.5 TLS-Based Inverse Uncertainty Learning Application

4.5.1 Introduction

Pipeline infrastructure is a crucial part of society, with over 2.5 million miles of natural gas, petroleum, and hazardous liquid pipelines operating across all states, as reported by the Pipeline and Hazardous Materials Safety Administration (PHMSA) of the United States Department of Transportation (DOT). Information about incidents on gas and liquid pipelines is accessible to the public [196]. Pipelines are susceptible to damage from internal and external corrosion, cracks, manufacturing defects, etc., which can cause severe leakage and induce safety concerns; therefore, a reliable assessment scheme for monitoring and maintaining this critical infrastructure is imperative. Cracks, which are caused by different mechanisms, can appear in a pipeline at any stage during manufacturing, installation, or throughout its service life [197]. Crack sizing and profiling are of critical importance in monitoring crack growth to define inspection intervals for safety-critical components; therefore, nondestructive evaluation (NDE) methods have been applied to detecting and characterizing cracks' size, shape, and orientation [198].

Ultrasonic testing (UT) methods are considered suitable for structural integrity damage monitoring systems and for characterizing surface cracks by applying Rayleigh waves and acoustic emission [199, 200]. Also, a variety of electromagnetic (EM) methods, such as eddy current testing (ECT), microwave, MFL, etc., are advanced in detecting and identifying surface and sub-surface cracks in metal based on electromagnetic principles [201-205]. In MFL-based pipe inspection, much effort has been put into the metal loss defect characterization inversion problem, which is associated with length, width, and wall loss (%WL) obtained from the measured three-axis MFL signals in terms of magnetic flux density [206-208]. For the widely used magnetic flux leakage (MFL) inspection method, the collected signal quality is greatly affected by various uncertainty factors, such as material properties, inspection variations, shape irregularity, noise, etc. Therefore, quantified uncertainty estimation in NDE is indispensable. Deep learning proves to be an effective approach for handling extensive NDE data at scale without requiring prior physics expertise [209-211]. Nevertheless, conventional deep learning methods lack the capability to assess the reliability of predictions. In contrast, Bayesian-based techniques excel at estimating uncertainty by incorporating Bayesian posterior inference over the neural network parameters [212].
This study delves into the impact of uncertainties arising from the dynamic magnetization process, specifically due to the relative motion between a magnetic flux leakage (MFL) sensor and the material being tested in both the axial and circumferential directions. In the context of MFL inline inspection, the surface roughness of the material plays a pivotal role in influencing sensor liftoff and stands out as a significant source of uncertainty affecting inspection outcomes. Consequently, this research examines the uncertainties stemming from sensor liftoff, their propagation throughout the sensing system, and their influence on the output data. Given the intricacies involved in describing the forward uncertainty propagation process, the study employs Deep Ensemble, a learning-based non-Bayesian uncertainty estimation method. This approach addresses the input uncertainty originating from the MFL response data. To assess performance, a three-dimensional finite element method (FEM) based model generates simulation data for MFL-based defect depth classification. Experimental data are then used to validate MFL-based defect size classification. The study evaluates prediction accuracy and uncertainty calibration, proving their value in enhancing prediction performance assessment and quantifying uncertainties. Additionally, an Autoencoder method is applied to compensate for the shortage of experimental data available for training the uncertainty estimation model. This approach extends to addressing the challenge of insufficient experimental data in generalized non-destructive evaluation (NDE) problems.

4.5.2 Uncertainty in MFL-based Defect Classification

Without the formulation of models based on complex physics knowledge, deep learning methods rely on comprehensive available NDE data and have shown great promise in aiding nondestructive evaluation methods. Since NDE data are often complex, massive, discordant, and noisy, it is necessary to jointly develop UQ with an appropriate deep learning model to efficiently deal with the existing uncertainty and improve the safety of the inspection system. The predictive probability obtained from deep learning assists in deciding the probability interpretation and quantifying the uncertainties of the predictions to accomplish statistical inference [213]. Because of the uncertainty throughout the whole system, the network predictions can be misleading; therefore, a good predictive uncertainty score can quantify how reliable the model's prediction is, which is considered a good basis for assessing the performance.

To develop a reliable NDE-based inspection system, a thorough understanding of the system and its influential parameters is essential. The NDE system is characterized through the forward and inverse modeling process, as detailed in [36]. In the forward stage, variations in geometric parameters (e.g., defect size and shape) and material properties (e.g., hardness and strength) are considered as aleatoric uncertainty in applications related to material characterization and damage detection [88, 89, 91, 92]. Processing parameters related to simulation (e.g., mesh parameters, boundary conditions) and experimental testing (e.g., setup process, experimental noise) are regarded as epistemic uncertainty [93-95]. This variability constitutes input uncertainty, which is integrated into the subsequent inversion stage.
In this stage, modeling and analysis are employed to derive predicted parameters describing the system based on observed measurements or simulated output from the forward procedure [96]. During the inversion process, epistemic uncertainty is introduced, associated with the learning model parameters and the model itself.

During MFL in-line inspections (ILI), surface irregularities like changes in coating thickness, welds, or hardness deposits introduce variations in the liftoff distance, complicating the inspection process. These fluctuations impact the amplitude of MFL signals, influencing detection sensitivity [214]. Therefore, the liftoff distance is a critical uncertainty factor to explore in MFL inspection. Other considered uncertainty factors include sensor velocity [215], defect size and shape [180], and microstructural changes and mechanical properties [216], among others. Additionally, during the inversion process, NDE field inspection results are often sensitive to environmental conditions and signal processing methods [217]. In this study, the uncertainties arising from liftoff are examined in the inverse process for defect classification, where the uncertainty from the MFL data and the machine learning model is quantified using an approximate Bayesian inference modeling process.

4.5.3 Autoencoder for Data Augmentation

The application of machine learning algorithms to analyzing NDE experimental data is often hindered by the challenging and costly nature of data collection procedures. To tackle this issue, a potential solution is to employ data augmentation methods, which can extend the existing dataset by generating more diverse and comprehensive training data. An Autoencoder, a typical unsupervised multilayer neural network, is utilized to compress and decompress the input data for data augmentation purposes. The core concept of the Autoencoder here is not to perfectly recreate the input data; instead, a controlled amount of error or noise is introduced intentionally. The goal of the Autoencoder is to train the network to minimize the discrepancies between the input data and the reconstructed data with a proper loss function, while retaining a certain similarity between the original input and the recreated output to enrich the original dataset. The capability of Autoencoder neural networks has been demonstrated in areas such as image reconstruction [218], feature extraction [219], augmenting data for anomaly detection [220], and noise reduction in medical images [221]. It consists of an encoder and a decoder pair, where the encoder generates a compact representation for a whole set of data, which is then passed to the decoder to reconstruct the original data from this simplified representation with high fidelity [222].

To boost the learning efficiency of the Autoencoder on MFL experimental data, an initial pre-training phase is conducted on simulation data. This utilizes transfer learning, a technique where knowledge acquired from one task is applied to a related task. In the context of this study, a pre-trained model is used as a starting point and fine-tuned for MFL classification on experimental data. Transfer learning proves advantageous when there is limited data for the new task, as it leverages knowledge from the original task [223, 224]. Widely applied in various ML studies, transfer learning has demonstrated effectiveness in tasks such as translation, image recognition, and image classification [225, 226].
The similarity in format and sensing methods between the experimental and simulated MFL data, along with the larger size of the simulated data, enhances the effectiveness of transfer learning in this study. Through transfer learning, we enhanced the performance of pre-trained Autoencoder models using our experimental dataset, leading to improved results and a reduced need for a large number of experiments. The applied Autoencoder model architecture and transfer learning process are illustrated in Figure 4.2.

Figure 4.2 Schematic representation of the Autoencoder architecture and transfer learning process.

Specifically, the encoder stage of the proposed model employs two pairs of convolutional layers with max pooling operations, activated by the ReLU non-linear activation function, to capture essential representations. To prevent overfitting, a dropout layer is added as regularization. The number of kernels ensures a consistent number of activations across layers. These layers serve as feature extractors, creating a compressed feature representation space ($Z$). The encoder parameters are initially pre-trained on a large simulation MFL dataset, establishing a foundation for learning general MFL signal features. The pre-trained encoder facilitates the model's adaptation to specific experimental data features due to the intrinsic connection between simulation and experimental data. During the processing of MFL experimental data, the pre-trained encoder layers' weights remain frozen, preventing the loss of valuable information. The subsequent trainable decoding layers mirror the encoder, learning to reconstruct the original images. The decoder includes upsampling layers to restore $Z$ to the original image size. This Autoencoder model transforms the original MFL signal features, providing predictions trained with the experimental dataset. The transfer learning process fine-tunes the model, allowing it to adapt to the unique characteristics of the experimental data. Compared to training from scratch, this approach typically results in improved performance.

Once the fine-tuned Autoencoder model is optimized, it serves the purpose of augmenting the experimental MFL data. This involves feeding MFL images into the Autoencoder for reconstruction and then adding the resulting reconstructed images to the original experimental dataset. The original dataset is labeled "OR," while the combined dataset of the original and newly generated MFL data is labeled "GE." The primary goal is to enhance the training set for the learning-based networks, leading to improved classification and prediction performance.
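The following minimal PyTorch sketch mirrors the transfer-learning idea described above with an assumed (not the author's exact) convolutional autoencoder: the encoder is treated as pre-trained and frozen, so fine-tuning on scarce experimental data only updates the decoder. Layer sizes and the 64x64 image size are placeholders.

import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: two conv + max-pool pairs with ReLU and dropout, as in the text.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Dropout(0.2),
        )
        # Decoder mirrors the encoder and upsamples back to the input size.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = ConvAutoencoder()
# Assume the encoder was pre-trained on simulation MFL data; freeze its weights
# so only the decoder is updated during fine-tuning on experimental data.
for p in model.encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(4, 3, 64, 64)          # a hypothetical batch of MFL images
recon = model(x)
loss = loss_fn(recon, x)              # reconstruction (MSE) loss
loss.backward()
optimizer.step()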
4.5.4 Applied "Learning" Models for Uncertainty Estimation

As mentioned in Section 4.1, Bayesian theory is considered the primary approach to addressing uncertainties through the "Learning" model, which aims to comprehend and describe the uncertainty in the inverse solution based on observation data and other sources of information (e.g., prior distributions). As a result, probabilistic predictions can be made under the addressed uncertainties to assist in optimizing the experimental design. Bayes' theorem is applied to the inference of a parameter given observed training input, which is generated from a probability distribution depending on an unknown parameter $\omega$. In this application, the obtained probability distribution $P(\omega | X_{mfl})$ is used to describe the relationship between the input MFL image data $X_{mfl}$ and their associated defect classes $D$, $D \in \{1, 2, 3\}$. The uncertainty in this process can be obtained from the variance of the predictive posterior probability distribution $P(\omega | X_{mfl})$, which can be expressed as:

p(\omega | X_{mfl}, D) = \frac{p(D | X_{mfl}, \omega)\, p(\omega)}{p(D | X_{mfl})} \quad (4.5)

in which $p(D | X_{mfl}, \omega)$ is the likelihood of the model, denoting the probability distribution of the observed data $X_{mfl}$ given the parameter $\omega$, and $p(\omega)$ serves as the prior information on $\omega$ describing the learning model, which is independent of any observation. During the modeling of $p(\omega | X_{mfl}, D)$, both the prior and the likelihood are considered known parts of the assumed model, while the predictive distribution of the output class $d^*$ for a new MFL input image $x^*$ can be expanded as:

p(d^* | x^*, X_{mfl}, D) = \int p(d^* | x^*, \omega)\, p(\omega | X_{mfl}, D)\, d\omega \quad (4.6)

As the learning process of this posterior distribution is intractable in higher dimensions, approximation techniques are used to fit the true posterior distribution, providing an analytical way to evaluate the process with a tractable approximating variational distribution $q_\theta(\omega)$ parametrized by variational parameters $\theta$. This process is usually described as Bayesian inference [227].

As discussed before, the sensor liftoff is the main aleatoric uncertainty source in this experimental MFL inspection. In order to investigate how these factors affect the inspection performance, two typical deep learning methods, a Convolutional Neural Network (CNN) with Dropout and a Deep Ensemble (DE), are proposed and compared to estimate the predictive uncertainty with a realization of approximate Bayesian inference.

4.5.4.1 Convolutional Neural Network with Dropout

In the realm of ML-based networks, the Dropout technique, which randomly discards some of the model units during training, is not only effective in avoiding overfitting but can also serve as an approximation of the Bayesian process [177]. In this CNN-based application, conventional Bernoulli Dropout is applied to sample each unit output with a certain probability. Besides, considering that this is a classification problem, the predictive probability $P(d | x, \omega)$ is a categorical distribution corresponding to the softmax likelihood [228], which can be written as:

P_\omega(d_i | x_i) = \hat{d}_i = Softmax(f^{\omega}(x_i)) \quad (4.7)

where $f^{\omega}(x)$ represents the neural network. As proven in [178, 180], the sampling process from $q_\theta(\omega)$ is equivalent to the dropout operation, so the Bayesian inference objective $L$ addressed in the previous section can be approximated as $L_{dropout}$. Therefore, a CNN model with dropout is applied as the realization of the approximate Bayesian learning process for addressing the uncertainty in this defect classification application.

The detailed learning process of the proposed CNN model is shown in Figure 4.3, where convolutional layers with max pooling are used to extract features from the MFL input image. The following fully connected layers with a Dropout layer are employed to combine the extracted high-level features for classification purposes. The outputs from these layers are then passed through the softmax activation function, which assigns probabilities to each class label. For obtaining the posterior probability distribution for further uncertainty quantification, $T$ repeated predictions are made for each MFL sample.
Figure 4.3 Schematic representation of the CNN process.

In the model learning process, all parameters are optimized by minimizing the misclassification errors, resulting in a reliable output probability distribution for subsequent uncertainty estimation. This makes the approximation process a promising and effective universal approach for addressing uncertainties in classification-based NDE inverse problems within machine learning frameworks, and it can lead to improved decision-making and risk mitigation across various NDE applications.

For obtaining the aleatoric and epistemic uncertainty in this work, as presented in [178, 180], the uncertainty is equivalent to the variance of the prediction probability of the network. Decomposing the prediction variance leads to a meaningful interpretation of the uncertainty, with the aleatoric uncertainty representing the randomness of the predicted defect class $D$ and the epistemic uncertainty representing the variability coming from the proposed CNN model. Given a new MFL input image $x^*$, $T$ predictions are made, generating the corresponding new predictive probabilities $d^*_t, (t = 1, ..., T)$. Equation 4.8 introduces the relation between the variance of the prediction probability and the total prediction uncertainty, comprising aleatoric and epistemic parts:

Var_{q_{\hat{\theta}}}(d^*) = Aleatoric + Epistemic
= \int \left[ diag(E_p[d^*]) - E_p[d^*] E_p[d^*]^{T} \right] q_{\hat{\theta}}(\omega | X, D)\, d\omega
+ \int \left( E_p[d^*] - E_q[d^*] \right) \left( E_p[d^*] - E_q[d^*] \right)^{T} q_{\hat{\theta}}(\omega | X, D)\, d\omega \quad (4.8)

where $diag(.)$ denotes the diagonal matrix. Since the softmax output is one-hot coded, in the variance the square of $E_p[d^*]$ can be simplified as $diag(E_p[d^*])$. As addressed in [178], the first term is an expectation over $q_{\hat{\theta}}$, which captures the inherent randomness of the output defect classes, while the second term is only related to the network weight parameters $\omega$; therefore, Equation 4.8 can be split into the aleatoric part $A_t$ and the epistemic part $E_t$ of $d^*$:

A_t = \frac{1}{T} \sum_{t=1}^{T} \left( diag\left[ P_\omega(d^*_t | x^*) \right] - \left[ P_\omega(d^*_t | x^*) \right]^{\otimes 2} \right) \quad (4.9)

E_t = \frac{1}{T} \sum_{t=1}^{T} \left[ P_\omega(d^*_t | x^*) - \bar{P}_\omega \right]^{\otimes 2} \quad (4.10)

where $\bar{P}_\omega = \frac{1}{T} \sum_{t=1}^{T} P_\omega(d^*_t | x^*)$. As the number of prediction repetitions ($T$) increases, the sum of $A_t$ and $E_t$ converges in probability to the system variance. The aleatoric uncertainty is associated with the liftoff variance, while the epistemic uncertainty is linked to the model parameters of the proposed model. To evaluate the aleatoric and epistemic uncertainty in this application, a dropout layer with the softmax activation function is employed to generate the predictions. During the prediction stage, each set of testing MFL data is predicted 10 times ($T = 10$) to obtain the variability distribution of the output. Consequently, for each testing MFL sample, there are ten aleatoric uncertainty results and ten epistemic uncertainty results, providing a manageable distribution to describe both types of uncertainties.
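The aleatoric/epistemic split of Eqs. 4.9-4.10 can be sketched numerically as follows, assuming `probs` holds T repeated softmax predictions (e.g., from the dropout-enabled CNN) for one test sample over three defect classes; the values are illustrative only.

import numpy as np

probs = np.array([                       # shape (T, n_classes), here T = 5
    [0.80, 0.15, 0.05],
    [0.75, 0.20, 0.05],
    [0.85, 0.10, 0.05],
    [0.70, 0.22, 0.08],
    [0.78, 0.16, 0.06],
])
p_bar = probs.mean(axis=0)               # \bar{P}_omega

# Aleatoric part: mean over t of diag(p_t) - p_t p_t^T   (Eq. 4.9)
aleatoric = np.mean([np.diag(p) - np.outer(p, p) for p in probs], axis=0)

# Epistemic part: mean over t of (p_t - p_bar)(p_t - p_bar)^T   (Eq. 4.10)
epistemic = np.mean([np.outer(p - p_bar, p - p_bar) for p in probs], axis=0)

total_variance = aleatoric + epistemic   # approaches the predictive variance as T grows
print(np.diag(aleatoric), np.diag(epistemic))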
4.5.4.2 Deep Ensemble

In this MFL application, the random forest technique is utilized to classify crack depth and offer additional predictive uncertainty estimation. This helps examine the impacts of velocity variance and liftoff variance. Since the aleatoric uncertainty brought by the MFL signal is the focus of this work, the model uncertainty that comes from the hyperparameters is reduced by fixing the number of subtrees to ten. Besides, in order to achieve random initialization, k-fold cross-validation with a certain number of repeats is applied to split the training and testing data into k folds with a uniform probability distribution and randomized subsampling, giving an unbiased performance estimation. Unlike random train-test splits, where a given example may be used to evaluate a model many times, this method is less biased because each example in the dataset is used only once in the test set to estimate model performance. Besides, to address the limitation of k-fold cross-validation, where the resulting models tend to be highly similar in subsequent ensemble learning, the random forest employs a bootstrapping technique to create a different sub-dataset for each tree; the bootstrap involves selecting examples randomly with replacement. Replacement refers to the practice of metaphorically returning the same example to the pool of candidate rows, meaning that a specific example can be selected again, possibly multiple times, within a single sample from the training dataset. Specifically, a set of decision trees is trained from a randomly selected subset of the new training data, which helps to reduce the correlation among the prediction results of the subtrees. Each decision tree is then grown using only a random subset of features at each split. This diversity enhances the model's performance. Finally, the random forest averages the output of each decision tree to determine the final results. The calibrated predictive probabilities of the ensemble members are combined as a uniformly weighted mixture model. After repeating a certain number of evaluations on the test data, the predictive probabilities are combined and averaged to make the final uncertainty prediction based on the scoring rules. Specifically, $k = 10$ in the DE model: the first nine folds are used to train a model, the remaining holdout fold is used as the test set, and each of the folds is given an opportunity to be used as the holdout test set. In total, ten models are fit and evaluated with three repetitions ($r = 3$), and the final performance of the model is calculated as the mean of these runs.
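A minimal sketch of the repeated k-fold evaluation described above is given below, using hypothetical stand-in feature vectors rather than the MFL dataset: k = 10 folds with r = 3 repeats and a 10-subtree random forest, scored by accuracy.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Stand-in for the MFL data: 540 samples, three defect classes.
X, y = make_classification(n_samples=540, n_features=30, n_classes=3,
                           n_informative=8, random_state=1)

model = RandomForestClassifier(n_estimators=10, random_state=1)      # 10 subtrees, fixed
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)  # k = 10, r = 3

scores = cross_val_score(model, X, y, scoring="accuracy", cv=cv)
print(f"mean accuracy over {scores.size} fits: {scores.mean():.3f} +/- {scores.std():.3f}")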
4.5.4.3 Predictive Uncertainty Estimation Scoring Index

Scoring rules are essential to evaluate the quality of the predictive uncertainty, which is realized through a proper loss function in the ML models. The training criterion of both CNN and DE is to minimize the cross-entropy loss to optimize the predictive model and find the optimal model parameters $\omega$; this criterion shares the same formula as the Log Loss, which can be presented as:

LL(\omega) = -S(\hat{d}, d) = -E(d) \log(\hat{d}) \quad (4.11)

where $\hat{d}$ and $d$ are the predictive probability and the true distribution, respectively, of the input $x$. Each predicted probability is compared to the actual class output value (0 or 1), and a score is calculated that penalizes the probability. Besides, the Brier score is a popular scoring rule that calculates the mean square error between the predictive probability and the true classes, which can be expressed as follows:

Brier = E\left[ (\hat{d} - d)^{2} \right] \quad (4.12)

Both the log loss and the Brier score act as evaluations of the predictive uncertainty; specifically, the higher the log loss and Brier score, the higher the uncertainty buried in the system.

For ML modeling, the probability scores are overconfident or under-confident in some cases, which brings bias to predictions that should be near zero or one and further affects the subsequent averaged prediction result. Therefore, calibration of the predictions is an essential step to improve the reliability of the probability predictions and make accurate probability estimates. Calibration is a scaling operation that adjusts the obtained probability distribution to match the expected distribution observed in the data [229]. Especially in the random forest method, because of the feature subsetting, the base-level trees are trained with a relatively high variance, which introduces errors in predictions that should be near zero or one and further affects the subsequent averaged prediction result. Therefore, the calibration of the log loss and Brier score is indispensable. Platt scaling (Platt calibration) is a typical calibration method, which transforms predictions to posterior probabilities by passing them through a sigmoid. Each calibrated predictive distribution can be presented as:

\hat{d}_{c_i} = \frac{1}{1 + \exp(A\hat{d}_i + B)} \quad (4.13)

where $\hat{d}_i$ is the uncalibrated predictive output for the true label of sample $i$, and $A$ and $B$ are real numbers to be determined when fitting the regressor via maximum likelihood. The calibrated prediction is further applied to obtain the calibrated scoring indices for estimating the total predictive uncertainty in terms of prediction accuracy, Log Loss score, and Brier score, for evaluating and comparing the performance of the proposed CNN and DE models.

4.5.5 Performance Evaluation with Uncertainty Analysis

4.5.5.1 3D FEM Simulation Modeling of MFL

Maxwell's equations are applicable to the analysis of the electric as well as the magnetic field within MFL systems. In this work, permanent magnets are used for the generation of the magnetic field. For the simulation study, the magnetostatic phenomena are governed by the simplified Maxwell's equations illustrated below:

\nabla \times \left( \frac{1}{\mu} \nabla \times A \right) = J \quad (4.14)

B = \nabla \times A \quad (4.15)

where $\mu$, $A$, $J$, and $B$ represent the magnetic permeability constant, the magnetic vector potential, the equivalent current density of the permanent magnet, and the magnetic flux density vector, respectively. The field equations are supplemented by the constitutive relation that describes the behavior of electromagnetic materials. In the permanent magnet region,

B = \mu H + \mu_0 M_0 \quad (4.16)

where $M_0$ denotes the permanent intrinsic magnetization vector. The other regions are governed by

B = \mu H \quad (4.17)

On the other hand, the magnetic fields under the effect of mechanical motion are governed by Lenz's law and Lorentz's law. Lorentz's law can be used for the analysis of the moving probe effect in a dynamic MFL inspection system. If the probe moves at a certain speed, the Lorentz force induces currents in the conductive specimen. Such currents in the specimen can be regarded as eddy currents dependent on the velocity at which the probe travels, and the current density is expressed as

J_V = \sigma V \times B \quad (4.18)

where $J_V$ denotes the eddy current density in the specimen, $V$ denotes the speed of the applied magnetic field with magnetic flux density $B$, and $\sigma$ represents the conductivity of the sample. With respect to the dynamic EM system, the governing equation deduced from Maxwell's equations is augmented with the eddy currents due to the movement of the applied magnetic field. The modified equation for the time-harmonic electromagnetic field is expressed as

\nabla \times \left( \frac{1}{\mu} \nabla \times A \right) = J_V - \sigma \frac{\partial A}{\partial t} + J_s \quad (4.19)

The first, second, and third terms on the right represent the velocity-induced eddy current, the frequency-induced eddy current, and the current density due to the applied source, respectively.
Compared with the governing equation that excludes $J_V$, the modified equation implies that the eddy currents generated by the moving magnetic field influence not only the current distribution in the conductive specimen but also the magnetic field profile, which results in distortion of the measured signal. Using the boundary conditions, the magnetic vector potential can be solved, and the distribution of the magnetic field can then be obtained.

3-D FEM is applied to analyze flat samples in COMSOL. Figure 4.4 shows the geometry of the problem in a 3D model, in which the influence of the defect depth and lift-off on the magnetic flux leakage density from the defect is studied.

Figure 4.4 Geometry of the MFL simulation model.

The magnetic circuit is constituted by a yoke, magnets, brushes, and the specimen, with a rectangular defect located at the center of the specimen. In the model, two permanent magnets made of NdFeB material are used as the magnetic flux induction; the yoke and brushes use the same mild steel material, with a relative permeability of 186,000, while the sample is made of Stainless Steel 416. In the finite element calculation, the magnetization clearance (the clearance between the brush and the specimen) is set equal to the sensor liftoff. The most fundamental 3D element is the tetrahedron; to obtain a precise result, the elements near the flaw are refined. The influence of eddy currents at speeds of 3, 5, and 7 m/s is taken as an uncertainty. In each velocity case, the effect of liftoff variation is also considered. Accordingly, measurements of the magnetic field for defect detection are arranged at liftoffs ranging from 1 mm to 9 mm.

4.5.5.2 MFL Experimental Setup

The MFL experiment was conducted on a stainless steel sample containing three defect sizes, presented in Table 4.1. The sensor liftoff during data collection was set to 1 mm, 2 mm, and 3 mm, respectively. Each defect was tested 60 times under each liftoff scenario, resulting in 180 sets of MFL images for each type of defect, with each image having dimensions of 217x217 pixels in RGB format. Consequently, a total of 540 sets of experimental MFL data were gathered and labeled as "OR" for subsequent analysis.

Table 4.1 Experimental MFL defect dimensions

          Diameter x Depth (inch)
Class 1   0.367" x 0.15"
Class 2   0.505" x 0.12"
Class 3   0.633" x 0.12"

4.5.5.3 Performance Evaluation for Autoencoder-based Transfer Learning

To initiate the process of fine-tuning the network weights through pre-training, a set of 1500 MFL simulation images from the simulation model is employed for depth classification. Three defect depths are considered, equally divided over the range from 2 mm to 10 mm. In training the pre-trained model, 70% of the total simulation MFL data is used for training, while the remaining 30% is set aside for validation. After updating the network layers and obtaining the optimally compressed representations, the Autoencoder model is further fine-tuned with the experimental data, of which 70% are allocated for training. Before evaluating the performance of the applied Autoencoder in data augmentation, the effectiveness of the proposed Autoencoder-based transfer learning approach in this application is addressed. The Mean Squared Error (MSE) loss is a standard metric for assessing the accuracy of trained neural networks (NN).
It calculates the average of the squared differences between predicted and actual target values. In this section, we compute the MSE loss of the proposed Autoencoder network on MFL data and compare two cases. In the first scenario, the Autoencoder is directly trained and tested on the experimental dataset. In the second case, the Autoencoder model is pre-trained on a larger simulated dataset and then fine-tuned with the experimental data. The results, shown in Figure 4.5, reveal a similar progression for both models, starting with relatively high loss values and gradually improving, with significant reductions especially after 25 epochs. Notably, the lower MSE on the testing data during validation indicates the model's effective generalization to new, unseen data.

Figure 4.5 Comparative loss analysis with and without transfer learning: a) Train Loss; b) Validation Loss.

The visual representations demonstrate that the Autoencoder excels at learning valuable features from MFL data. Additionally, transfer learning accelerates convergence to an acceptable loss level within a smaller number of training epochs, highlighting the benefits of leveraging existing knowledge for faster learning and improved generalization.

The optimal pre-trained Autoencoder network, obtained from training on the experimental data, is used to validate the remaining 30% of the data. During this process, the newly generated data is employed to augment the original experimental MFL dataset. Consequently, 154 additional sets of newly generated data are combined with the original dataset labeled "OR" and denoted as "GE." To assess the impact of data augmentation, the relationship between the "GE" and "OR" datasets is examined by analyzing the outcomes of the proposed CNN and DE models in terms of direction and strength. The directional relation can be quantified through the covariance, which is expressed as:

$$ COV(S_{GE}, S_{OR}) = \frac{1}{N-1} \sum_{i=1}^{N} \big(S_{GE}(i) - \bar{S}_{GE}\big)\big(S_{OR}(i) - \bar{S}_{OR}\big), \quad S = \{Accuracy, LogLoss, Brier\} \qquad (4.20) $$

where $i$ indexes the liftoff variations and $\bar{S}_{(.)}$ is the averaged scoring index. Moreover, the correlation indicator is applied to determine how strongly the two variables are related, which can be written as follows:

$$ CORR(S_{GE}, S_{OR}) = \frac{cov(S_{GE}, S_{OR})}{\sigma_{S_{GE}} \, \sigma_{S_{OR}}} \qquad (4.21) $$

where $\sigma_{S_{(.)}}$ denotes the standard deviation of the scoring index. Based on the "OR" and "GE" MFL data, the corresponding relation is evaluated in terms of classification accuracy and the uncertainty scoring indices, the calibrated Log Loss and Brier score, as presented in Table 4.2. The results from both the CNN and DE models show that all the scoring indices' covariance indicators are positive and their correlation indicators are close to 1. This confirms a strong positive correlation between the "OR" and "GE" datasets, which further supports the feasibility and efficiency of using a pre-trained Autoencoder network to address data deficiency in MFL experimental scenarios. Therefore, the final augmented MFL dataset "GE" is used for further learning-based defect detection and uncertainty estimation.

Table 4.2 Performance evaluation on augmented MFL experimental data

                        CNN                               DE
S            Accuracy   Log Loss   Brier      Accuracy   Log Loss   Brier
Direction    0.0025     0.0721     0.1963     0.0060     0.0725     0.0301
Strength     0.9824     0.9806     0.9874     0.9965     0.9987     0.9955

4.5.5.4 Performance Evaluation for Uncertainty Analysis

As discussed before, network calibration is beneficial for improving the prediction reliability of the modeling.
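As a concrete illustration of the Platt-scaling step in Equation (4.13), the sketch below recalibrates deliberately overconfident probabilities. It is a minimal example under assumptions: the synthetic scores and the logistic-regression shortcut (an equivalent parameterization of the sigmoid fit) are not the thesis implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic example: uncalibrated (overconfident) probabilities for the true class
# and whether each prediction was actually correct (assumed data, illustration only).
rng = np.random.default_rng(0)
p_uncal = np.clip(rng.beta(8, 2, size=500), 1e-6, 1 - 1e-6)   # clustered near 1.0
correct = (rng.random(500) < 0.8 * p_uncal).astype(int)       # true hit rate is lower

# Platt scaling: fit p_cal = 1 / (1 + exp(A*p + B)) by maximum likelihood.
platt = LogisticRegression(C=1e6)            # effectively unregularized sigmoid fit
platt.fit(p_uncal.reshape(-1, 1), correct)
p_cal = platt.predict_proba(p_uncal.reshape(-1, 1))[:, 1]

print("mean uncalibrated prob:", p_uncal.mean())
print("mean calibrated prob:  ", p_cal.mean())
print("empirical accuracy:    ", correct.mean())
```

After calibration, the mean predicted probability should track the empirical accuracy; the Static Calibration Error introduced next quantifies exactly this agreement, bin by bin.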
For multiclass scenarios, the Static Calibration Error (SCE) is usually applied to evaluate calibration performance by measuring the difference between the confidence and the accuracy of a model [230]. Specifically, the model predictions are divided into $N$ equally spaced bins separately for each class $j$, and the calibration error is computed within each bin. The final result is obtained by averaging the calibration error across all the bins. For each bin $B_{ij}$, the accuracy $acc(B_{ij})$ represents the fraction of correct predictions, while the confidence $conf(B_{ij})$ corresponds to the mean of the maximum probability for each data point. The SCE can be described as follows:

$$ SCE = \sum_{i=1}^{N} \sum_{j=1}^{M} \frac{|B_{ij}|}{K} \, \big| acc(B_{ij}) - conf(B_{ij}) \big| \qquad (4.22) $$

where $N$ and $M$ denote the number of bins and the total number of classes, respectively, and $K$ is the total number of data points. The corresponding calibration comparison results on the MFL experimental classification problem for CNN and DE are listed in Table 4.3.

Table 4.3 The effects of calibration on the mean of SCE

         CNN      DE
Uncal    9.16%    8.72%
Cal      4.46%    3.64%

It can be seen that Platt scaling leads to a noticeable decrease in prediction errors for both models. This reduction in prediction errors helps to decrease the uncertainty within the system and ultimately enhances the model's reliability, which provides a good basis for investigating the uncertainties introduced by the liftoff variation.

To compare the uncertainty estimation of CNN and DE, we assess their performance on the augmented MFL experimental data using the calibrated scoring index $S$ with varying liftoff, as shown in Figure 4.6. In Figure 4.6 (a), the consistent decrease in prediction accuracy with increasing liftoff variation suggests sensitivity to liftoff changes in both models. However, the CNN exhibits greater robustness, maintaining higher classification accuracy regardless of liftoff variation compared to the DE. Regarding the total predictive uncertainty in Figure 4.6 (b) and (c), an increase in liftoff variation raises the total predictive uncertainty for both CNN and DE, showcasing their ability to assess uncertainties in this application. The results also highlight a more significant difference in accuracy and predictive uncertainty between liftoff 2 and liftoff 3 than between liftoff 2 and liftoff 1, indicating an exponential decline in classification capability with liftoff changes. Although the Brier score implies less uncertainty in the CNN than in the DE, the overall results demonstrate that the CNN achieves higher prediction accuracy and lower Log Loss than the DE.

Figure 4.6 Comparison performance of mean (asterisk) and variance (shadowed bounds) for CNN and DE under different uncertainty: a) Prediction Accuracy; b) Log Loss; c) Brier.

Figure 4.7 Uncertainty estimation of CNN.

Furthermore, as discussed earlier, the CNN model with Dropout can evaluate the uncertainty components arising from the learning model and from the data by identifying the aleatoric and epistemic uncertainty during the prediction stage. The results are presented in Figure 4.7. The findings demonstrate a significant increase in aleatoric uncertainty from 0.04 to 0.08 under liftoff variation. Although there are fluctuations in the epistemic uncertainty, the aleatoric uncertainty remains around four times larger than the epistemic uncertainty. Therefore, the uncertainty attributed to the model is negligible compared to the data uncertainty, and the aleatoric uncertainty is mainly influenced by variations in the data.
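The decomposition just described can be sketched as follows. This is a minimal illustration under assumptions: the entropy-based split of the predictive uncertainty into expected entropy (aleatoric) and mutual information (epistemic) is one common estimator for Monte Carlo dropout classifiers, not necessarily the exact estimator used in this work, and the synthetic softmax outputs are made up.

```python
import numpy as np

def mc_dropout_uncertainty(probs):
    """Split predictive uncertainty from T stochastic forward passes.

    probs: array of shape (T, n_classes), softmax outputs obtained with dropout
           kept active at test time (Monte Carlo dropout).
    Returns (aleatoric, epistemic) in nats, using the entropy decomposition:
    total entropy of the mean = expected entropy (aleatoric) + mutual information (epistemic).
    """
    eps = 1e-12
    mean_p = probs.mean(axis=0)
    total = -np.sum(mean_p * np.log(mean_p + eps))                       # predictive entropy
    aleatoric = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))    # expected entropy
    epistemic = total - aleatoric                                        # mutual information
    return aleatoric, epistemic

# Example: 50 stochastic passes over a 3-class defect prediction (synthetic numbers).
rng = np.random.default_rng(1)
passes = rng.dirichlet(alpha=[20.0, 4.0, 2.0], size=50)
print(mc_dropout_uncertainty(passes))
```

In this reading, a large aleatoric term relative to the epistemic term indicates that the remaining uncertainty comes mainly from the data rather than from the model, matching the observation above.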
The effectiveness of the proposed CNN model in uncertainty estimation has been discussed in the previous sections. Now, we examine its capability to guide decision-making for new, unseen MFL data. Due to the time-intensive nature of collecting MFL data, four groups of new MFL data were obtained, comprising 36, 72, 108, and 144 sets of data, respectively. The liftoff variance was averaged across all samples in each group, and these four sample groups were input into the CNN model to evaluate its classification performance on defect size.

Figure 4.8 illustrates the percentage of wrongly classified instances in each sample group under different liftoff variations. Larger liftoff values lead to higher prediction bias. On average, the percentage of incorrect classifications under varying liftoff uncertainties (from liftoff 1 to 3) is 25%, 30%, and 42%, respectively. It is important to note that this analysis used only four sample sets, and a more stable trend can be expected with a larger number of samples. However, a relatively consistent relation between liftoff variance and the wrong-classification percentage is observed, making the results of Sample 4 a reliable representation of the true underlying trends in this application.

For better insight into the predictions under different liftoffs for Sample 4, the confusion matrix is utilized to provide a detailed breakdown of the model's predictions, as shown in Figure 4.9. With increasing liftoff uncertainty, class 1 defects maintain a high classification rate of 87.5%, while the other two defects, especially class 3, are more prone to misclassification. Additionally, using the confusion matrix, the F1 score, a metric that considers false positives and false negatives and represents the harmonic mean of precision and recall, is computed. Among these three results, LO1 demonstrates the highest F1 score at 74.72%, indicating that increased uncertainty reduces the model's ability to accurately identify positive instances and to recover most of the actual positive instances. Overall, both the bar plot results and the confusion matrix support the earlier observation in Figure 4.8 that the classification capability deteriorates exponentially with larger liftoff changes.

Figure 4.8 Prediction performance for new MFL data.

Further, in order to propose a quantitative way to evaluate and determine the reliability of the classification for a new input, two feature indexes are considered in this case:

• Confidence Index $CI$: indicates the degree of confidence in the classification performance, with higher CI values indicating lower uncertainty and vice versa [91]. The formula can be expressed as:

$$ CI = abs(L_1 - L_2) $$

As each predictive probability is generated from the Softmax function, a probability is assigned to each class for one prediction. $L_1$ is the negative log-likelihood of the probability for the correctly classified class, while $L_2$ is the negative log-likelihood of the maximum probability calculated among the other, wrong classes.

• Weighted Predictive Uncertainty $U$: the Log Loss, Brier, and aleatoric uncertainty are all capable of revealing the uncertainty in the classification to varying degrees. To combine these uncertainty indexes, the Minimum Redundancy Maximum Relevance (MRMR) method is used to rank them by finding the optimal feature set that effectively represents the response variable while minimizing redundancy between features [231].
The resultant ranking determines the importance weights for each uncertainty index and thus generates the weighted total predictive uncertainty, which can be expressed as:

$$ U = a \cdot U_{loss} + b \cdot U_{brier} + c \cdot U_{aleatoric}, \quad \{a, b, c\} = \{0.624, 0.318, 0.058\} \qquad (4.23) $$

Figure 4.9 Confusion matrix of predictions under different liftoffs: a) LO1; b) LO2; c) LO3.

To assess the performance of defect classification on new MFL data, the corresponding feature indexes from Sample 4 are extracted and presented in Figure 4.10. In the figure, correctly classified data points are marked as red dots, while incorrectly classified ones are marked as green dots. Two boundaries are established to enhance the decision-making process guided by uncertainty: the Confidence Index (CI) threshold and the weighted total uncertainty boundary. The CI threshold, marked with a brown line, serves as the initial boundary in uncertainty-guided decision-making. It is determined by the CI of the wrongly predicted sample with the highest CI. Any sample with a CI higher than this threshold is considered to have a trustworthy classification, regardless of its uncertainty index. Otherwise, samples with a CI lower than this threshold are evaluated against the uncertainty decision boundary, depicted with a pink curve. This boundary is generated using Quadratic Discriminant Analysis (QDA), a statistical algorithm for classifying data into groups by modeling the distributions of the independent variables (predictors) for each group with a quadratic function [232]. When a new classification is made, the evaluation steps mentioned above are followed to determine the reliability of the classification. Based on these steps, Figures 4.11 and 4.12 show examples of new MFL images with correct and wrong predictions, respectively, with the corresponding feature indexes CI and U listed along with the true class and the predicted class. These examples emphasize the significance of considering both factors in the prediction process, demonstrating the effectiveness and practicality of the proposed uncertainty guidance process.

4.6 Conclusion

In this chapter, we delve into several classical and widely recognized methods for propagating uncertainty in NDE applications. These methods aim to assess how the posterior distribution varies during the 'Learning' phase, considering different distribution hypotheses relevant to NDE problems. The inverse learning uncertainty propagation (UP) process is generally categorized into three scenarios: physics-informed, data-driven, and hybrid. Under these scenarios, different methods are employed to address two principal uncertainty types: parametric and structural uncertainties. Additionally, we explore an MFL-based defect classification problem as an example of hybrid learning-based UP analysis aimed at incorporating uncertainties into the final prediction. This uncertainty-guided decision-making process provides more insight into the prediction results and therefore increases the reliability of the classification results.

Figure 4.10 New MFL sample distribution based on the Confidence Index and Weighted Total Uncertainty with noted CI threshold (brown) and uncertainty decision boundary (pink).
Figure 4.11 Examples of correctly predicted MFL images: the first row shows images with high CI, while the second row shows images with low uncertainty, even with low CI.

Figure 4.12 Examples of wrongly predicted MFL images, which have low CI and high uncertainty.

Further, a Bayesian approximation-based learning model is applied as a supportive case that provides a comprehensive and practical solution for uncertainty estimation in experimental MFL defect classification. This work has introduced a valuable research framework aimed at classifying defects and enhancing prediction reliability by incorporating transfer learning-assisted Autoencoder-based data augmentation, learning-based defect classification, and uncertainty analysis. Furthermore, we proposed guidance to determine the reliability of the classification of new, unseen MFL data with two feature indexes: the confidence index CI and the weighted total uncertainty U. These key elements collectively contribute to the robustness and effectiveness of this work and further enhance the reliability of the classification outcomes.

CHAPTER 5 RELIABILITY EVALUATION TO NDE PROCESS WITH UQ

Reliability in NDE encompasses the ability of an NDE method to provide consistent and accurate information about the inspected material or component. Other than the aforementioned uncertainty among the stools, there are numerous controlled and uncontrolled factors related to the reliability of the NDE inspection process. As mentioned in [233], relevant uncertainty may come from the specimen condition, sensor types and numbers, inspection setup and calibration, environmental conditions, and operator expertise. To better understand system reliability, depending on the availability of specified uncertainty sources, two popular uncertainty estimation approaches, measurement uncertainty evaluation and the probability of detection (POD), are investigated in this section.

5.1 Probability of Detection

Probability of Detection (POD) is a widely adopted approach for quantifying the capability of an NDE inspection by providing a statistical description of its ability to identify cracks or, more generally, defects in structures [234–236]. As mentioned in the previous section, the parameters of defect characteristics or material properties are uncertain, and all inspection equipment has noise, which directly affects the sensing system's sensitivity in defect characterization. In this case, POD is able to evaluate the likelihood that a given NDE method will correctly detect a flaw or defect of interest based on a specified uncertain characteristic developed through experimental testing, thus providing critical information for quality control, safety, and compliance in various industries [237]. Generally, two common variants of POD analysis are used to describe the detection results, referred to as the $\hat{a}$ vs $a$ and Hit/Miss approaches. Hit/Miss analysis is a typical approach for measuring how likely it is that a flaw will be detected. In cases where establishing a clear relation between flaw size and flaw response is challenging, or where quantifying the response is difficult, the target is to obtain the POD curve from binary results under a clearly defined hit/miss criterion: hits (correctly found cracks) and misses (cracks not found in the inspection). In this binary detection scenario, the testing results can be characterized in terms of four quantities: True Positive (TP), False Negative (FN), False Positive (FP), and True Negative (TN).
The corresponding POD can be modeled in terms of the probability of a true positive [236]:

$$ P(TP) = \frac{TP}{TP + FN} \qquad (5.1) $$

Different from the Hit/Miss method, the $\hat{a}$ vs $a$ method, also referred to as signal-response data, deals with the response signal values directly and thus contains more information related to the detected flaw, such as flaw size and location. Specifically, $\hat{a}$ is the measured magnitude response or signal amplitude produced by a flaw of size $a$, related to it through the estimated POD($a$) function [233]. Considering the uncertainty affecting the measurements, the objective is to determine a decision threshold ($\hat{a}_{th}$) that reduces false alarms caused by noise while maximizing the detection of cracks.

The fundamental assumptions of both POD analysis models are similar: the POD curve of a specific parameter of interest (such as flaw size) versus the probability of detection exhibits a rising trend, as shown in Fig. 5.1. Ideally, the probability of detection can reach 100% for a sufficiently large defect size [238]. Considering the uncertainties in NDE inspection, a realistic POD curve cannot provide such a clean dependence on crack size. A 90% detection probability with a 95% confidence level is therefore set as the inspection capability evaluation index $a_{90/95}$. Detection reliability can only be guaranteed when the detected parameter is above the $a_{90/95}$ threshold. The determination of this threshold is therefore of great importance in POD analysis, and it varies across NDT applications [233]. As concluded in [239], the log-logistic distribution and an approximately linear relationship between $\ln(\hat{a})$ and $\ln(a)$ provide the most applicable statistical bases for deriving the POD curve for the Hit/Miss and $\hat{a}$ vs $a$ models, respectively. Further, the proposed POD curve basis can be extended and customized for different NDE applications. For example, a groundbreaking recommended practice (RP) introduced for fatigue cracks in offshore structures in May 2015 recommended several common POD distribution functions for different NDE methods such as EC, UT, and MPI [240].

Figure 5.1 Typical POD curve.

Based on the characteristics of each POD analysis model, the selection of the applied method varies with the type of NDE inspection, the data collected, and the specific goals of the analysis. Hit/Miss analysis has been widely applied in quantitative assessment of the flaw response for system reliability evaluation, such as visual inspection [241], penetrant inspection [242], magnetic particle inspection [243], and ultrasonic testing [244]. The $\hat{a}$ vs $a$ approach is applicable when a quantitative signal response can be correlated with flaw size, as is typically attainable with techniques such as ultrasonic [244] or eddy-current inspection [245]. Further, to understand and compare the performance of these two POD models, a manual aerospace eddy-current inspection and a nuclear-industry phased-array ultrasonic weld inspection were analyzed with both models by Virkkunen et al. [238]. The results show that uncertainties in inspector judgment or data format can produce significantly different POD curves.

Both the $\hat{a}$ vs $a$ and Hit/Miss models have limitations in constructing a dependable full POD curve, being restricted by data availability, computation requirements, and costs. Therefore, a newer model-building technique, Model Assisted POD (MAPOD), has been proposed, which requires less physical data and effort [246].
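Before turning to MAPOD, the basic Hit/Miss fit described above can be sketched as follows. This is an illustrative outline only: the flaw sizes and outcomes are hypothetical, the logistic regression on log flaw size is one standard realization of the log-logistic link, and only the point estimate of $a_{90}$ is computed (the certified $a_{90/95}$ value additionally requires a 95% lower confidence bound, e.g., from likelihood-ratio or bootstrap methods).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical hit/miss records: flaw sizes (mm) and detection outcomes (1 = hit, 0 = miss).
sizes = np.array([0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0, 1.2, 1.5, 2.0, 2.5, 3.0,
                  0.25, 0.35, 0.45, 0.55, 0.7, 0.9, 1.1, 1.4, 1.8, 2.2, 2.8, 3.5])
hits  = np.array([0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1,
                  0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1])

# Fit POD(a) with a log-logistic link: logistic regression on ln(a).
model = LogisticRegression(C=1e6, max_iter=1000)
model.fit(np.log(sizes).reshape(-1, 1), hits)

# Evaluate the fitted POD curve and read off the flaw size with 90% detection probability.
grid = np.linspace(0.2, 4.0, 400)
pod = model.predict_proba(np.log(grid).reshape(-1, 1))[:, 1]
a90 = grid[np.argmax(pod >= 0.90)]
print(f"Estimated a90 = {a90:.2f} mm (point estimate only)")
```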
In MAPOD modeling, statistical models are employed together with true inspection data to improve the understanding of the nature of the detected defects. Based on the information integration scenario, Meyer et al. classified MAPOD into Bayesian and non-Bayesian methods [247]. Non-Bayesian MAPOD approaches are chosen when there is a preference for statistical or data-driven methods that may be better suited to the specific characteristics of the data or the nature of the defects being detected. For example, physics-based modeling was performed with an $\hat{a}$ vs $a$ POD curve to determine the influence of fatigue cracks in ET [248]. Harding et al. used the transfer function method, which relies on a data-driven linear regression model of POD parameters, to estimate the POD for fatigue cracks in aircraft wings under UT inspection [249]. In contrast, Bayesian MAPOD relies on the availability of prior knowledge or expert opinion, which allows a principled integration of existing knowledge with new data to make more accurate and reliable defect classification decisions. For example, in a high-frequency eddy current-based fatigue crack detection study, Bayes' theorem was combined with computer modeling to provide 'prior information' and obtain posterior estimates of model parameters for generating more data; the resulting data were fitted with the logistic function, yielding a reliable Hit/Miss POD curve [250]. Besides these, there are other model-based approaches for predicting POD curves in the NDE area, such as signal/noise models, image classification models, human reliability indexes, and so on [251].

Overall, each POD model has its advantages and limitations in different NDE applications, and it is hard to find uniform guidelines. However, Virkkunen et al. have divided the uncertainty sources in POD estimation into five aspects [238]: statistical sampling error, measurement variation, configurational variation, variation in crack characteristics, and inspector judgment, which are also considered as the 'Data Uncertainty' in the proposed TLS model. They also characterized each POD model's capability in the face of these uncertainties, which is beneficial for developing an optimal POD-based reliability analysis in NDE.

5.2 GUM-based Measurement Uncertainty Evaluation

During the NDE measurement process, uncertainties may arise from factors such as the measurement instruments, environmental conditions, or operator influences. These combined uncertainties contribute to the variations in the final measured outputs, making it difficult to quantify the relationship between the inputs and the measured output directly. Instead of constructing a complex POD curve for each specified uncertainty factor, it is essential to provide an overall estimate of the uncertainty of the measurement process, which helps to evaluate how well the measurement represents the quantity being measured. The Guide to the Expression of Uncertainty in Measurement (GUM) [62], published by the Joint Committee for Guides in Metrology (JCGM), is the most commonly used method for analyzing measurement uncertainty. GUM provides a standardized and systematic approach for accurately reflecting all propagated uncertainty relevant to the measurement and thus evaluating the quality of the numerical measurements.
In the context of the GUM model, the precision of the measurement system is determined from three aspects: repeatability, reproducibility, and the maximum indication error of the instrument [252]. Repeatability quantifies the consistency of measurements of the same quantity under the same operations and equipment. Reproducibility assesses the consistency of measurements across varying conditions, different operators, or different equipment and methods. The indication error of the instrument pertains to the inherent accuracy and precision of the measurement instrument due to its limitations, which can be obtained from the producer's datasheet [253].

In the context of measurement uncertainty, Type A and Type B uncertainties are two distinct categories used to characterize and quantify different sources of uncertainty [254]. Theoretically, Type A uncertainty is associated with random or stochastic variations in the measurement and is quantified through statistical analysis of repeated and reproducible measurements or data. Type B uncertainty encompasses sources of uncertainty that are not characterized by random variations but are instead associated with systematic errors, approximations, or uncertainties in external parameters. These uncertainties are typically estimated through non-statistical methods, such as expert judgment, calibration data, or scientific knowledge. A detailed comparison between Type A and Type B uncertainty in terms of data source, probability interpretation, and statistical formulation is summarized below:

• Uncertainty data collection: Type A relies on a series of observations in terms of repeatability and reproducibility; Type B relies on available relevant information such as the manufacturer's specification, calibration certificates, handbooks, etc.

• Probability interpretation: Type A is a frequentist method, while Type B is a Bayesian method.

• Statistical expression in terms of standard uncertainty:

$$ \text{Type A: } U = \frac{S}{\sqrt{n}} \qquad (5.2) $$

$$ \text{Type B: } U = \frac{a}{\sqrt{3}} \qquad (5.3) $$

Specifically, Type A does not require prior information about the system and makes predictions using only the data from the current experiment to obtain the probability density function, while the Type B method encodes prior knowledge of similar experiments and combines it with current experimental data to form the probability density function. However, in some cases, prior information and mathematical modeling are hard to obtain. Moreover, the various uncertainty sources in practical applications are often correlated, which increases the difficulty of generating an accurate probability density function in Type A uncertainty analysis. Generally, the Type A and B approaches complement each other, and the choice of a particular analysis method should consider the specific needs of each application.

To provide a statistical evaluation of the overall uncertainty, each type of uncertainty should be standardized to the same confidence level by converting it into a standard uncertainty $U_s$. A standard uncertainty represents a range that can be conceptualized as 'plus or minus one standard deviation', which provides information about the uncertainty associated with an average value [254]. For Type A, the standard uncertainty is calculated from the estimated standard deviation $S$, as shown in Eq. 5.2. For repeatability uncertainty analysis, $n$ is the number of measurements, while for reproducibility uncertainty, $n$ denotes the number of different measurement groups used to make multiple independent tests.
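As a small numerical illustration of the Type A expression in Eq. (5.2), using made-up readings rather than data from this work:

```python
import numpy as np

# Type A (repeatability): n repeated readings of the same quantity under the
# same conditions (hypothetical values, for illustration only).
readings = np.array([2.31, 2.29, 2.34, 2.30])       # n = 4 repetitions
s = readings.std(ddof=1)                             # estimated standard deviation S
u_typeA = s / np.sqrt(readings.size)                 # U = S / sqrt(n), Eq. (5.2)
print(f"S = {s:.4f}, Type A standard uncertainty = {u_typeA:.4f}")
```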
For Type B, the standard uncertainty is represented by the maximum allowable error $a$, and normally the upper and lower limits of the uncertainty are assumed to follow a rectangular (uniform) distribution. After each type of uncertainty is standardized, the simplest way to obtain the combined total measurement uncertainty $U_c$ is 'summation in quadrature', which can be expressed as follows:

$$ U_c = \sqrt{U_{repeat}^2 + U_{reproduce}^2 + U_{system}^2} \qquad (5.4) $$

For more complicated cases, there are further variants for obtaining $U_c$ relying on addition, multiplication, etc., which are discussed in [255]. Afterward, the expanded uncertainty $U_e$ is needed to establish a confidence interval describing the total measurement uncertainty with a coverage factor $k$, expressed as:

$$ U_e = k \cdot U_c, \qquad (5.5) $$

where $k = 1.96$ provides coverage with 95% confidence.

Following the GUM-guided measurement uncertainty analysis, the reliability of NDE inspections has been evaluated and tailored to the specific needs of each application. For example, Morana et al. evaluated the measurement uncertainty associated with ultrasonic thickness measurements, where the impacts of the response and of contact surface roughness were investigated through reproducibility tests [256]. In a structured light sensing-based defect detection study, the relationship between the number of repetitions and the measurement uncertainty was analyzed by providing the uncertainty range in defect size estimation. The calculated uncertainty is considered the best estimate of the correction for the measurement error, and a low uncertainty value illustrates the high reliability of the measurement system [257]. Besides, to track the relationship between the number of repetitions and the uncertainty, the statistical testing scheme Analysis of Variance (ANOVA) was applied. In this process, the within-group mean square serves as an estimate of the variance of the measured quantity when measured under identical conditions, which provides a numerical estimate of how much each factor contributes to the overall uncertainty [258]. ANOVA is a powerful tool when dealing with multiple factors or sources of variation and has been widely applied in NDE-based measurement uncertainty analysis [259–261]. The computational procedure of ANOVA is illustrated in Fig. 5.2. Note that $S$ is determined from the group of candidate standard deviations obtained from the possible combinations of measurements $\binom{n}{m}$, which is decided case by case to obtain the best selection for uncertainty estimation.

Figure 5.2 Block diagram of the ANOVA procedure.

Generally, GUM-based measurement uncertainty analysis provides a good basis for ensuring inspection quality. It is important to consider the uncertainty factors when choosing the appropriate measurement uncertainty analysis model, to ensure that the model accurately reflects the associated uncertainties and aligns with the goals and standards of the specific NDE application. Given the above measurement uncertainty scheme, it is applied to two practical NDE-based cases to estimate the corresponding measurement uncertainty, which are addressed in the next sections.

5.3 Magnetic Barkhausen Noise-based Material Fatigue Detection

5.3.1 Background

Martensitic-grade stainless steel is usually used to manufacture steam turbine blades in power plants. Due to the cyclic loading applied to these blades, the stainless-steel material becomes fatigued, initiating cracks and ultimately leading to complete fracture.
The failure of these turbine blades can contribute to expensive plant failures and safety concerns [262]. The fatigue crack formation process is a complex and dynamic sequence of events that involves the initiation, growth, and interaction of micro-cracks within the material [263]. An understanding of the fatigue crack formation process at the microstructural level is crucial for materials scientists, engineers, and researchers. It allows them to develop strategies to mitigate and manage fatigue-related issues, such as improving material properties, enhancing structural design, and implementing maintenance and inspection protocols to prevent failure due to fatigue cracking.

Magnetic Barkhausen Noise (MBN) is sensitive to microstructural changes in ferromagnetic materials and has great potential for measuring surface residual stress and other microstructural parameters [264]. For example, an MBN-based model has been applied to characterize the influence of carbon content in plain steels [265]; the results reveal that the separation (gap) between the two peaks of the MBN envelope decreases as the carbon content increases. The magnetic properties of ferrite-martensite dual-phase steels have also been evaluated using the MBN signal, revealing that Barkhausen noise profiles are correlated with ferrite grain size and with different percentages of martensite [266]. Furthermore, MBN is able to detect fatigue cracks in mild steel samples, where the influence of fatigue on the fractal characteristics of Barkhausen noise shows potential for detecting fatigue crack initiation [267]. Therefore, MBN emerges as a promising technique for early-stage fatigue detection in steel samples.

Generally, MBN is a complex magnetoelectric phenomenon and is associated with a high degree of uncertainty. One criticism directed at MBN is its limited repeatability and stability; the primary reasons behind these issues are measurement instabilities arising from varying experimental conditions and the absence of a standardized normalization principle [258, 268]. Sources of uncertainty in MBN analysis can be categorized into two main categories, data-related and calculation model-related, as discussed in [180]. In the context of data-related uncertainties, three primary aspects are considered. The first is the material's physical properties, which denotes the variability in the material's composition, microstructure, and mechanical properties. The second is the data-generating method, including the instrumentation and measurement setup. The third is the presence of experimental noise and interference during data acquisition, such as electromagnetic interference, sensor noise, and other environmental factors. To address these data-related uncertainties, it is crucial to employ a stable sensor that ensures consistent magnetization conditions and to implement accurate processing techniques for the obtained MBN signal. Additionally, the precision and reliability of the measurement system play a significant role in assessing the uncertainties associated with the collected data. A well-calibrated and high-precision measurement system contributes to a more accurate evaluation of the uncertainties, ultimately enhancing the quality and reliability of MBN analysis.

The primary objective of this work is to explore the applicability of MBN in detecting fatigue damage while diminishing and rectifying uncertainties that may arise during the measurement process.
The features derived from the processing of raw MBN signals are carefully selected from both the time and frequency domains, considering their relationship with microstructural material properties. Subsequently, Principal Component Analysis (PCA) is applied to extract higher-level features. In addition, a Probabilistic Neural Network (PNN) is utilized to classify samples based on the percentage of remaining fatigue life, a factor expected to be closely linked to the specimen's fatigue life and with the potential for early fatigue onset prediction. Moreover, a comprehensive uncertainty analysis is conducted to address and mitigate the noise present in the acquired MBN data. To further enhance measurement precision, a statistical technique, Analysis of Variance (ANOVA), is employed to assess repeatability.

5.3.2 Experimental Setup

The experimental setup can be described as follows. First, a wave generator produces a low-frequency sinusoidal excitation signal, which is then amplified using a power amplifier. This amplified signal is supplied to an excitation coil, generating changes in magnetization within the martensitic steel sample. A pick-up coil captures the resulting MBN voltage signal from the sample and transmits it to an NI DAQ card. LabView processes these data to create the MBN data file, which is further analyzed in MATLAB to derive features related to the steel sample's fatigue life. Given the sensitive and noise-like nature of the MBN signal, an efficient sensor is essential for accurate measurements. The MBN sensor assembly includes a U-shaped magnetic core, a pick-up coil, and excitation coils. The pick-up coil is positioned on the sample and attached to the magnetic core with a customized holder. Optimization of the pick-up coil's number of turns and analysis of MBN signals in both the time and frequency domains are critical for obtaining the best pick-up signal results, as described in [269].

The test inventory consists of 36 samples, each representing a different stage of fatigue, characterized by loading cycles ranging from 0 to 2,000,000 cycles. Subsequently, EPRI conducted fatigue testing on these samples until failure, during which the lifespan of each sample was determined. It is worth noting that, of the 36 samples, only 26 have information available regarding their total loading cycles at the point of failure. Based on the loading cycles observed at the time of the NDE test and the total loading cycles at the point of failure, a metric known as the "percentage fatigue life" is introduced, defined as:

$$ \text{Fatigue Life} = \frac{\text{Total Loading Cycles} - \text{Cycles at NDE test}}{\text{Total Loading Cycles}} \qquad (5.6) $$

The samples are re-categorized according to the percentage fatigue life. The categorization is compared between the Cycles at NDE Test (without ground truth) and the percentage fatigue life (ground truth), as shown in Table 5.1. The samples are categorized into No-Fatigue, Mid-Fatigue, High-Fatigue, and Cracked.
Table 5.1 Categories with and without ground truth

Loading Cycles at NDE test (No Ground truth)
Sample Category                          Number of Samples
No-Fatigue (Untested)                    3
Mid-Fatigue (150,000-750,000)            19
High-Fatigue (900,000-2,000,000)         4
Cracked                                  4

Percentage Fatigue Life (Ground truth)
Sample Category                          Number of Samples
Low Fatigue (75% to 100%)                10
Mid-Fatigue (40% to 75%)                 12
High-Fatigue (0% to 40%)                 4
Cracked (0%)                             4

MBN data are collected at 19 points: 18 of them lie on a 3 x 6 grid in the region of interest, representing the fatigued area, and the last point lies in the un-fatigued area and serves as the reference point. For each selected feature of a given sample, the feature value is normalized by averaging the values at the 18 scanning points and dividing by the feature value at the corresponding sample's reference point, as described in Eq. 5.7:

$$ N_{kl} = \frac{\sum_{i=1}^{18} S_{ikl}}{18 \cdot R_{kl}} \qquad (5.7) $$

where $i$ indexes the scanning points in the fatigued area, ranging from 1 to 18; $k$ denotes the sample number, ranging over the 30 samples (including the cracked ones); and $l$ is the corresponding feature. $S_{ikl}$ represents the feature value at each scanning point, while $R_{kl}$ is the feature value at the corresponding reference point.

5.3.3 Defect Classification in MBN

5.3.3.1 Defect Classification with Probabilistic Neural Network

Based on the MBN signals obtained from each sample, various features are evaluated in the time and frequency domains. In the time domain, the shape of the MBN profile shows systematic and distinct variations in the magnetization process with respect to different microstructures [270]. Therefore, features are selected based on the information provided by the MBN signal profile. Since the signal profile exhibits symmetry about the x-axis, the upper part of each periodic MBN signal at every scanning point is represented using a combination of two Gaussian distributions. This modeling approach has been demonstrated to be effective in extracting features related to material microstructures, including metrics such as the volume fractions of ferrite and pearlite as well as the average grain size [270] [269]. The combination of the two Gaussian fitted curves (depicted in blue and green) produces the resultant curve (shown in red), which characterizes the upper portion of the MBN signal profile. As shown in Fig. 5.3 (a), the selected time domain features are the signal peak, the full width at half maximum (FWHM) of the upper MBN profile, and the peak difference between the Gaussian fitting curves. Also, the fast Fourier transform (FFT) is applied to the time domain MBN signals to obtain the corresponding frequency spectrum, from which the maximum spectrum amplitude, denoted AMP, and the frequency at the maximum spectrum amplitude, denoted POS, are selected as features, as shown in Fig. 5.3 (b). Besides, the energy of MBN is associated with the misorientation angle of grain boundaries, which affects the alignment of magnetic domains along these boundaries. As per Parseval's theorem, the energy of a signal remains constant in both the time and frequency domains; consequently, the energy of an MBN signal is calculated by summing the signal's spectral energy density across all frequency components.

Figure 5.3 Illustration of extracted features: a) time domain features: Peak, FWHM, and Diff; b) frequency domain features: AMP and POS.

Principal Component Analysis (PCA) is a method employed to accentuate variations and reveal significant patterns within a dataset.
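Before turning to PCA, the feature extraction and normalization steps just described can be sketched as follows. This is an illustrative outline on a synthetic profile: the helper names, the initial-guess heuristics, and the reading of the 'Diff' feature as the separation of the fitted Gaussian peak positions are assumptions, not the exact thesis implementation.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_gaussians(t, a1, m1, s1, a2, m2, s2):
    """Sum of two Gaussians used to model the upper MBN profile."""
    return (a1 * np.exp(-(t - m1) ** 2 / (2 * s1 ** 2))
            + a2 * np.exp(-(t - m2) ** 2 / (2 * s2 ** 2)))

def mbn_features(t, profile, fs):
    """Extract the six features discussed above from one MBN profile."""
    peak = profile.max()
    above = np.where(profile >= peak / 2)[0]
    fwhm = t[above[-1]] - t[above[0]]                    # FWHM of the upper profile
    # Two-Gaussian fit; 'Diff' is read here as the separation of the fitted peak positions
    p0 = [peak, t[len(t) // 3], 0.1 * (t[-1] - t[0]),
          peak / 2, t[2 * len(t) // 3], 0.1 * (t[-1] - t[0])]
    popt, _ = curve_fit(two_gaussians, t, profile, p0=p0, maxfev=10000)
    diff = abs(popt[4] - popt[1])
    # Frequency-domain features from the FFT of the profile
    spec = np.abs(np.fft.rfft(profile))
    freqs = np.fft.rfftfreq(profile.size, d=1.0 / fs)
    amp, pos = spec.max(), freqs[spec.argmax()]
    energy = np.sum(spec ** 2) / profile.size            # Parseval-based energy estimate
    return np.array([peak, fwhm, diff, energy, amp, pos])

# Synthetic demonstration profile and an Eq. (5.7)-style normalization against a reference point.
fs = 10_000.0
t = np.linspace(0, 0.1, 1000)
profile = two_gaussians(t, 1.0, 0.03, 0.008, 0.6, 0.06, 0.012)
scan_feats = np.array([mbn_features(t, profile * s, fs) for s in np.linspace(0.9, 1.1, 18)])
ref_feats = mbn_features(t, profile, fs)
normalized = scan_feats.mean(axis=0) / ref_feats          # N_kl = mean over 18 points / reference
print(normalized)
```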
In this case, where six distinct features have been introduced, PCA serves as a valuable tool for gaining a more insightful understanding of these features. PCA identifies higher-order features, known as principal components, which provide a more refined representation of the earlier findings. Additionally, PCA excels at reducing dimensionality, enhancing interpretability, and minimizing the loss of information [271]. The fundamental steps of the PCA calculation are as follows: first, compute the eigenvalues and eigenvectors of the covariance matrix of the data, which represents the correlations among the features; then project the original feature space onto the leading eigenvectors, resulting in a lower-dimensional feature space. This newly calculated feature space comprises the principal components, which serve as the new features.

A probabilistic neural network (PNN) is a feedforward network formulation for probability density estimation. Unlike traditional back-propagation neural networks, the PNN shows excellent and efficient performance on limited datasets and has been widely used in NDE problems. In an eddy current-based defect detection case, the proposed PNN was able to identify untrained eddy current patterns with 100% classification accuracy [272]. In [273], a Kernel PCA followed by a PNN algorithm was applied to acoustic data from rolling-bearing rotating machinery to classify the machine state. In this work, the obtained MBN features are modeled by the PNN to classify the low-fatigue, mid-fatigue, and high-fatigue samples. During the network training process, a multi-Gaussian function is centered on the associated input feature vectors in each class, with a total of $k = 3$ classes defined; for the output layer, the summed Gaussian output is defined. Then, for a test feature vector $x_i$, where $i$ is the sample ID ranging from 1 to 26, all Gaussian function values at the hidden nodes are computed and passed to the single output node for each group of hidden nodes. All of the inputs are summed and multiplied by a weighting function. With the applied Softmax function, the corresponding output vector $\alpha_{ik}$ is obtained, which provides the probability of belonging to each category. The maximum value within the output vector determines the classification result. Over all test samples, the total classification accuracy is defined as Accuracy = $Y/P$, where $Y$ is the number of correctly classified samples and $P$ is the total number of samples. In order to describe the classification results with a confidence evaluation, the negative log-likelihood is applied to the largest probability, defining the final output class score $L = -\log(\max_k(\alpha_{ik}))$. For accurately classified samples, a Classification Index (CI) is introduced to evaluate the classification confidence:

$$ CI = abs(L_1 - L_2) \qquad (5.8) $$

Specifically, $L_1 = -\log(\max_k(\alpha_{ik}))$ is computed from the final output (the correctly predicted class), while $L_2$ is the negative log-likelihood of the maximum probability among the other two (wrong) classes; both are illustrated in Fig. 5.4. A large CI value means the classification is of higher confidence for the correct class, and vice versa.

Figure 5.4 Process to obtain CI.
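For clarity, a minimal sketch of the CI computation defined in Eq. (5.8), using made-up softmax outputs (the helper name is hypothetical):

```python
import numpy as np

def classification_index(alpha, true_class):
    """CI = |L1 - L2| from the softmax output vector of one sample (Eq. 5.8).

    L1: negative log-likelihood of the probability assigned to the correct class;
    L2: negative log-likelihood of the largest probability among the wrong classes.
    """
    alpha = np.asarray(alpha, dtype=float)
    l1 = -np.log(alpha[true_class])
    wrong = np.delete(alpha, true_class)
    l2 = -np.log(wrong.max())
    return abs(l1 - l2)

# A confident correct prediction yields a large CI; an ambiguous one yields a small CI.
print(classification_index([0.90, 0.07, 0.03], true_class=0))  # ~2.55 (high confidence)
print(classification_index([0.40, 0.35, 0.25], true_class=0))  # ~0.13 (low confidence)
```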
In the context of unbalanced datasets, the term "accuracy" can be misleading, since it primarily focuses on correctly identified cases. In this early fatigue stage detection, not all errors have the same significance; therefore, it is important to assess the risks associated with misclassifications. To achieve this, the confusion matrix serves as a widely used performance measurement technique for classification. It offers a more comprehensive understanding of the errors made by the classifier, including the types of mistakes. In binary classification scenarios, a confusion matrix can be used to measure Recall (the proportion of actual positives that are correctly classified), Precision (the proportion of predicted positives that are truly positive), and related quantities. Additionally, the F1-score takes the harmonic mean of Recall and Precision, F1 = 2 · Precision · Recall / (Precision + Recall), which penalizes extreme values and serves as a more reliable indicator, particularly for assessing unbalanced class distributions. A higher F1-score signifies a classification procedure of higher reliability and effectiveness.

5.3.3.2 Comparison Results

Based on the percentage fatigue life, the normalized feature value $N_{kl}$ for each sample is shown in Fig. 5.5. For each normalized feature, a box-and-whisker plot describes the sample distribution based on the ground truth category among the 26 samples, and the average normalized feature of each category is also presented. The results show that, for all six MBN features, there exist monotonic trends in the mean with percentage fatigue life. Except for the POS plot, the other five features are positively correlated with percentage fatigue life.

Figure 5.5 Comparison results of the six features.

In the context of PCA, it is essential to address the potential impact of uninformative features on the results. To account for this, we employ two distinct original feature sets and conduct PCA separately for each, allowing a comparison of the results. In the first scenario, the original features are the signal peak, the Full Width at Half Maximum (FWHM), the difference between the two peak curves, and the signal energy; these features have previously been established to exhibit a strong relationship with microstructures. Through PCA, two new features, PC1 and PC2, are generated to represent the original feature space. In the second scenario, all six features are retained as the original feature set, which is then reduced to four principal components (PC1, PC2, PC3, and PC4). In both cases, the new principal components collectively account for 98% of the total variance.
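A brief sketch of this variance-retention step is given below; it is illustrative only, with a random stand-in feature matrix whose shape follows the 26-sample, six-feature setting described above.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in feature matrix: 26 samples x 6 normalized MBN features
# (Peak, FWHM, Diff, Energy, AMP, POS) -- random values for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(26, 6))

pca = PCA()                        # keep all components, then inspect explained variance
scores = pca.fit_transform(X)      # columns are PC1, PC2, ...
cumvar = np.cumsum(pca.explained_variance_ratio_)
n_keep = int(np.searchsorted(cumvar, 0.98) + 1)   # smallest set reaching 98% variance

print("cumulative explained variance:", np.round(cumvar, 3))
print("components kept:", n_keep)
new_features = scores[:, :n_keep]  # reduced feature space passed on to the classifier
```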
To construct the new feature sets, we consider the first two components (PC1 and PC2) in Case 1 and the first three components (PC1, PC2, and PC3) in Case 2, as these components account for the higher proportion of the variance. These new feature sets are defined and subsequently assessed using a PNN. To enhance the reliability of the results, the PNN is trained and tested across 30 iterations, and the final accuracy is determined as the average of the values obtained in each iteration. In each train-test scheme, a random sample from each category is selected to ensure that every category is represented in the test set. The dataset is divided into three categories, and balanced cross-validation is applied for the train-test split. For samples that are correctly classified, the CI is calculated, and the final CI is derived as the average of all the obtained CIs. Given the necessity for three classifiers in our case, the classification performance is evaluated by computing the arithmetic mean of the per-class F1-scores, yielding the macro-averaged F1-score. Based on the preceding discussions, three distinct input feature spaces for the Probabilistic Neural Network (PNN) are defined as follows:

1. The six normalized MBN features: Signal Peak, FWHM, Diff, Energy, AMP, and POS;

2. The PCA results (PC1 and PC2) for Case 1: Peak, FWHM, Diff, and Energy;

3. The PCA results (PC1, PC2, and PC3) for Case 2: all six MBN features.

The corresponding PNN classification results are presented in Table 5.2. Among the three cases, the higher-order MBN features obtained by applying PCA to all six features achieve roughly 78% classification accuracy along with a high classification index, which indicates that the extracted principal components (PC1, PC2, and PC3) in this scenario are the most representative features for indicating the remaining fatigue life of the martensitic samples. Also, the macro-F1 score for the third input space is higher than for the others, which shows that the selected classifier has higher precision and robustness.

Table 5.2 PNN comparison results

                  Six features   PCA results for Case 1   PCA results for Case 2
PNN Accuracy      66.67%         66.67%                    77.89%
CI                1.5896         1.0769                    8.9922
Macro-F1 Score    0.4032         0.4859                    0.5951

For these extracted principal components, Fig. 5.6 shows scatter plots of the averaged feature values based on the category of loading cycles at the NDE test (no ground truth) and that of the percentage fatigue life (ground truth). The results show that, compared with simply using the loading cycles as the classification criterion, the extracted principal components show a monotonic trend with the ground truth information. Considering all the previous cases, even though in Case 2 the variance distribution spreads out over more components with some information loss, the extracted new MBN feature space is better at categorizing samples based on the percentage fatigue life.

Figure 5.6 Averaged PCA results compared between ground truth and no ground truth categories.

5.3.4 Uncertainty Analysis on MBN

The presence of uncertainties within the system can significantly impact the prediction capability, underscoring the importance of uncertainty assessment in the MBN system to gauge the reliability of the measurement results. Currently, the unavailability of complete material fatigue information and insufficient data on failures make it challenging to precisely identify the sources of uncertainty and to estimate them quantitatively.
To address this, efforts are concentrated on building a stabilized data collection system and obtaining reliable outputs in order to reduce the noise in the experiment and the signal. These efforts encompass various aspects, such as the stabilization and standardization of the sensor lift-off, circuit connections, and scanning areas. Multiple measurements are conducted, comprising four repetitions. More than ten half-periods of the MBN signal are averaged at each measurement point, followed by the application of the Root Mean Square (RMS) to present the signal intensity with normalized features. In this study, Type B uncertainty is ignored, and the measurement standard uncertainty is addressed in terms of repeatability only, through Eq. 5.2. Specifically, four repetitions ($n = 4$) are made for the uncertainty analysis. The uncertainty associated with each feature is presented through the corresponding expanded uncertainty. The standard deviation $S$ was determined, for every combination of $m$ measurements, by taking the maximum of the resulting standard deviations. The standard uncertainties were calculated for all selected features.

5.3.4.1 Analysis of Variance-based Uncertainty Quantification Results

Based on the previous discussion, PC1, PC2, and PC3 in PCA Case 2, as well as the original six features, have a monotonic relationship with the fatigue information. Therefore, all nine features are evaluated in this statistical uncertainty analysis. The comparison results for each feature's expanded uncertainty are illustrated in Fig. 5.7. The generated plot illustrates the relation between the uncertainties and the number of repetitions for each selected feature. As anticipated, the uncertainties exhibit a decreasing trend as the number of repetitions increases. In general, the uncertainties are quite low, suggesting that repeated measurements yield consistent results. Notably, features such as the signal energy, POS, and PC1 initially display higher uncertainties; however, with an increasing number of repetitions, these uncertainties decrease rapidly. This implies that conducting additional repeated experiments could significantly reduce the uncertainties of these features. It is worth noting that in this particular case only four repetitions were performed, so the robustness and comprehensiveness of this uncertainty analysis could be further enhanced with additional experiments.

Figure 5.7 Comparison results of expanded uncertainty.

5.4 Structured Light Sensing-based Defect Reconstruction

5.4.1 Background

Plastic pipes have become the prevalent choice for the distribution of natural gas since the early 1970s, and they remained the primary material in use as of 2017, as documented in [274]. However, the rigidity and strength of plastic pipes do not match those of steel pipes. This disparity renders plastic pipes vulnerable to damage caused by various factors, including improper excavation or installation, as well as excessive stresses within the pipe's operational environment [275]. Such damage can result in leaks and, in more severe cases, gas pipe explosions, which pose substantial risks. Consequently, the detection and identification of material degradation within the pipe walls are of significant importance. Various nondestructive evaluation-based methods have been developed and verified for inspecting plastic pipes.
5.4 Structured Light Sensing based Defect Reconstruction

5.4.1 Background

Plastic pipes have been the prevalent choice for the distribution of natural gas since the early 1970s and remained the primary material in use as of 2017, as documented in [274]. However, the rigidity and strength of plastic pipes do not match those of steel pipes. This disparity renders plastic pipes vulnerable to damage caused by various factors, including improper excavation or installation, as well as excessive stresses within the pipe's operational environment [275]. Such damage can result in leaks and, in more severe cases, gas pipe explosions, which pose substantial risks. Consequently, the detection and identification of material degradation within the pipe walls is of significant importance. Various nondestructive evaluation-based methods have been developed and verified for inspecting plastic pipes. These methods encompass ultrasonic testing as highlighted in [276, 277], microwave testing methods as discussed in [278, 279], infrared thermography-based approaches detailed in [280, 281], and camera-based visual inspection techniques presented in [282, 283].

Optical inspection represents one of the earliest NDE methods, initially involving visual inspection with the naked eye to identify potential defects in the examined object [284]. Over time, with the evolution of digital photography and advancements in camera manufacturing, the preference shifted toward digital cameras equipped with automated detection algorithms. Optical inspection techniques exhibit several advantages. They are less influenced by the type of material being inspected, in contrast to conventional NDE methods. Additionally, they do not necessitate the use of a coupling medium and can be scaled down through careful engineering to achieve a compact form factor, making them suitable for insertion into confined spaces. The traditional visual inspection method, which employs various types of cameras, has a long history and remains popular. However, it relies heavily on the operator's expertise and lacks the capability to quantitatively measure the depth of damage in plastic pipes. To address these limitations, the authors have developed a structured light (SL) sensor for inline inspection of gas pipes [285]. SL technology is gaining widespread acceptance due to its numerous advantages, which include robustness, high precision, and the ability to shrink sensors to very compact sizes [286]. The structured light pattern endows the inspected surface with the features necessary for triangulation, enabling SL systems to inspect surfaces even when unique surface characteristics are absent. This capability is especially valuable, as it facilitates the inspection of smooth and featureless surfaces, such as the walls of plastic gas pipelines.

The original prototype of the SL sensor is suitable for performing the inspection only when the sensor moves linearly with no change in orientation. However, in a real-world industrial application, such stringent requirements are often not feasible because the sensor is typically fitted on a moving platform (e.g., a robot), which cannot maintain such a strict pose while traversing the length of the pipeline. The proposed registration algorithm addresses this design gap and allows the SL sensing system to dynamically correct for changes in pose, resulting in a stabilized and accurate 3D reconstruction of wall profiles. The designed SL sensor is attached to a scanning platform that moves along a pipeline during the internal 3D inspection. Each frame from the sensor produces data for a sparse reconstruction of the pipe surface with a density that depends on the number of projected rings. In an ideal situation, the sensor's axis is aligned with the main axis of the pipe and always points in the direction of platform movement, which is defined as the z-axis. Therefore, the reconstructed 3D frames can be stacked sequentially by only adding a displacement in the z-direction that depends on the scanner speed at the time of acquisition. Experimentally, this assumption is not practical because it is difficult to keep the sensor moving exactly at the center of the pipe along the forward direction.
Also, the platform's moving speed is hard to keep constant due to multiple uncertainties such as imperfections in mounting the sensor on the robot, vibration from the movement of the robotic platform, and slippage of the robot wheels. Therefore, a holistic registration algorithm is required to estimate both the orientation of the sensor and its real-time position inside the pipe in order to realize an accurate 3D reconstruction.

Simultaneous localization and mapping (SLAM) algorithms, a popular global positioning approach, allow the incremental creation of maps using data from sensors while estimating real-time positions [287, 288]. While various methods have been applied to reduce mapping errors in SLAM, camera-based mapping with inertial navigation systems (INS) has often suffered from accuracy issues and drift [289]. For accurate global positioning in pipeline detection, the cylindrical nature of pipes is utilized as the basis for SL sensor-based localization [290]. Also, encoder data from the robot can provide accurate estimates of how far the robot has traveled inside the pipeline [291]. After the location in the pipeline is refined, the performance of the 3D reconstruction also depends on the local positioning of the sensor. In this work, information from wheel odometry and the IMU is incorporated to estimate the speed and orientation of the sensor in real time, which enables more reliable local positioning. These data are then fed to a registration algorithm to provide an initial guess about the sensor orientation and position inside the pipe; a RANSAC-assisted [292] cylindrical fitting-based registration approach then provides high-efficacy 3D point cloud registration to stabilize the sensor. Furthermore, an intensity-based threshold search method is employed to determine the reconstructed defect size. Finally, the uncertainties associated with structured light sensing are examined to quantify both the total reconstruction uncertainty and the estimated measurement uncertainty, demonstrating the measurement precision. The effectiveness of the proposed algorithms is validated through experimental results in pipeline inspection.

5.4.2 Design of Structured Light Sensing System

To demonstrate the capability of wheel odometry in enhancing the performance of the proposed algorithm, the robot-integrated sensing system, with its three main components (SL sensor, IMU, and the employed robot), is illustrated in Fig. 5.8. The robotic system can be deployed in 4-inch to 6-inch PVC pipes for real-time data collection.

Figure 5.8 a) The robotic system with integrated SL sensor; b) Camera and SL sensor system; c) Schematic of the endoscopic SL sensor.

A structured light sensor consists of a projection module that projects a highly textured pattern and a camera that captures the deformations in the projected pattern [293]. A detailed description of the SL sensor design and fabrication is given in our previous work [285]. It consists of a camera module, a projector module, and a connected transparent glass tube enabling the projection of the colored rings onto the pipe walls. The projector module consists of a high-intensity light-emitting diode (LED), a collimation lens, a transparency slide, and a projection lens, as shown in Fig. 5.8(c). A complementary metal-oxide-semiconductor (CMOS) camera is used to monitor the pipe surface and capture deformations in the projected rings.
The 3D imaging reconstruction of the scanned object surface, as described in [293], is the process of detecting, localizing, and matching the projected edges. In this process, the acquired image is converted to the polar domain to perform edge detection based on the predefined color coding of the slide pattern. After the necessary cleaning and filtering, the extracted edges of each acquired image are reconstructed into a cylindrical shape in the 3D domain, which provides a basis for point cloud registration between data frames.

To find the geometric transformation, the registration algorithm depends on both the inertial measurements and the matching of common features in the fixed and moving frames. The main framework of the proposed stabilization algorithm is summarized in Figure 5.9 with two main interconnected tasks: local positioning and global positioning. In this scheme, a synchronized acquisition framework provides real-time 3D data assisted by the IMU and wheel odometry data. Global positioning provides a coarse pose (position and orientation) of the sensor inside the pipe by using wheel odometry and inertial measurements. Local positioning is then used to further improve the global position calculations, especially when surface features exist. The data from the global positioning are fed to the registration algorithm to provide an initial guess about the sensor pose inside the pipe, and the 3D information is then used for more precise tuning. If defect features are found, the global position is updated and the data are registered; otherwise, the initial global position is used together with the constraints from the cylindrical 3D environment.

Figure 5.9 Proposed registration approach for sensor stabilization with data acquisition procedure.

In this work, sensor characteristics are integrated into the 3D registration problem to improve the robustness of the fitting performance. The environment inside the pipe is described in Fig. 5.10, where the structured light sensor is enclosed by a cylinder with radius R_Cyl and an arbitrary orientation axis described by the unit vector A_Cyl. In this environment, the camera is located at the origin (C = (0, 0, 0)) of the coordinate system and points along the z-axis. The projected ring is imaged by the camera to create a set of image points (D_C) that can be represented by the camera rays (w). The camera rays intersect both the cone projected from the projector module and the surface of the bounding cylinder. Therefore, the intersection points belong to both the cylinder and the cone surfaces. With known cylinder parameters, the intersection between a camera ray and the cylindrical surface can be calculated by substituting the ray equation into the cylinder equation. Therefore, the cylinder orientation can be calculated by minimizing the difference between D_C and D_arb, which can be described by

(\phi_x, \phi_y, T_x, T_y) = \arg\min \lVert D_{arb} - D_C \rVert_2^2 .   (5.9)

Figure 5.10 Triangulation of structured light sensor inside a pipe environment.

One of the error sources that affect the accuracy of the cylindrical fitting is the existence of artifacts on the pipe walls, since the fitting process assumes an ideal cylindrical surface. A defect biases the fitting problem and results in an inaccurate estimation of the cylinder parameters; the problem is therefore more prominent for deep defects in the pipe wall [285].
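A minimal sketch of this pose estimation is given below. It assumes the frame has already been reconstructed into 3D points and that the known pipe radius constrains the fit; the residual used here is a geometric point-to-cylinder distance rather than the image-domain difference of Eq. (5.9), and the radius, angles, and offsets are illustrative values.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

R_CYL = 152.4 / 2.0   # assumed known radius of a 6-inch pipe, in mm

def axis_from_pose(params):
    """Cylinder axis direction and a point on it from (phi_x, phi_y, T_x, T_y)."""
    phi_x, phi_y, t_x, t_y = params
    axis = Rotation.from_euler("xy", [phi_x, phi_y]).apply([0.0, 0.0, 1.0])
    return axis, np.array([t_x, t_y, 0.0])

def residuals(params, points):
    """Radial distance of each 3D point to the candidate axis, minus the known radius."""
    axis, p0 = axis_from_pose(params)
    d = points - p0
    radial = d - np.outer(d @ axis, axis)       # component perpendicular to the axis
    return np.linalg.norm(radial, axis=1) - R_CYL

# Synthetic ring of points on a slightly tilted, off-centre cylinder (illustrative only).
theta = np.linspace(0, 2 * np.pi, 200)
ring = np.c_[R_CYL * np.cos(theta), R_CYL * np.sin(theta), np.full_like(theta, 50.0)]
true_pose = np.array([0.03, -0.02, 1.5, -0.8])   # small rotations (rad) and offsets (mm)
frame = Rotation.from_euler("xy", true_pose[:2]).apply(ring) + [true_pose[2], true_pose[3], 0.0]

fit = least_squares(residuals, x0=np.zeros(4), args=(frame,))
print("estimated (phi_x, phi_y, T_x, T_y):", np.round(fit.x, 4))
```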
To reduce the effect of wall defects, the defects are treated as outliers that need to be identified and removed from the fitting problem. For this purpose, random sample consensus (RANSAC) is applied. RANSAC is an iterative method that estimates the model parameters in the presence of outliers by separating them from inliers through repeated random sub-sampling [292]. Therefore, all the defects are separated because they do not fit the cylindrical model assumed during the optimization process. The simulated fitting process is presented in Figure 5.11, where the input frame has a cylinder diameter of 6 inches (152.4 mm) and a wall defect with a depth of 10.16 mm. Figure 5.11(a) shows the defect region isolated by RANSAC, where the actual rotation is around the z-axis. The algorithm successfully isolates the defect region from the rest of the cylindrical surface. After isolating the defect data, the cylindrical surface data are fitted and the rigid transformation parameters can be calculated.

Figure 5.11 Alignment correction with cylindrical fitting, a) Moving frame with isolated defect by RANSAC in blue; b) Point clouds after alignment correction, Red: Moving frame, Black: Fixed frame.

The IMU cannot provide an absolute 3D position of the sensor, but it can provide linear acceleration, angular velocity, and orientation information; the acceleration readings are therefore integrated twice to estimate the instantaneous position of the sensor r_Acc(t). The IMU combines readings from the magnetometer and gyroscope to estimate the orientation of the IMU in 3D space, which is used to estimate the rotation angle of the sensor inside the pipe. Through our testing, we found this type of data to be more reliable than the accelerometer data, but it is still prone to deviation due to error accumulation. Therefore, it is only used as an initial point for the registration algorithm. It is worth noting that we are mainly interested in checking whether the sensor has rotated around the main pipe axis (z-axis). Once the calibration parameters are known, the IMU data can be used to monitor the sensor orientation, which is further combined with the ellipsoid orientation and the gyroscope orientation for orientation correction.

Another set of sensors utilized for localization is the wheel encoders, which provide an additional input to estimate the speed and position of the platform. The sensor position is estimated from the number of wheel rotations at each frame and the wheel diameter. The robot uses three pairs of wheels, with each wheel connected to a dedicated encoder. The robot is built from 3D-printed material and is equipped with the necessary electronics for operation. It is powered by two sets of 14.4 V LiPo batteries attached to the robot. These batteries power the motors, structured light projectors, and other electronics, enabling the robot to operate untethered from an external power source. In this work, the median of the three encoders' inter-frame distances is used as the reference distance estimate for the entire robot. Using the median reduces the effect of wheel slippage, which artificially increases the measured distance, and of motor stalling, which decreases it.
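As a concrete illustration of the RANSAC-based defect isolation described at the beginning of this subsection, the sketch below separates defect points from the cylindrical wall using a single frame's XY projection. The simplified circle model, tolerance, and synthetic frame are illustrative assumptions, not the parameters of the actual system.

```python
import numpy as np

def ransac_circle(xy, n_iters=500, tol=1.0, seed=1):
    """RANSAC fit of a circle to the XY projection of one frame; points whose radial
    residual exceeds `tol` (e.g. material-loss defects) are returned as outliers."""
    rng = np.random.default_rng(seed)
    best_inliers, best_model = None, None
    for _ in range(n_iters):
        p1, p2, p3 = xy[rng.choice(len(xy), 3, replace=False)]
        A = 2 * np.array([p2 - p1, p3 - p1])          # circle through three sampled points
        b = np.array([p2 @ p2 - p1 @ p1, p3 @ p3 - p1 @ p1])
        if abs(np.linalg.det(A)) < 1e-9:
            continue
        center = np.linalg.solve(A, b)
        radius = np.linalg.norm(p1 - center)
        resid = np.abs(np.linalg.norm(xy - center, axis=1) - radius)
        inliers = resid < tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (center, radius)
    return best_model, best_inliers

# Synthetic frame: a 6-inch pipe cross-section with a 10.16 mm-deep material-loss patch.
theta = np.linspace(0, 2 * np.pi, 400)
r = np.full_like(theta, 76.2)
r[(theta > 1.0) & (theta < 1.4)] += 10.16             # defect = locally larger radius
xy = np.c_[r * np.cos(theta), r * np.sin(theta)]

(center, radius), inliers = ransac_circle(xy)
print("fitted radius:", round(radius, 2), "mm; isolated defect points:", int((~inliers).sum()))
```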
5.4.3 Experimental Performance Evaluation

To demonstrate the IMU-assisted robotic sensing system, experiments were performed in a 6-inch PVC pipe with two defects. The first scanned segment runs from point A to point B, and the second from point B to point C, as shown in Fig. 5.12. Both defects have the same dimensions of 70 mm length, 35 mm width, and 6 mm depth. The sensor is attached to a gantry to traverse the pipe. The inspection process starts at point A; upon reaching point B, the sensor is rotated, and the inspection continues to point C. This inspection scenario simulates sensor rotation during the inspection together with off-center sensor misalignment. Fig. 5.13 shows example structured light image frames illustrating the sensor rotation between points A to B and B to C.

Figure 5.12 Schematic of the test pipe.

Figure 5.13 Example image frame illustration: a) Point A to Point B; b) Point B to Point C.

5.4.3.1 Comparison of 3D Reconstruction Methods

To evaluate the performance of the proposed feature-based registration algorithm, we compare it to two previously developed methods. The first is an ellipsoid fitting-based point cloud registration algorithm developed previously by the authors [285]. In this method, an ellipsoid is used to fit the cylindrical surface in order to handle pipes with oval cross-sections and errors from the sensor calibration; the orientation of the sensor and its position inside the pipe are then estimated for each acquired frame, followed by alignment correction. To register multiple frames, the corrected data are stacked by adding a constant displacement in the z-direction for each acquired frame, which assumes a fixed scanning speed.

The second method for comparison is the Iterative Closest Point (ICP) algorithm, a well-established registration technique often used to align 2D or 3D surfaces obtained from different scans and to localize robots for path planning [294]. The ICP algorithm is widely employed in various applications, including the development of 3D models and the construction of 3D world maps for SLAM systems. Its primary function is to determine the transformation between a point cloud and a reference point cloud by minimizing the squared error between corresponding data points [295–297]. In this approach, the initial frame serves as the reference for establishing the initial transformation estimate, often involving the fitting of a plane. Subsequent frames use point-to-plane distance minimization to align each source point cloud with the combined estimated rotation and translation. The aligned frames are then stacked in the z-direction to reconstruct the pipeline structure. The ICP algorithm's versatility and its capability to align data from various sources make it a valuable tool in pipeline inspection and other fields.
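For reference, a point-to-plane ICP baseline of this kind can be set up with an off-the-shelf library. The sketch below assumes Open3D is available and uses synthetic cylindrical patches in place of real frames, so it illustrates the baseline rather than reproducing the exact configuration used in the reported comparison.

```python
import numpy as np
import open3d as o3d

def icp_point_to_plane(frame_fixed, frame_moving, max_dist=5.0):
    """Align a moving frame to a fixed frame with point-to-plane ICP."""
    fixed, moving = o3d.geometry.PointCloud(), o3d.geometry.PointCloud()
    fixed.points = o3d.utility.Vector3dVector(frame_fixed)
    moving.points = o3d.utility.Vector3dVector(frame_moving)
    # Point-to-plane ICP needs normals on the reference (fixed) cloud.
    fixed.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=10.0, max_nn=30))
    result = o3d.pipelines.registration.registration_icp(
        moving, fixed, max_dist, np.eye(4),
        o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return result.transformation, result.fitness

# Synthetic cylindrical patches standing in for two consecutive frames (6-inch pipe).
theta, z = np.meshgrid(np.linspace(0, 2 * np.pi, 120), np.linspace(0, 20, 15))
frame_fixed = np.c_[76.2 * np.cos(theta).ravel(), 76.2 * np.sin(theta).ravel(), z.ravel()]
# In-plane offset only; translation along the pipe axis is handled by the z-stacking step.
frame_moving = frame_fixed + [0.8, -0.5, 0.0]
T, fitness = icp_point_to_plane(frame_fixed, frame_moving)
print("estimated transform:\n", np.round(T, 3), "\nfitness:", fitness)
```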
The comparison results are presented in Fig. 5.14. From the top view, it can be seen that the proposed registration algorithm retrieves a better pipe shape with a clearer and smoother boundary compared to the other two methods. Also, the marked defect area in the cylindrical-based method is more distinct and solid, which is beneficial for defect isolation. In addition, the 3D profile reconstructed by the ICP-based method shows a large misalignment after the sensor rotation, while the ellipsoid-based and the proposed cylindrical fitting-based methods fully reconstruct the 3D profile of the inspected pipe section.

Figure 5.14 Reconstruction performance comparison among Ellipsoid-based (left), ICP-based (middle), and proposed Cylindrical (right) registration algorithms: a) Top view: reconstructed defect areas are marked by the red dotted circle; b) Performance evaluation parameters from one single frame.

The main criterion for the registration algorithm is the ability to reconstruct the complete pipeline structure with low noise. Specifically, each data frame should be aligned vertically to build a straight and clear pipe surface. Therefore, we fit a plane to each 3D point cloud frame and extract the normal vector n_i to obtain the directional information of each frame i. Since the pipe is a standard cylinder, projecting a 3D data frame onto the XY plane should theoretically yield a circle. The projected points are therefore fitted to a circle to obtain the estimated center location O_ci and radius R_ai. The center reflects the location of each frame, which is closely related to the registration principle of each method. For a well-registered model, the differences in estimated centers and in directional vectors should be small to ensure alignment between frames. Also, in the horizontal direction, the estimated circle should approximate the actual pipe size, so the estimated radius is a good criterion for evaluating registration performance; it is interpreted as Closeness and defined as

Closeness = \frac{|\bar{R}_a - R_{GT}|}{R_{GT}} \times 100\%,   (5.10)

where \bar{R}_a is the average of all estimated R_ai for each method. Therefore, we extract the above shape-based parameters from the reconstructed pipe to quantitatively evaluate the reconstructions of the three registration techniques, as illustrated in Fig. 5.14(b). In this comparison, the total variances of the normal vectors and center locations, together with the radius closeness, are obtained for each registration technique, as shown in Table 5.3.

Table 5.3 Reconstruction Quality Evaluation for Registration Techniques

              Normal Variance (mm)   Center Variance (mm)   Radius Closeness
Ellipsoid     0.001                  1.507                  1.71%
ICP           0.002                  9.112                  1.55%
Cylindrical   0.001                  0.025                  0.95%

The results indicate that all three methods align the frames consistently, with minimal variation in the estimated orientation. Notably, the cylindrical-based algorithm demonstrates much greater reliability in aligning the frames vertically, as evidenced by the low variance of the center. Regarding the estimated pipe diameter, all methods determined it with only minor differences, the largest error being 1.71%; the proposed method exhibited the highest accuracy in this regard. In summary, the proposed method has demonstrated its robustness and dependability as an alignment correction technique.
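The shape-based quantities used in this evaluation (the per-frame plane normal, the fitted circle center and radius, and the Closeness of Eq. (5.10)) can be computed with a short sketch such as the one below. The SVD-based plane normal and the algebraic circle fit are reasonable stand-ins for the fitting steps described above, and the frame data are synthetic placeholders.

```python
import numpy as np

def frame_metrics(points):
    """Per-frame quantities: plane normal (least-variance direction via SVD) and a
    least-squares (Kasa) circle fit of the XY projection -> (normal, center, radius)."""
    centered = points - points.mean(axis=0)
    normal = np.linalg.svd(centered, full_matrices=False)[2][-1]
    x, y = points[:, 0], points[:, 1]
    A = np.c_[2 * x, 2 * y, np.ones_like(x)]
    cx, cy, c = np.linalg.lstsq(A, x ** 2 + y ** 2, rcond=None)[0]
    return normal, np.array([cx, cy]), np.sqrt(c + cx ** 2 + cy ** 2)

def closeness(radii, r_gt):
    """Eq. (5.10): relative deviation of the mean estimated radius from the true radius (%)."""
    return abs(np.mean(radii) - r_gt) / r_gt * 100.0

# Placeholder frames: noisy rings standing in for registered point-cloud slices of a 6-inch pipe.
rng = np.random.default_rng(2)
theta = np.linspace(0, 2 * np.pi, 200)
frames = [np.c_[76.2 * np.cos(theta) + rng.normal(0, 0.1, 200),
                76.2 * np.sin(theta) + rng.normal(0, 0.1, 200),
                np.full_like(theta, 5.0 * i)] for i in range(10)]

normals, centers, radii = zip(*(frame_metrics(f) for f in frames))
print("total center variance:", float(np.var(np.vstack(centers), axis=0).sum()))
print("radius closeness: %.2f%%" % closeness(radii, 76.2))
```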
While ICP registration offers theoretically high accuracy, its performance and efficiency are constrained by the need for a precise initial value, minimal transformation between the two point clouds, and limited occlusion. This constraint is particularly pronounced in scenarios involving significant misalignment between point clouds, especially during rotations, which result in substantial misalignment and shifts of the frame baseline. Additionally, traditional ICP registration relies solely on geometry, color, or meshes and may struggle to reconstruct the defect area faithfully, particularly when the initial point cloud lacks comprehensive defect information. In contrast, the proposed algorithm treats the defect as a distinctive feature, enabling the reconstruction of both the pipe and the defect with higher reliability and resistance to baseline shifts. Considering the quality of the reconstructed pipe shapes and defects, the proposed cylindrical-based 3D registration algorithm outperforms the current state-of-the-art methods.

All of the registration methods listed fail to correct the rotation of the robotic platform around the pipe's main axis. Therefore, the pose is corrected using the inertial measurements from the IMU. IMU data are acquired in real time together with the camera data and then used to correct the data alignment according to the procedure described in the previous section. To illustrate the sensitivity and efficacy of the IMU, experiments were performed with a rotation of 8 degrees between point B and point C. The 3D profile reconstructed after incorporating the IMU data into the proposed cylindrical-based registration is shown in Fig. 5.15.

Figure 5.15 Reconstruction performance without IMU (left) and with IMU (right). Rotation angle: 8 degrees: a) Top view; b) Side view.

From the top and side views, we notice that after integrating the IMU information, the position of the second defect is corrected to a vertical orientation similar to that of the first defect in both cases. Specifically, in Fig. 5.15(b), with IMU assistance the estimated angle between the two defects about the z-axis is corrected from 8 degrees to 2.3 degrees. Even with such a relatively small rotation angle (8 degrees), the IMU is sensitive enough to capture the change in rotation and turns out to be very precise about the orientation. Experimental results show that the proposed registration algorithm with the IMU data incorporated is sufficient to reconstruct the defect adequately, which provides a good basis for applying cylindrical-based 3D registration to facilitate more reliable data reconstruction.

Figure 5.16 Intensity-based Threshold Searching Procedure: a) Cylindrical defect map; b) Intensity histogram; c) Binarized candidate examples with the segmented defect.

5.4.3.2 Reconstructed Defect Size Evaluation

To better illustrate the effectiveness and robustness of the integrated robotic platform, we estimate the reconstructed defect size for comparison with the ground truth defect. The size estimation is cast as a segmentation problem, starting with flattening the reconstructed 3D point cloud to the cylindrical domain. The subsequent defect estimation procedure is realized through a proposed Intensity-based Threshold Searching algorithm, which is presented in Fig. 5.16. In detail, the intensity histogram is first used to obtain N intensity clusters and to extract the mean of each cluster, M_i: M_1, M_2, ..., M_N. Next, based on each M_i, a binarized cylindrical defect map B_i is generated for the subsequent defect segmentation, giving the estimated length L_i and width W_i. Then, an error estimate is used to select the best candidate by evaluating the distance between the estimated size and the true defect size L_GT and W_GT, described as follows:

Err_i = 0.5 \sqrt{ \left( \frac{L_i - L_{GT}}{L_{GT}} \right)^2 + \left( \frac{W_i - W_{GT}}{W_{GT}} \right)^2 } .   (5.11)

By selecting the minimal Err, the optimal threshold is chosen, yielding the optimal estimated defect length L* and width W*. Considering that the reconstruction results are affected by various uncertainties, the calculated Err is a good estimate of the overall uncertainty in this 3D reconstruction-based sensing system.
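A minimal sketch of this threshold-search idea is shown below. It assumes the flattened cylindrical map is available as a regular depth image with known pixel spacing, uses histogram bin centres as stand-ins for the cluster means M_i, and uses SciPy connected-component labelling as a stand-in for the defect segmentation step; all of these are simplifying assumptions rather than the exact implementation.

```python
import numpy as np
from scipy import ndimage

def threshold_search(depth_map, dz, darc, L_gt, W_gt, n_candidates=8):
    """Binarize the flattened cylindrical map at several candidate levels, size the
    largest connected region, and keep the candidate minimizing Err of Eq. (5.11)."""
    _, edges = np.histogram(depth_map, bins=n_candidates)
    best = None
    for m in 0.5 * (edges[:-1] + edges[1:]):          # bin centres as thresholds M_i
        labels, n = ndimage.label(depth_map > m)
        if n == 0:
            continue
        largest = labels == (np.argmax(np.bincount(labels.ravel())[1:]) + 1)
        rows, cols = np.where(largest)
        L = (np.ptp(rows) + 1) * dz                   # axial extent -> length
        W = (np.ptp(cols) + 1) * darc                 # circumferential extent -> width
        err = 0.5 * np.sqrt(((L - L_gt) / L_gt) ** 2 + ((W - W_gt) / W_gt) ** 2)
        if best is None or err < best[0]:
            best = (err, m, L, W)
    return best

# Placeholder flattened map: 1 mm x 1 mm pixels with a raised region mimicking the defect.
depth = np.zeros((120, 110))
depth[40:110, 30:65] = 6.0                            # ~70 mm x 35 mm defect footprint
err, thr, L_est, W_est = threshold_search(depth, 1.0, 1.0, L_gt=70.0, W_gt=35.0)
print(f"threshold={thr:.2f}, L={L_est:.1f} mm, W={W_est:.1f} mm, Err={err:.3f}")
```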
The determination of defect depth is of significant importance, as it plays a crucial role in evaluating the impact of a defect on the structural integrity of a component or material. In this application, gauging the defect depth starts with the extraction of a 3D point cloud that represents the defect area, as depicted in Figure 5.17. Initially, this 3D defect map is projected onto the Y-Z plane. Subsequently, the defect information is isolated to obtain the average background level B. This background information is used to fill in the defect region, resulting in a comprehensive 3D plot that portrays the defect in terms of its reconstructed depth. Given that the actual defect surface possesses some inherent roughness, the variation in defect depth can be observed across the entire map, which provides a solid foundation for understanding the inspected defect. Since the industry primarily focuses on determining the maximum wall loss, only the largest estimated depth measurement is considered in this scenario.

Figure 5.17 Defect depth estimation procedure.

To investigate the accuracy of the reconstructed measurements, the estimated length L_i, width W_i, and depth D_i are compared to the ground truth defect size L_GT, W_GT, and D_GT. This evaluation uses an error estimation equation that considers the differences in length, width, and depth simultaneously. The equation for evaluating the overall size estimation performance is obtained by extending Eq. 5.11 to

Error_i = \sqrt{ \frac{ \left( \frac{L_i - L_{GT}}{L_{GT}} \right)^2 + \left( \frac{W_i - W_{GT}}{W_{GT}} \right)^2 + \left( \frac{D_i - D_{GT}}{D_{GT}} \right)^2 }{3} } .   (5.12)

5.4.4 Uncertainty Analysis on SL Sensing System

5.4.4.1 Uncertainty Source

In the context of pipeline inspection and field testing, it is imperative to conduct a thorough investigation into the impact of various uncertainties. Uncertainty quantification (UQ) plays a pivotal role in quantitatively characterizing the quality and accuracy of non-destructive evaluation (NDE) and, consequently, the reliability of intricate systems. In this sensing system, any uncertainties or errors in data collection, feature extraction, or subsequent analysis can significantly influence the quality and reliability of the final 3D data reconstruction. Hence, uncertainty quantification serves as a means to impartially evaluate performance, offering a comprehensive analysis of the connection between uncertainty and the ultimate output. This, in turn, leads to the development of a highly reliable sensing system for pipeline inspection. The errors and uncertainties that impact the accuracy of this structured light 3D measurement system primarily stem from three sources: the instruments, the processing methods, and the environmental conditions.

1. From Instrument Design: A reliable and rigid mechanical design of the sensing system is important. The shadow effect of the sensor is one major uncertainty source, which deteriorates the reconstruction accuracy when dealing with abrupt height changes in the pipe surface. As mentioned in [285], this problem is caused by the current single-camera and single-projector setup, which restricts the view angle.
The low intensity of the light source and low resolution cause poor imaging quality of the slide pattern and thus may affect the measurement accuracy. Also, accurate distance and directional information from the IMU and encoders are key factors in determining the alignment between frames, and accumulated error will degrade the localization performance. For the odometry input, possible error accumulation arises from either the measurement of the wheel diameter or the speed estimation (see the sketch after this list):

a) Wheel diameter: a non-circular wheel shape due to the use of omnidirectional wheels.

b) Wheel slippage: wheel slippage while passing over obstacles or during turning causes perturbations in the velocity measurements. The relation between the acquired wheel slip velocity V_s and the measured velocity (rotational speed) V_r can be described through a linear function with an unbiased Gaussian uncertainty U [298], which can be written as

V_s = f(V_r) + U,   (5.13)

where the uncertainty U ~ N(0, \sigma^2) is unbiased Gaussian noise with zero mean and variance \sigma^2, and f(·) is a first-order polynomial obtained by a least-squares fit. Therefore, if slip is considered in future work, the wheel slip distance T_ods can be obtained from the velocity and the duration \Delta t: T_ods = V_s \cdot \Delta t.

2. From Method: The calibration of the SL sensor's main components, such as the camera, projector, and IMU, is an essential part of obtaining the relevant parameters for further data analysis. Errors introduced by imprecise parameter estimation of the sensing system deteriorate the system's overall performance. Therefore, a reliable and efficient calibration method for each component is essential. For the subsequent defect reconstruction and measurement, uncertainties from the processing models, such as calculation errors of the registration algorithm and the threshold searching algorithm, also contribute to the uncertainty in the measurement result.

3. From Environment: The environment in which tests and calibrations are performed can influence the uncertainty of the measurement results. Vibration caused by an uneven pipe surface introduces random errors into the measurements. Also, inadequate lighting conditions have a crucial impact on the imaging quality, making the slide patterns difficult to distinguish. The measurement uncertainties reduce the precision and accuracy of the reconstructed defect shape, which is further investigated in the next section.
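A minimal sketch of the slip model in Eq. (5.13) is given below; the calibration pairs, fitted coefficients, and noise level are placeholder values used only to illustrate how f(·) and U could be estimated from encoder calibration data.

```python
import numpy as np

rng = np.random.default_rng(3)
# Placeholder calibration pairs: measured rotational speed vs. true platform velocity (m/s).
v_rot = np.linspace(0.05, 0.30, 12)
v_true = 0.92 * v_rot + 0.004 + rng.normal(0, 0.003, v_rot.size)

# f(.) in Eq. (5.13): first-order polynomial obtained by least squares.
coeffs = np.polyfit(v_rot, v_true, deg=1)
sigma = np.std(v_true - np.polyval(coeffs, v_rot), ddof=2)   # residual spread -> U ~ N(0, sigma^2)

def slip_distance(v_r, dt):
    """T_ods = V_s * dt, with V_s = f(V_r) + U drawn from the fitted slip model."""
    v_s = np.polyval(coeffs, v_r) + rng.normal(0, sigma)
    return v_s * dt

print("slip distance over 2 s at V_r = 0.2 m/s:", round(slip_distance(0.2, 2.0), 4), "m")
```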
5.4.4.2 Measurement Uncertainty Evaluation

Similar to the previous MBN-based application, a GUM-based uncertainty analysis of repeatability is carried out in this work to better evaluate the reconstruction performance. Specifically, six repeated scans with the robotic platform were conducted. The estimated defect size and reconstructed Err of the six tests are presented in Table 5.4. The results show that there is variation in each reconstructed defect size; however, based on the Err estimated from Eq. 5.12, the differences between the estimated defect sizes and the ground truth defect are relatively small, all within 13%. Therefore, the proposed registration and subsequent defect estimation algorithm of this IMU-assisted SL sensing system is shown to recover the real defect information with high accuracy.

Table 5.4 Estimated Defect Size and Error in Repetition Tests

(mm)     Test 1   Test 2   Test 3   Test 4   Test 5   Test 6
Length   66.1     67.2     76.6     72.3     71.3     67.7
Width    32.6     34.2     35.2     35.7     31.3     33.1
Depth    5.7      4.8      5.6      8.6      4.7      9.4
Error    8.2%     5.1%     8.7%     12.9%    9.0%     12.4%

For estimating the measurement uncertainty, the standard deviation S for each repetition count m was determined by taking the mean of the corresponding six-choose-m subset standard deviations. The standard uncertainties of the estimated defect length, width, and depth are computed separately and shown in Fig. 5.18. The results reveal a correlation between the measurement uncertainties and the number of repetitions, indicating that the uncertainties decrease as the number of repetitions increases. This underscores the significance of conducting repeated measurements to reduce uncertainties effectively.

Figure 5.18 Comparison results of expanded uncertainty with GUM.

Moreover, following the GUM, an expanded uncertainty is employed to ensure a 95% confidence level, with the coverage factor k set to 1.96, which is considered the best estimate of the correction for the measurement error. The measurement uncertainty is expressed as

U_{extend} = k \cdot U_{standard} = k \cdot \frac{S}{\sqrt{N}} = k \sqrt{ \frac{1}{N(n-1)} \sum_{i=1}^{n} (y_i - \bar{y})^2 },   (5.14)

where the mean value of the N repeated measurements is taken as the optimal representation, y_i is the i-th measurement value, and \bar{y} is the average of the repeated measurements [259]. According to Equation 5.14, the measurement uncertainty interval can be determined from the estimated length, width, and depth of all six measurements. Specifically, the calculated confidence interval of the length is 70.5 ± 1.55 mm, that of the width is 33.7 ± 1.05 mm, and that of the depth is 6.1 ± 1.22 mm. Overall, the relative uncertainty of the length estimation is much smaller than that of the width; since the frame distance is one of the main factors determining the length, this demonstrates the accuracy and robustness of the odometer and IMU sensors for position correction.

While the number of repetitions is limited, the initial error estimates and the analysis of measurement uncertainty lay a solid groundwork for illustrating the potential of uncertainty estimates under the GUM. Additionally, the integrated robotic platform's ability to facilitate multiple Simultaneous Localization and Mapping (SLAM)-based data collections opens the possibility of constructing a more robust and comprehensive uncertainty estimation model through additional experiments.
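Eq. (5.14) can be applied directly to the repeated size estimates; the sketch below uses the length values from Table 5.4 as input. Small differences from the reported interval may arise from rounding or from details of the original computation.

```python
import numpy as np

def expanded_uncertainty(measurements, k=1.96):
    """Eq. (5.14): expanded uncertainty of the mean of N repeated measurements,
    with coverage factor k for an approximately 95% confidence level."""
    y = np.asarray(measurements, dtype=float)
    n = len(y)
    u_standard = np.sqrt(np.sum((y - y.mean()) ** 2) / (n * (n - 1)))   # = S / sqrt(N)
    return y.mean(), k * u_standard

# Six repeated length estimates from Table 5.4.
lengths = [66.1, 67.2, 76.6, 72.3, 71.3, 67.7]
mean, U = expanded_uncertainty(lengths)
print(f"length = {mean:.1f} ± {U:.2f} mm (95% confidence)")
```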
5.5 Conclusion

In this chapter, we begin by introducing the concept of Probability of Detection (POD), a method aimed specifically at assessing the effectiveness and reliability of defect detection under conditions of uncertainty within NDE. Additionally, we delve into measurement uncertainty analysis, which is the most popular means of evaluating and quantifying the overall uncertainty inherent in NDE measurements. The information derived from both of these methods plays a critical role in shaping the interpretation of NDE results. Furthermore, the uncertainties evaluated by the POD and measurement uncertainty analysis methods are integral components in ensuring the reliability and robustness of NDE inspections.

In addition, two practical NDE applications are introduced to demonstrate the applicability of ANOVA-based measurement uncertainty. In the first application, Magnetic Barkhausen Noise (MBN) is utilized, and relevant features are extracted to establish correlations with the microstructure information of martensitic-grade stainless-steel samples. A Probabilistic Neural Network (PNN) is subsequently employed to classify samples based on their remaining fatigue life percentage, effectively distinguishing between unfatigued and cracked samples.

The second application presents a comprehensive and practical Inertial Measurement Unit (IMU)-assisted robotic Structured Light (SL) sensing system with improved registration and defect estimation capabilities for pipeline detection. The results demonstrate that the proposed framework is capable of providing a robust and reliable 3D defect reconstruction solution, with valuable assistance from the IMU and robot odometry.

Both of these sensing systems, whether the specially designed MBN sensor or the robot-enabled SL sensor, are capable of delivering consistent and reliable repeated measurements. This, in turn, forms a solid foundation for a more accurate estimation of measurement uncertainties.

CHAPTER 6
CONCLUSION AND FUTURE WORK

6.1 Conclusion

The main contributions of this thesis are summarized as follows:

1. A comprehensive TLS-based UQ framework has been proposed, which signifies an important step forward in the domain of Non-Destructive Evaluation (NDE) inspections. This framework expands the horizons of conventional NDE practices, offering a more holistic and comprehensive approach to system maintenance. With a clear focus on practicality, the proposed framework serves as a guide, facilitating the understanding and management of uncertainty propagation through a selection of advanced and widely adopted techniques tailored for NDE inspections. Within the framework, we have identified key criteria to guide the selection of appropriate methods for propagating uncertainty in NDE scenarios. These criteria are designed to ensure that uncertainty is comprehensively addressed, managed, and minimized, thereby enhancing the overall reliability and effectiveness of NDE inspections. The proposed uncertainty analysis scheme is a good basis for NDE system design optimization and quantification.

2. The comprehensive identification of uncertainty sources within each NDE application not only plays a pivotal role in ensuring the integrity of the inspection process but also serves as a foundational step for conducting more in-depth reliability and sensitivity analyses in this specialized domain. By diligently identifying and understanding these sources of uncertainty, NDE practitioners can enhance the accuracy and effectiveness of their inspections, ultimately contributing to safer and more dependable systems across various industries.

3. A capacitive sensing-based inspection is applied to illustrate the forward uncertainty propagation process from "Data" to "Modeling". The propagation process demonstrates how the uncertainty originating from the liftoff is systematically transferred through the various stages of the "Modeling" process. The output distributions resulting from this propagation are derived from the implementation of the three proposed meta-models. Further, to comprehensively evaluate the influence of liftoff uncertainty on the final output, a widely used uncertainty propagation method, MCS, is employed to generate a probability distribution for the total impedance, denoted as Z. This approach enables a thorough assessment of the potential variations in the total impedance due to the uncertainty in the measurements.

4. A Magnetic Flux Leakage-based defect characterization algorithm is used to address uncertainties during the inverse NDE process.
It places particular emphasis on the uncertainties stemming from sensing liftoff, which can influence the output signal of the sensing system. Given the intricate nature of the forward uncertainty propagation process, this research conducts a comparative analysis of two commonly used learning-based approximate Bayesian inference methods, Convolutional Neural Network (CNN) and Deep Ensemble (DE), to handle the input uncertainty derived from MFL response data. Moreover, the study employs an Autoencoder method to address the challenge of limited experimental data by augmenting the dataset. This approach involves pre-training on MFL simulation data and subsequently applying the model to real-world data. In the context of defect size classification within experimental MFL-based applications, the study not only assesses prediction accuracy but also delves into uncertainty analysis. These efforts are crucial for evaluating the reliability of predictions. The proposed methodology for uncertainty quantification offers valuable insights into the assessment of reliability in decision-making and inverse problems related to MFL-based NDE.

5. While Probability of Detection (POD) and measurement uncertainty analysis serve distinct purposes in assessing uncertainty in NDE, the information derived from both methodologies plays a significant role in shaping the interpretation of NDE results within the proposed TLS UQ framework. The combined evaluation of uncertainty through these two methods is a critical component that enhances the overall reliability and robustness of NDE inspections. To assess the effectiveness of the introduced GUM-based measurement uncertainty analysis, two distinct sensing systems are employed, one based on structured light technology and the other on Magnetic Barkhausen Noise, which serve as practical test cases for measurement uncertainty analysis. In the MBN application, PCA extracts advanced features from both the time and frequency domains. A PNN is utilized for sample classification based on the percentage of remaining fatigue life. The specimen's fatigue life can be successfully indicated, with the potential for early fatigue onset prediction. In the second application, a comprehensive and practical IMU-assisted robotic SL sensing system with enhanced registration and defect estimation solutions is proposed for pipeline detection. The proposed framework utilizes a RANSAC-assisted cylindrical fitting registration algorithm to enhance the alignment of the SL system, ensuring accurate 3D profiling of the pipeline. Additionally, the system integrates inertial and odometry measurements to facilitate global and local positioning, further enhancing the precision and reliability of the 3D profiling process. Assisted by customized defect sizing techniques, the robot-enabled SL sensing system is able to provide a robust and reliable 3D defect reconstruction solution for varying defect shapes and depths. For both applications, the GUM-based uncertainty method illustrates the reliability of the corresponding measurement process. Furthermore, the uncertainty sources in both sensing systems are described in detail, providing a guide for future uncertainty analysis.
6.2 Future Work

This section provides insights into potential directions and ideas for the continuation of this research:

• TLS UQ framework: It would be beneficial to incorporate additional UP methods within the proposed framework into NDE applications, which can provide researchers with a deeper understanding of the versatility and applicability of the framework in various scenarios. In addition, providing practical examples that demonstrate the potential and efficiency of this framework could further clarify the benefits and feasibility of utilizing these methods, ultimately strengthening the framework's value in the NDE field.

• Learning-based MFL defect characterization: Exploring additional types of data augmentation techniques and incorporating more learning-based uncertainty propagation methods into the applied dataset can enhance the research by offering a broader perspective on the optimization of uncertainty propagation in NDE applications. Diversifying data augmentation methods allows for the evaluation of a wider range of approaches, potentially uncovering the most effective techniques for specific scenarios. Furthermore, integrating learning-based uncertainty propagation into the dataset can provide valuable insights into the nature of uncertainties and how they can be managed and minimized through advanced computational approaches. These efforts can lead to more robust, efficient, and data-driven solutions in the field of NDE.

• Velocity-induced MIEC effect on MFL signal: Leveraging simulations provides a valuable means to gain deeper insights into the MIEC signal. By focusing on selected informative features, it becomes possible to explore the impact of velocity on signal characteristics and, concurrently, on defect sizing. Considering velocity as an uncertainty factor holds promise for incorporating it into a TLS-based uncertainty analysis, thus contributing to a more comprehensive understanding of the entire inspection process.
International Conference on Structural Dynamics, EURODYN 2014, pages 2721–2727. European Association for Structural Dynamics, 2014. [94] Zi Li, Xuhui Huang, Obaid Elshafiey, Subrata Mukherjee, and Yiming Deng. Fem of magnetic flux leakage signal for uncertainty estimation in crack depth classification using bayesian convolutional neural network and deep ensemble. In 2021 International Applied 116 Computational Electromagnetics Society Symposium (ACES), pages 1–4. IEEE, 2021. [95] Ming Hong, Zhu Mao, Michael D Todd, and Zhongqing Su. Uncertainty quantification for acoustic nonlinearity parameter in lamb wave-based prediction of barely visible impact damage in composites. Mechanical Systems and Signal Processing, 82:448–460, 2017. [96] John C Aldrin, Jeremy S Knopp, Mark P Blodgett, and Harold A Sabbagh. Uncertainty propagation in eddy current nde inverse problems. In AIP Conference Proceedings, volume 1335, pages 631–638. American Institute of Physics, 2011. [97] Hyung-Seop Shim. Performance evaluation of nde methods. KSCE Journal of Civil Engi- neering, 7(2):185–192, 2003. [98] Samira Mohammadi and Selen Cremaschi. Efficiency of uncertainty propagation methods for moment estimation of uncertain model outputs. Computers & Chemical Engineering, page 107954, 2022. [99] Sang Hoon Lee and Wei Chen. A comparative study of uncertainty propagation methods for black-box-type problems. Structural and multidisciplinary optimization, 37:239–253, 2009. [100] Cosmin Safta, Richard L-Y Chen, Habib N Najm, Ali Pinar, and Jean-Paul Watson. Efficient uncertainty quantification in stochastic economic dispatch. IEEE Transactions on Power Systems, 32(4):2535–2546, 2016. [101] XY Jia, C Jiang, CM Fu, BY Ni, CS Wang, and MH Ping. Uncertainty propagation analysis by an extended sparse grid technique. Frontiers of Mechanical Engineering, 14:33–46, 2019. [102] Martin Hunt, Benjamin Haley, Michael McLennan, Marisol Koslowski, Jayathi Murthy, and Alejandro Strachan. Puq: A code for non-intrusive uncertainty propagation in computer simulations. Computer Physics Communications, 194:97–107, 2015. [103] Mohammad Mahdi Rajabi. Review and comparison of two meta-model-based uncertainty propagation analysis methods in groundwater applications: polynomial chaos expansion and gaussian process emulation. Stochastic environmental research and risk assessment, 33: 607–631, 2019. [104] Chiara Tardioli, Martin Kubicek, Massimiliano Vasile, Edmondo Minisci, and Annalisa Riccardi. Comparison of non-intrusive approaches to uncertainty propagation in orbital mechanics. 2015. [105] D Ye, L Veen, A Nikishova, J Lakhlili, W Edeling, OO Luk, VV Krzhizhanovskaya, and AG Hoekstra. Uncertainty quantification patterns for multiscale models. Philosophical Transactions of the Royal Society A, 379(2197):20200072, 2021. [106] Jeff Duffy, Songquan Liu, Herbert Moskowitz, Robert Plante, and Paul V Preckel. Assessing 117 multivariate process/product yield via discrete point approximation. IIE transactions, 30: 535–543, 1998. [107] Paul Glasserman. Monte Carlo methods in financial engineering, volume 53. Springer, 2004. [108] Alex A Gorodetsky, Gianluca Geraci, Michael S Eldred, and John D Jakeman. A generalized approximate control variate framework for multifidelity uncertainty quantification. Journal of Computational Physics, 408:109257, 2020. [109] Ajay Jasra, Kody JH Law, and Yan Zhou. Forward and inverse uncertainty quantification International using multilevel monte carlo algorithms for an elliptic nonlocal equation. 
Journal for Uncertainty Quantification, 6(6), 2016. [110] John Hammersley. Monte carlo methods. Springer Science & Business Media, 2013. [111] Jiaxin Zhang. Modern monte carlo methods for efficient uncertainty quantification and propagation: A survey. Wiley Interdisciplinary Reviews: Computational Statistics, 13(5): e1539, 2021. [112] Eric Biedermann, Leanne Jauriqui, John C Aldrin, Alexander Mayes, Tom Williams, and Siamack Mazdiyasni. Uncertainty quantification in modeling and measuring components with resonant ultrasound spectroscopy. In AIP Conference Proceedings, volume 1706. AIP Publishing, 2016. [113] Matthias Franz Rath, Bernhard Schweighofer, and Hannes Wegleiter. Uncertainty analysis of an optoelectronic strain measurement system for flywheel rotors. Sensors, 21(24):8393, 2021. [114] Olivier Le Maître and Omar M Knio. Spectral methods for uncertainty quantification: with applications to computational fluid dynamics. Springer Science & Business Media, 2010. [115] Dongbin Xiu. Numerical methods for stochastic computations: a spectral method approach. Princeton university press, 2010. [116] Jun Xu and Fan Kong. A cubature collocation based sparse polynomial chaos expansion for efficient structural reliability analysis. Structural Safety, 74:24–31, 2018. [117] Jeongeun Son and Yuncheng Du. An efficient polynomial chaos expansion method for uncertainty quantification in dynamic systems. Applied Mechanics, 2(3):460–481, 2021. [118] Jethro Nagawkar and Leifur Leifsson. Applications of polynomial chaos-based cokriging to simulation-based analysis and design under uncertainty. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, volume 84010, page V11BT11A046. American Society of Mechanical Engineers, 2020. 118 [119] TT Zygiridis, AE Kyrgiazoglou, and TP Theodoulidis. Polynomial-chaos uncertainty mod- eling in eddy-current inspection of cracks. [120] Firooz Bakhtiari-Nejad, Naserodin Sepehry, and Mahnaz Shamshirsaz. Polynomial chaos expansion sensitivity analysis for electromechanical impedance of plate. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, volume 50206, page V008T10A037. American Society of Mechanical Engi- neers, 2016. [121] Arnold Bingler and Sándor Bilicz. Sensitivity analysis using a sparse grid surrogate model in electromagnetic nde. In Electromagnetic non-destructive evaluation (XXI), pages 152–159. IOS Press, 2018. [122] Nahid Sanzida and Zoltan K Nagy. Polynomial chaos expansion (pce) based surrogate modeling and optimization for batch crystallization processes. In Computer Aided Chemical Engineering, volume 33, pages 565–570. Elsevier, 2014. [123] K Konakli, C Mylonas, S Marelli, and B Sudret. Uqlab user manual—canonical low-rank approximations. Report UQLab-V1, pages 1–108, 2019. [124] Georges Matheron. Principles of geostatistics. Economic geology, 58(8):1246–1266, 1963. [125] Frank L Hitchcock. The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics, 6(1-4):164–189, 1927. [126] Subrata Mukherjee, Xuhui Huang, Lalita Udpa, and Yiming Deng. A kriging-based magnetic flux leakage method for fast defect detection in massive pipelines. Journal of Nondestructive Evaluation, Diagnostics and Prognostics of Engineering Systems, 5(1):011002, 2022. [127] Leifur Leifsson, Xiaosong Du, and Slawomir Koziel. 
Multifidelity modeling of ultrasonic In 2018 IEEE MTT-S International Conference on testing simulations with cokriging. Numerical Electromagnetic and Multiphysics Modeling and Optimization (NEMO), pages 1–4. IEEE, 2018. [128] Michael Eldred. Recent advances in non-intrusive polynomial chaos and stochastic collo- cation methods for uncertainty analysis and design. In 50th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference 17th AIAA/ASME/AHS Adaptive Structures Conference 11th AIAA No, page 2274, 2009. [129] George Klir and Bo Yuan. Fuzzy sets and fuzzy logic, volume 4. Prentice hall New Jersey, 1995. [130] Ranjan Ganguli. A fuzzy logic system for ground based structural health monitoring of a helicopter rotor using modal data. Journal of Intelligent Material Systems and Structures, 12(6):397–407, 2001. 119 [131] Giovanni Cascante, Homayoun Najjaran, and Paola Ronca. Relative ndt evaluation of the side walls of a brick channel. In Advances in Engineering Structures, Mechanics & Construction, pages 485–492. Springer, 2006. [132] Vahid Jahangiri, Hadi Mirab, Reza Fathi, and Mir Mohammad Ettefagh. Tlp structural health monitoring based on vibration signal of energy harvesting system. Latin American Journal of Solids and Structures, 13:897–915, 2016. [133] Bertrand Iooss and Loïc Le Gratiet. Uncertainty and sensitivity analysis of functional risk curves based on gaussian processes. Reliability Engineering & System Safety, 187:58–66, 2019. [134] Valérie Kaftandjian, Yue Min Zhu, Olivier Dupuis, and Daniel Babot. The combined use of the evidence theory and fuzzy logic for improving multimodal nondestructive testing systems. IEEE Transactions on Instrumentation and Measurement, 54(5):1968–1977, 2005. [135] Ram M Narayanan and Robin James. International Journal, 7(1), 2018. International journal of microwaves applications. [136] Zhen Li and Zhaozong Meng. A review of the radio frequency non-destructive testing for carbon-fibre composites. Measurement Science Review, 16(2):1, 2016. [137] Xiaodong Shi, Vivek T Rathod, Saptarshi Mukherjee, Lalita Udpa, and Yiming Deng. Multi- modality strain estimation using a rapid near-field microwave imaging system for dielectric materials. Measurement, 151:107243, 2020. [138] Paul Probst. A Miniaturized Multi-Modality Imaging System for Dielectric Materials Eval- uation. Michigan State University, 2021. [139] Deng Huang, Theodore T Allen, William I Notz, and Ning Zeng. Global optimization of stochastic black-box systems via sequential kriging meta-models. Journal of global optimization, 34(3):441–466, 2006. [140] Roland Schöbi and Bruno Sudret. Pc-kriging: a new metamodelling method combining In Proc. 2nd Int. Symp. Uncertain. Quantif. polynomial chaos expansions and kriging. Stoch. Model., Rouen, France, 2014. [141] Thomas J Santner, Brian J Williams, William I Notz, and Brain J Williams. The design and analysis of computer experiments, volume 1. Springer, 2003. [142] R Schöbi, S Marelli, and B Sudret. Uqlab user manual–pc-kriging. Report UQLab-V1, pages 1–109, 2017. [143] Nikolaus Hansen and Andreas Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary computation, 9(2):159–195, 2001. 120 [144] Minas D Spiridonakos and Eleni N Chatzi. Metamodeling of dynamic nonlinear structural systems through polynomial chaos narx models. Computers & Structures, 157:99–113, 2015. [145] Pierric Kersaudy, Bruno Sudret, Nadège Varsier, Odile Picon, and Joe Wiart. 
A new surrogate modeling technique combining kriging and polynomial chaos expansions–application to uncertainty analysis in computational dosimetry. Journal of Computational Physics, 286: 103–117, 2015. [146] Thierry Crestaux, Olivier Le Maıtre, and Jean-Marc Martinez. Polynomial chaos expansion for sensitivity analysis. Reliability Engineering & System Safety, 94(7):1161–1172, 2009. [147] Katerina Konakli and Bruno Sudret. Polynomial meta-models with canonical low-rank approximations: Numerical insights and comparison to sparse polynomial chaos expansions. Journal of Computational Physics, 321:1144–1169, 2016. [148] F Bartolucci and L Scrucca. Point estimation methods with applications to item response theory models. 2010. [149] Ralph C Smith. Uncertainty quantification: theory, implementation, and applications, volume 12. Siam, 2013. [150] David Blackwell and Lester Dubins. Merging of opinions with increasing information. The Annals of Mathematical Statistics, 33(3):882–886, 1962. [151] Jadie Adams and Shireen Y Elhabian. Benchmarking scalable epistemic uncertainty quan- tification in organ segmentation. arXiv preprint arXiv:2308.07506, 2023. [152] Shiro Kubo. Inverse analyses and their applications to nondestructive evaluations. Proc. 12th A-PCNDT, 2006. [153] W Lord. Nondestructive evaluation inverse problems. In Elsevier Studies in Applied Elec- tromagnetics in Materials, volume 6, pages 101–104. Elsevier, 1995. [154] F Honarvar and A Varvani-Farahani. A review of ultrasonic testing applications in additive manufacturing: Defect evaluation, material characterization, and process control. Ultrason- ics, 108:106227, 2020. [155] Richard A Ketcham and William D Carlson. Acquisition, optimization and interpretation of x-ray computed tomographic imagery: applications to the geosciences. Computers & Geosciences, 27(4):381–400, 2001. [156] BA Auld and JC Moulder. Review of advances in quantitative eddy current nondestructive evaluation. Journal of Nondestructive evaluation, 18:3–36, 1999. 121 [157] Karol Grondzak. Inverse problems of eddy current testing and uncertainties evaluation. Advances in Electrical and Electronic Engineering, 5(1):245–248, 2011. [158] Xu Wu, Ziyu Xie, Farah Alsafadi, and Tomasz Kozlowski. A comprehensive survey of inverse uncertainty quantification of physical model parameters in nuclear system thermal– hydraulics codes. Nuclear Engineering and Design, 384:111460, 2021. [159] Xiao Wang et al. Frequentist and bayesian approaches for probabilistic fatigue life assessment of high-speed train using in-service monitoring data. 2018. [160] Eyke Hüllermeier and Willem Waegeman. Aleatoric and epistemic uncertainty in machine learning: An introduction to concepts and methods. Machine Learning, 110:457–506, 2021. [161] Leo Breiman. Bagging predictors. Machine learning, 24:123–140, 1996. [162] John A Rice. Mathematical statistics and data analysis. Cengage Learning, 2006. [163] Anthony Kulesa, Martin Krzywinski, Paul Blainey, and Naomi Altman. Sampling distribu- tions and the bootstrap. 2015. [164] Robert E Kass, Uri T Eden, Emery N Brown, Robert E Kass, Uri T Eden, and Emery N Brown. Propagation of uncertainty and the bootstrap. Analysis of Neural Data, pages 221–246, 2014. [165] Felix Kim, Adam Pintar, Jason Fox, Jared Tarr, Alkan Donmez, et al. Probability of detection of x-ray computed tomography of additive manufacturing defects. Review of Progress in Quantitative Nondestructive Evaluation, 2019. [166] Radford M Neal. Bayesian learning for neural networks, volume 118. 
Springer Science & Business Media, 2012. [167] Saptarshi Mukherjee, Hillary Fairbanks, Jordan Lum, David Stobbe, Gabe Guss, Andrew Townsend, Seemeen Karimi, and Joseph Tringe. A bayesian inference technique for ul- trasound uncertainty quantification in metal additive manufacturing. Available at SSRN 4250943. [168] William C Schneck III, Heather Reed, Elizabeth D Gregory, and Cara AC Leckey. Sequential monte carlo based parameter estimation for structural health monitoring with an intel xeon phi optimized ultrasound kernel. In AIP Conference Proceedings, volume 2102, page 020035. AIP Publishing LLC, 2019. [169] Johnathan M Bardsley. Mcmc-based image reconstruction with uncertainty quantification. SIAM Journal on Scientific Computing, 34(3):A1316–A1332, 2012. [170] Alex Graves. Practical variational inference for neural networks. Advances in neural infor- 122 mation processing systems, 24, 2011. [171] David M Blei, Alp Kucukelbir, and Jon D McAuliffe. Variational inference: A review for statisticians. Journal of the American statistical Association, 112(518):859–877, 2017. [172] Konstantin Posch, Jan Steinbrener, and Jürgen Pilz. Variational inference to measure model uncertainty in deep neural networks. arXiv preprint arXiv:1902.10189, 2019. [173] Herbert E Robbins. An empirical bayes approach to statistics. In Breakthroughs in Statistics: Foundations and Basic Theory, pages 388–394. Springer, 1992. [174] Christos Louizos and Max Welling. Multiplicative normalizing flows for variational bayesian In International Conference on Machine Learning, pages 2218–2227. neural networks. PMLR, 2017. [175] Yarin Gal and Zoubin Ghahramani. Bayesian convolutional neural networks with bernoulli approximate variational inference. arXiv preprint arXiv:1506.02158, 2015. [176] Durk P Kingma, Tim Salimans, and Max Welling. Variational dropout and the local repa- rameterization trick. Advances in neural information processing systems, 28, 2015. [177] Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In international conference on machine learning, pages 1050–1059. PMLR, 2016. [178] Yongchan Kwon, Joong-Ho Won, Beom Joon Kim, and Myunghee Cho Paik. Uncertainty quantification using bayesian neural networks in classification: Application to biomedical image segmentation. Computational Statistics & Data Analysis, 142:106816, 2020. [179] Seyed Omid Sajedi and Xiao Liang. Uncertainty-assisted deep vision structural health monitoring. Computer-Aided Civil and Infrastructure Engineering, 36(2):126–142, 2021. [180] Zi Li. Deep Learning Techniques for Magnetic Flux Leakage Inspection with Uncertainty Quantification. Michigan State University, 2019. [181] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. Advances in neural information processing systems, 30, 2017. [182] Malte Nalenz. Characterizing model uncertainty in ensemble learning. PhD thesis, lmu, 2022. [183] Cheng Li. A gentle introduction to gradient boosting. URL: http://www. ccs. neu. edu/home- /vip/teach/MLcourse/4_ boosting/slides/gradient_boosting. pdf, page 30, 2016. 123 [184] Miguel Á Carreira-Perpiñán and Arman Zharmagambetov. Ensembles of bagged tao trees consistently improve over random forests, adaboost and gradient boosting. In Proceedings of the 2020 ACM-IMS on foundations of data science conference, pages 35–46, 2020. [185] Merrill B Rudd, James T Thorson, and Skyler R Sagarese. 
Ensemble models for data-poor assessment: accounting for uncertainty in life-history information. ICES Journal of Marine Science, 76(4):870–883, 2019. [186] Jerome H Friedman. Stochastic gradient boosting. Computational statistics & data analysis, 38(4):367–378, 2002. [187] Tianqi Chen and Carlos Guestrin. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pages 785–794, 2016. [188] Ossi Kajasalo. Deep ensemble-based speed of sound field estimation with uncertainty quantification in ultrasonic tomography. Master’s thesis, Itä-Suomen yliopisto, 2023. [189] Homin Song and Yongchao Yang. Uncertainty quantification in super-resolution guided wave array imaging using a variational bayesian deep learning approach. NDT & E International, 133:102753, 2023. [190] David Draper. Assessment and propagation of model uncertainty. Journal of the Royal Statistical Society Series B: Statistical Methodology, 57(1):45–70, 1995. [191] Max Hinne, Quentin F Gronau, Don van den Bergh, and Eric-Jan Wagenmakers. A con- ceptual introduction to bayesian model averaging. Advances in Methods and Practices in Psychological Science, 3(2):200–215, 2020. [192] Theo S Eicher, Chris Papageorgiou, and Adrian E Raftery. Default priors and predictive performance in bayesian model averaging, with application to growth determinants. Journal of Applied Econometrics, 26(1):30–55, 2011. [193] Jie Chen and Yongming Liu. Multimodality data fusion for probabilistic strength estimation of aging materials using bayesian networks. In AIAA Scitech 2020 Forum, page 1653, 2020. [194] Ruqi Zhang, Chunyuan Li, Jianyi Zhang, Changyou Chen, and Andrew Gordon Wil- arXiv preprint son. Cyclical stochastic gradient mcmc for bayesian deep learning. arXiv:1902.03932, 2019. [195] Yuki Mae, Wataru Kumagai, and Takafumi Kanamori. Uncertainty propagation for dropout- based bayesian neural networks. Neural Networks, 144:394–406, 2021. [196] Chio Lam and Wenxing Zhou. Statistical analyses of incidents on onshore gas transmission pipelines based on phmsa database. International Journal of Pressure Vessels and Piping, 124 145:29–40, 2016. [197] Yao Yao, Shue-Ting Ellen Tung, and Branko Glisic. Crack detection and characterization techniques—an overview. Structural Control and Health Monitoring, 21(12):1387–1413, 2014. [198] Shiuh-Chuan Her and Sheng-Tung Lin. Non-destructive evaluation of depth of surface cracks using ultrasonic frequency analysis. Sensors, 14(9):17146–17158, 2014. [199] K Kimoto, S Ueno, and S Hirose. Image-based sizing of surface-breaking cracks by sh-wave array ultrasonic testing. Ultrasonics, 45(1-4):152–164, 2006. [200] Sony Baby, T Balasubramanian, R J Pardikar, M Palaniappan, and R Subbaratnam. Time-of- flight diffraction (TOFD) technique for accurate sizing of surface-breaking cracks. Insight - Non-Destr. Test. Cond. Monit., 45(6):426–430, June 2003. [201] Weiying Cheng and Kenzo Miya. Reconstruction of parallel cracks by ECT. Int. J. Appl. Electromagn. Mech., 14(1-4):495–502, December 2002. [202] Yiming Deng and Xin Liu. Electromagnetic imaging methods for nondestructive evaluation applications. Sensors, 11(12):11774–11808, 2011. [203] Maryam Ravan, Reza K Amineh, Slawomir Koziel, Natalia K Nikolova, and James P Reilly. Sizing of multiple cracks using magnetic flux leakage measurements. IET science, measurement & technology, 4(1):1–11, 2010. [204] Shamim Ahmed, Roberto Miorelli, Pierre Calmon, Nicola Anselmi, and Marco Salucci. 
Real time flaw detection and characterization in tube through partial least squares and svr: In AIP conference proceedings, volume 1949. AIP Application to eddy current testing. Publishing, 2018. [205] Kharudin Bin Ali, Ahmed N Abdalla, Damhuji Rifai, and Moneer A Faraj. Review on system development in eddy current testing and technique for defect classification and characterization. IET Circuits, Devices & Systems, 11(4):338–351, 2017. [206] PA Ivanov, V Zhang, CH Yeoh, H Udpa, Y Sun, SS Udpa, and W Lord. Magnetic flux leakage modeling for mechanical damage in transmission pipelines. IEEE Transactions on magnetics, 34(5):3020–3023, 1998. [207] Ameet Joshi, Lalita Udpa, Satish Udpa, and Antonello Tamburrino. Adaptive wavelets for characterizing magnetic flux leakage signals from pipeline inspection. IEEE transactions on magnetics, 42(10):3168–3170, 2006. [208] S Mukhopadhyay and GP Srivastava. Characterisation of metal loss defects from magnetic flux leakage signals with discrete wavelet transform. Ndt & E International, 33(1):57–65, 125 2000. [209] Gianni D’Angelo and Salvatore Rampone. Shape-based defect classification for non destruc- In 2015 IEEE Metrology for Aerospace (MetroAeroSpace), pages 406–410. tive testing. IEEE, 2015. [210] Wei Lu, Yuning Wei, Jinxia Yuan, Yiming Deng, and Aiguo Song. Tractor assistant driving control method based on eeg combined with rnn-tl deep learning algorithm. IEEE Access, 8:163269–163279, 2020. [211] Peipei Zhu, Yuhua Cheng, Portia Banerjee, Antonello Tamburrino, and Yiming Deng. A novel machine learning model for eddy current testing with uncertainty. ndt & e International, 101:104–112, 2019. [212] David JC MacKay. Bayesian neural networks and density networks. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 354(1):73–80, 1995. [213] Andrey Malinin. Uncertainty estimation in deep learning with application to spoken lan- guage assessment. PhD thesis, University of Cambridge, 2019. [214] T Azizzadeh and MS Safizadeh. Investigation of the lift-off effect on the corrosion detection sensitivity of three-axis mfl technique. Journal of Magnetics, 23(2):152–159, 2018. [215] J Bruce Nestleroth and Richard J Davis. The effects of magnetizer velocity on magnetic flux leakage signals. Review of progress in quantitative nondestructive evaluation: volumes 12A and 12B, pages 1891–1898, 1993. [216] Ali Mirzaee, Sina Zahedifard, Iman Ahadi Akhlaghi, and Saeed Kahrobaee. Application of magnetic flux leakage (mfl) method to non-destructively characterize the microstructure and corrosion behaviour of api x65 grade steel. Journal of Magnetism and Magnetic Materials, 566:170311, 2023. [217] S Jäggi, H Böhni, and B Elsener. Macrocell corrosion of steel in concrete-experiments and numerical modelling. European Federation of Corrosion, 2001. [218] Muhammad Faheem, Syed Bilal Hussain Shah, Rizwan Aslam Butt, Basit Raza, Muhammad Anwar, Muhammad Waqar Ashraf, Md A Ngadi, and Vehbi C Gungor. Smart grid commu- nication and information technologies in the perspective of industry 4.0: Opportunities and challenges. Computer Science Review, 30:1–30, 2018. [219] Yasi Wang, Hongxun Yao, and Sicheng Zhao. Auto-encoder based dimensionality reduction. Neurocomputing, 184:232–242, 2016. [220] Changro Lee. Data augmentation using a variational autoencoder for estimating property 126 prices. Property Management, 39(3):408–418, 2021. [221] Lovedeep Gondara. 
Medical image denoising using convolutional denoising autoencoders. In 2016 IEEE 16th international conference on data mining workshops (ICDMW), pages 241–246. IEEE, 2016. [222] Kasra Babaei, ZhiYuan Chen, and Tomas Maul. Data augmentation by autoencoders for unsupervised anomaly detection. arXiv preprint arXiv:1912.13384, 2019. [223] Herim Han and Sunghwan Choi. Transfer learning from simulation to experimental data: Nmr chemical shift predictions. The Journal of Physical Chemistry Letters, 12(14):3662– 3668, 2021. [224] Matthew Welborn, Lixue Cheng, and Thomas F Miller III. Transferability in machine learning for electronic structure via the molecular orbital basis. Journal of chemical theory and computation, 14(9):4772–4779, 2018. [225] Yifan Peng, Shankai Yan, and Zhiyong Lu. Transfer learning in biomedical natural language processing: an evaluation of bert and elmo on ten benchmarking datasets. arXiv preprint arXiv:1906.05474, 2019. [226] Keunwoo Choi, György Fazekas, Mark Sandler, and Kyunghyun Cho. Transfer learning for music classification and regression tasks. arXiv preprint arXiv:1703.09179, 2017. [227] Moloud Abdar, Farhad Pourpanah, Sadiq Hussain, Dana Rezazadegan, Li Liu, Moham- mad Ghavamzadeh, Paul Fieguth, Xiaochun Cao, Abbas Khosravi, U Rajendra Acharya, et al. A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Information Fusion, 76:243–297, 2021. [228] Weiyang Liu, Yandong Wen, Zhiding Yu, and Meng Yang. Large-margin softmax loss for convolutional neural networks. arXiv preprint arXiv:1612.02295, 2016. [229] Alexandru Niculescu-Mizil and Rich Caruana. Predicting good probabilities with supervised learning. In Proceedings of the 22nd international conference on Machine learning, pages 625–632, 2005. [230] Jeremy Nixon, Michael W Dusenberry, Linchuan Zhang, Ghassen Jerfel, and Dustin Tran. Measuring calibration in deep learning. In CVPR workshops, volume 2, 2019. [231] Chris Ding and Hanchuan Peng. Minimum redundancy feature selection from microarray gene expression data. Journal of bioinformatics and computational biology, 3(02):185–205, 2005. [232] Ronald A Fisher. The use of multiple measurements in taxonomic problems. Annals of eugenics, 7(2):179–188, 1936. 127 [233] John Brausch, Lawrence Butkus, David Campbell, Tommy Mullis, and Michael Paulk. Recommended processes and best practices for nondestructive inspection (ndi) of safety of flight structures’. Technical report, Report No. AFRL-RX-WP-TR-2008-4373, Air Force Research Laboratory, USA, 2008. [234] Alan P Berens. Evaluation of nde reliability characterization. Report No. AFWAL-TR-81- 4160, 1981. [235] Peter W Hovey and Alan P Berens. Statistical evaluation of nde reliability in the aerospace In Review of Progress in Quantitative Nondestructive Evaluation: Volume 7B, industry. pages 1761–1768. Springer, 1988. [236] F Fücsök, C Müller, and M Scharmach. Measuring of the reliability of nde. In 8th In- ternational Conference of the Slovenian Society for Non-Destructive Testing „Application of Contemporary Non-Destructive Testing in Engineering, Portorož, Slovenia, September, pages 1–3, 2005. [237] Eric Lindgren, David Forsyth, John Aldrin, and Floyd Spencer. Probability of detection. 2018. [238] Iikka Virkkunen, Tuomas Koskinen, Suvi Papula, Teemu Sarikka, and Hannu Hänninen. Comparison of â versus a and hit/miss pod-estimation methods: A european viewpoint. Journal of Nondestructive Evaluation, 38:1–13, 2019. [239] Arvind Keprate and RM Chandima Ratnayake. 
Probability of detection as a metric for quantifying nde capability: the state of the art. J. Pipeline Eng, 14(3):199–209, 2015. [240] GL DNV. Rp-0001: Probabilistic methods for planning of inspection for fatigue cracks in offshore structures. Det Norske Veritas, Høvik, Norway, 2015. [241] Fengyang Jiang, GUAN Zhidong, LI Zengshan, and WANG Xiaodong. A method of predicting visual detectability of low-velocity impact damage in composite structures based on logistic regression model. Chinese Journal of Aeronautics, 34(1):296–308, 2021. [242] CA Harding and GR Hugo. Review of literature on probability of detection for liquid penetrant nondestructive testing. 2011. [243] SK Burke and RJ Ditchburn. Review of literature on probability of detection for magnetic particle nondestructive testing. Department of Defence, Australia, 2013. [244] Jochen H Kurz, Anne Jüngert, Sandra Dugan, Gerd Dobmann, and Christian Boller. Relia- bility considerations of ndt by probability of detection (pod) determination using ultrasound phased array. Engineering failure analysis, 35:609–617, 2013. [245] Junzhen Zhu, Qingxu Min, Jianbo Wu, and Gui Yun Tian. Probability of detection for 128 eddy current pulsed thermography of angular defect quantification. IEEE Transactions on Industrial Informatics, 14(12):5658–5666, 2018. [246] Michael Wright. How to implement a pod into a highly effective inspection strategy. NDT Canada, pages 15–17, 2016. [247] Ryan M Meyer, Susan L Crawford, John P Lareau, and Michael T Anderson. Review of literature for model assisted probability of detection. 2014. [248] R Bruce Thompson, Lisa J Brasche, Eric Lindgren, Paul Swindell, and William P Winfree. In 4th European-American Recent advances in model-assisted probability of detection. workshop on reliability of NDE, number LF99-9094, 2009. [249] CA Harding, GR Hugo, and SJ Bowles. Application of model-assisted pod using a trans- In AIP conference proceedings, volume 1096, pages 1792–1799. fer function approach. American Institute of Physics, 2009. [250] F Jenson, N Dominguez, P Willaume, and T Yalamas. A bayesian approach for the determi- nation of pod curves from empirical data merged with simulation results. In AIP Conference Proceedings, volume 1511, pages 1741–1748. American Institute of Physics, 2013. [251] M Wall, FA Wedgwood, and S Burch. Modelling of ndt reliability (pod) and applying corrections for human factors. In Proceedings of the 7th European Conference on NDT, Copenhagen, Denmark, 1998. [252] Sarah Muscat, Stuart Parks, Ewan Kemp, and David Keating. Repeatability and repro- ducibility of macular thickness measurements with the humphrey oct system. Investigative ophthalmology & visual science, 43(2):490–495, 2002. [253] Barry N Taylor. Guidelines for Evaluating and Expressing the Uncertainty of NIST Mea- surement Results (rev. Diane Publishing, 2009. [254] Stephanie A Bell. A beginner’s guide to uncertainty of measurement. 2001. [255] United Kingdom Accreditation Service. The expression of uncertainty and confidence in measurement. United Kingdom Accreditation Service, 1997. [256] Morana Mihaljević, Damir Markučič, Biserka Runje, and Zdenka Keran. Measurement uncertainty evaluation of ultrasonic wall thickness measurement. Measurement, 137:179– 188, 2019. [257] Mohand Alzuhiri, Zi Li, Adithya Rao, Jiaoyang Li, Preston Fairchild, Xiaobo Tan, and Imu-assisted robotic structured light sensing with featureless registration Yiming Deng. under uncertainties for pipeline inspection. NDT & E International, page 102936, 2023. 
129 [258] Robert Tomkowski, Aki Sorsa, Suvi Santa-aho, Per Lundin, and Minnamari Vippola. Statis- tical evaluation of barkhausen noise testing (bnt) for ground samples. Sensors, 19(21):4716, 2019. [259] Shuang Li, Zhongyu Wang, Jingyu Guan, and Jihu Wang. Uncertainty evaluation in surface structured light measurement. In 2021 IEEE 15th International Conference on Electronic Measurement & Instruments (ICEMI), pages 395–400. IEEE, 2021. [260] Fernando J Alamos, Jiahui C Gu, and Hyunok Kim. Evaluating the reliability of a non- destructive evaluation (nde) tool to measure the incoming sheet mechanical properties. In Forming the Future: Proceedings of the 13th International Conference on the Technology of Plasticity, pages 2573–2584. Springer, 2021. [261] Matthew R Cherry, Harold S Sabbagh, Adam L Pilchak, Jeremy S Knopp, and DAYTON UNIV RESEARCH INST OH. Characterization of a random anisotropic conductivity field with karhunen-loeve methods (postprint). 2013. [262] Yung-Li Lee, Mark E Barkey, and Hong-Tae Kang. Metal fatigue analysis handbook: practical problem-solving techniques for computer-aided engineering. Elsevier, 2011. [263] Kwai S Chan. Roles of microstructure in fatigue crack initiation. International Journal of Fatigue, 32(9):1428–1447, 2010. [264] Moorthy Vaidhianathasamy, Brian Andrew Shaw, Will Bennett, and Peter Hopkins. Evalu- ation of contact fatigue damage on gears using the magnetic barkhausen noise technique. In Proc. 12th Int. Workshop Electromag. Nondestruct. Eval., pages 98–105, 2008. [265] Jose Alberto Perez-Benitez, J Capó-Sánchez, J Anglada-Rivera, and LR Padovese. A model for the influence of microstructural defects on magnetic barkhausen noise in plain steels. Journal of magnetism and magnetic materials, 288:433–442, 2005. [266] Sadegh Ghanei, M Kashefi, and Mohammad Mazinani. Comparative study of eddy current and barkhausen noise nondestructive testing methods in microstructural examination of ferrite–martensite dual-phase steel. Journal of magnetism and magnetic materials, 356: 103–110, 2014. [267] Krzysztof Miesowicz, Wieslaw J Staszewski, and Tomasz Korbiel. Analysis of barkhausen noise using wavelet-based fractal signal processing for fatigue crack detection. International Journal of Fatigue, 83:109–116, 2016. [268] O Stupakov, J Pal’a, Toshiyuki Takagi, and Tetsuya Uchimoto. Governing conditions of repeatable barkhausen noise response. Journal of Magnetism and Magnetic Materials, 321 (18):2956–2962, 2009. [269] Shuo Zhang, Xiaodong Shi, Lalita Udpa, and Yiming Deng. Micromagnetic measurement 130 for characterization of ferromagnetic materials’ microstructural properties. AIP Advances, 8(5):056614, 2018. [270] M Vashista and V Moorthy. On the shape of the magnetic barkhausen noise profile for better revelation of the effect of microstructures on the magnetisation process in ferritic steels. Journal of Magnetism and Magnetic Materials, 393:584–592, 2015. [271] Svante Wold, Kim Esbensen, and Paul Geladi. Principal component analysis. Chemometrics and intelligent laboratory systems, 2(1-3):37–52, 1987. [272] P Burrascano, E Cardelli, A Faba, S Fiori, and A Massinelli. Application of probabilistic neural networks to eddy current non destructive test problems. In EANN 2001 Conference, pages 16–18, 2001. [273] S Kanemoto. Acoustic monitoring using kernel pca and probabilistic neural network. In 7th International Conference on NDE, 2009. [274] Christian P Vetter, Laura A Kuebel, Divya Natarajan, and Ray A Mentzer. 
Review of failure trends in the us natural gas pipeline industry: An in-depth analysis of transmission and distribution system incidents. Journal of Loss Prevention in the Process Industries, 60: 317–333, 2019. [275] Julie Maupin, Michael Mamoun, et al. Plastic pipe failure, risk, and threat analysis. Technical report, Gas Technology Institute, 2009. [276] Juanjuan Zhu, Richard P Collins, Joby B Boxall, Robin S Mills, and Rob Dwyer-Joyce. Non-destructive in-situ condition assessment of plastic pipe using ultrasound. Procedia engineering, 119:148–157, 2015. [277] Qiang Wang, Haiting Zhou, Junwei Xie, and Xiaomeng Xu. Nonlinear ultrasonic evaluation of high-density polyethylene natural gas pipe thermal butt fusion joint aging behavior. International Journal of Pressure Vessels and Piping, 189:104272, 2021. [278] Tobias D Carrigan, Benjamin E Forrest, Hector N Andem, Kaiyu Gui, Lewis Johnson, James E Hibbert, Barry Lennox, and Robin Sloan. Nondestructive testing of nonmetallic pipelines using microwave reflectometry on an in-line inspection robot. IEEE Transactions on Instrumentation and Measurement, 68(2):586–594, 2018. [279] Andri Haryono, Mohamed A Abou-Khousa, et al. Microwave non-destructive evaluation of glass reinforced epoxy and high density polyethylene pipes. Journal of Nondestructive Evaluation, 39(1):1–9, 2020. [280] R Kafieh, T Lotfi, and Rassoul Amirfattahi. Automatic detection of defects on polyethylene pipe welding using thermal infrared imaging. Infrared Physics & Technology, 54(4):317– 325, 2011. 131 [281] Marjan Doaei and M Sadegh Tavallali. Intelligent screening of electrofusion-polyethylene joints based on a thermal ndt method. Infrared Physics & Technology, 90:1–7, 2018. [282] Cong Li, Hui-Qing Lan, Ya-Nan Sun, and Jun-Qiang Wang. Detection algorithm of defects on polyethylene gas pipe using image recognition. International Journal of Pressure Vessels and Piping, 191:104381, 2021. [283] Mansour Karkoub, Othmane Bouhali, and Ali Sheharyar. Gas pipeline inspection using autonomous robots with omni-directional cameras. IEEE Sensors Journal, 21(14):15544– 15553, 2020. [284] Portia Banerjee, Rajendra Prasath Palanisamy, Mahmood Haq, Lalita Udpa, and Yiming Deng. Data-driven prognosis of fatigue-induced delamination in composites using optical In 2019 IEEE International Conference on Prognostics and and acoustic nde methods. Health Management (ICPHM), pages 1–10. IEEE, 2019. [285] Mohand Alzuhiri, Khalid Farrag, Ernest Lever, and Yiming Deng. An electronically stabi- lized multi-color multi-ring structured light sensor for gas pipelines internal surface inspec- tion. IEEE Sensors Journal, 2021. [286] Christoph Schmalz, Frank Forster, Anton Schick, Elli Angelopoulou, Frank Forster, Anton Schick, and Elli Angelopoulou. An endoscopic 3D scanner based on structured light. Medical Image Analysis, 2012. [287] Tzu-Yi Chuang and Cheng-Che Sung. Learning and slam based decision support platform for sewer inspection. Remote Sensing, 12(6):968, 2020. [288] Shrijan Kumar. Development of slam algorithm for a pipe inspection serpentine robot. Master’s thesis, University of Twente, 2019. [289] Dennis Krys and Homayoun Najjaran. Development of visual simultaneous localization In 2007 International Symposium on and mapping (vslam) for a pipe inspection robot. Computational Intelligence in Robotics and Automation, pages 344–349. IEEE, 2007. [290] R Zhang, MH Evans, R Worley, SR Anderson, and L Mihaylova. Improving slam in pipe networks by leveraging cylindrical regularity. 
In Annual Conference Towards Autonomous Robotic Systems, pages 56–65. Springer, 2021. [291] Andreu Corominas Murtra and Josep M Mirats Tur. Imu and cable encoder data fusion for in-pipe mobile robot localization. In 2013 IEEE Conference on Technologies for Practical Robot Applications (TePRA), pages 1–6. IEEE, 2013. [292] Chi-Keung Tang. Tensor voting in computer vision, visualization, and higher dimensional inferences. University of Southern California, 2000. 132 [293] Jason Geng. Structured-light 3d surface imaging: a tutorial. Advances in Optics and Photonics, 3(2):128–160, 2011. [294] Paul J Besl and Neil D McKay. Method for registration of 3-d shapes. In Sensor fusion IV: control paradigms and data structures, volume 1611, pages 586–606. Spie, 1992. [295] Rachid Tiar, Mustapha Lakrouf, and Ouahiba Azouaoui. Fast icp-slam for a bi-steerable mobile robot in large environments. In 2015 IEEE International Workshop of Electronics, Control, Measurement, Signals and their Application to Mechatronics (ECMSM), pages 1–6. IEEE, 2015. [296] Wilian França Costa, Jackson P Matsuura, Fabiana Soares Santana, and Antonio Mauro Saraiva. Evaluation of an icp based algorithm for simultaneous localization and mapping using a 3d simulated p3dx robot. In 2010 Latin American Robotics Symposium and Intelligent Robotics Meeting, pages 103–108. IEEE, 2010. [297] Yue Wang, Rong Xiong, and Qianshan Li. Em-based point to plane icp for 3d simultaneous localization and mapping. Int J Rob Autom, 28:234–244, 2013. [298] Rakesh Kumar Sidharthan, Ramkumar Kannan, Seshadhri Srinivasan, and Valentina Emilia Balas. Stochastic wheel-slip compensation based robot localization and mapping. Advances in Electrical and Computer Engineering, 16(2):25–32, 2016. 133