AI-ENABLED KNOWLEDGE TRANSFER AND LEARNING FOR NONDESTRUCTIVE EVALUATION TOWARD INTELLIGENT AND ADAPTIVE SYSTEMS

By

Xuhui Huang

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Electrical Engineering—Doctor of Philosophy

2025

ABSTRACT

Critical infrastructure integrity and reliability, from composites to high-voltage feeder pipes and railway tracks, demand precise, robust, and explainable nondestructive evaluation (NDE) techniques. In this thesis, the growing need for accurate damage detection, localization, and characterization under conditions involving high-speed inspections, varying sensor positions, and electromagnetic interference is addressed through a suite of data-driven, physics-informed, and domain-adaptive methodologies. The proposed framework integrates deep learning, advanced signal processing, and multi-fidelity modeling to address key challenges such as limited and unbalanced experimental datasets, mismatched simulation-to-field conditions, and signal distortion from environmental noise. First, a novel hybrid deep learning architecture is introduced that combines Generative Adversarial Networks (GANs) with Inception-based neural networks for coordinate-based Acoustic Emission (AE) source localization. This method achieves significant reductions in localization estimation errors, enabling reliable single-sensor monitoring. Complementing this effort, explainable deep convolutional neural network models are proposed for AE signal classification. Guided by physics-informed signal segmentation and advanced visualization techniques such as Class Activation Mapping (CAM) and Gradient-weighted CAM (Grad-CAM), these models illuminate the underlying mechanisms of Lamb wave mode interactions, thereby instilling trust and interpretability into the machine learning pipeline. Domain adaptation and transfer learning are central to this work. Specifically, the gap between abundant simulated data (source domain) and limited experimental measurements (target domain) is bridged to ensure that feature representations learned from large-scale synthetic datasets can be effectively transferred and fine-tuned. By integrating physics-informed constraints and knowledge transfer, the resulting models exhibit higher accuracy, are less prone to overfitting, and maintain interpretability in varied scenarios. A multi-fidelity Gaussian Process Regression (GPR) strategy is further presented for motion-induced eddy current testing (MIECT) to manage both forward (signal prediction) and inverse (defect estimation) problems under in-service inspection, sensor motion, and environmental noise. These GPR-based surrogates fuse low-fidelity finite element simulations with high-fidelity experimental data, accurately predicting sensor responses at inspection speeds exceeding typical laboratory conditions and enabling robust inverse estimations of defect geometries. In parallel, a novel auto-compensation algorithm for Pulsed Eddy Current (PEC) inspections is developed to address electromagnetic interference in feeder lines carrying high currents, significantly improving thickness estimation accuracy and reducing false-positive indications in underground pipeline applications.

Copyright by XUHUI HUANG 2025

ACKNOWLEDGEMENTS

I am deeply grateful to my advisor, Professor Dr. Yiming Deng, for his exceptional mentorship throughout my doctoral studies. His expert guidance helped me navigate numerous challenges in NDE applications.
Beyond technical supervision, he fostered my academic development by offering opportunities to lead diverse projects and guiding me through problem formulation, model development, and research publication. His passion for research, commitment to teaching excellence, and professional approach have profoundly shaped my academic journey. As committee chair, his direction was extremely important in developing my research scope. I extend my sincere thanks to my committee members, Professors Dr. Ming Han, Dr. Lalita Udpa, Dr. Satish Udpa, and Dr. Tapabrata Maity, for their generous support and guidance. Professor Maity's constructive feedback shaped the GAN-based data augmentation work, complemented by Professor Satish Udpa's insights on pretrained networks, knowledge transfer, and motion-induced eddy current physics. Professor Han provided crucial guidance on test design and AE configurations, while Professor Lalita Udpa's expertise enhanced my understanding of NDE challenges and experimental protocols, refining both equipment requirements and thesis content. Moreover, in the Electrical and Computer Engineering department, I found an extraordinary community that has significantly shaped my academic development. To my friends and colleagues at the Nondestructive Evaluation Laboratory (NDEL), I am profoundly thankful for your friendship, for introducing me to compelling real-world research problems, and for contributing to my scientific growth. Finally, I am profoundly indebted to my family for their support and belief in my pursuits, which has made every goal seem achievable. Their constant encouragement and faith in my abilities have been the bedrock of my journey, inspiring me to push beyond perceived limitations.

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
1.1 Motivation & Objectives
1.2 Scope of Research

CHAPTER 2 DOMAIN ADAPTIVE FRAMEWORK FOR SIMULATION TO EXPERIMENT KNOWLEDGE TRANSFER
2.1 Domain-Adaptive Framework
2.2 Experimental Setup
2.3 Simulation Setup
2.4 Imbalanced and Limited Training Data
2.5 Transfer Learning and Pre-Training on Simulation Data
2.6 Domain Adaptation
2.7 GAN-Based Data Augmentation
2.8 Results and Discussion

CHAPTER 3 MULTI-FIDELITY SURROGATE MODELING FOR EFFICIENT SIMULATION EXPERIMENT INTEGRATION
3.1 Introduction
3.2 Motivation and Objectives
3.3 Forward and Inverse Problem Formulations
3.4 Multi-Fidelity Surrogate Framework
3.5 Computational Complexity and Trade-Offs
3.6 Results and Discussion
3.7 Conclusion
CHAPTER 4 COMPENSATION IN PULSED EDDY CURRENT TESTING VIA SURROGATE MODELING
4.1 Introduction
4.2 Challenges
4.3 Analytical Model
4.4 Compensation Method via Surrogate Model
4.5 Field Tests and Results
4.6 Conclusions

CHAPTER 5 PHYSICS GUIDED EXPLAINABLE NETWORKS FOR AE CLASSIFICATION
5.1 Introduction
5.2 Physics-informed AE Segmentation
5.3 Signal Preprocessing
5.4 Explainable CNN Architecture
5.5 Improved Interpretability and Performance

CHAPTER 6 CONCLUSIONS AND FUTURE WORK
6.1 Conclusions and Contribution
6.2 Future Work

BIBLIOGRAPHY

APPENDIX

CHAPTER 1
INTRODUCTION

1.1 Motivation & Objectives

In the past decades, there has been a growing interest in developing more effective Structural Health Monitoring (SHM) and Nondestructive Evaluation (NDE) techniques for critical infrastructure systems. Aging bridges, composite panels, high-voltage feeder lines, and railway tracks require thorough inspection to ensure early detection of damage and long-term reliability. However, as these infrastructures become increasingly geometrically complex and operate under demanding conditions, such as in-service inspections at high speeds, varying load levels, and harsh environmental noise, traditional inspection approaches frequently fail to identify initial defects or predict crack propagation rates with sufficient accuracy. This not only risks safety and reliability but also increases maintenance costs, highlighting the need for data-driven methodologies capable of adapting to real-world operational scenarios. Among various diagnostic methods, Acoustic Emission (AE) techniques exhibit great potential for characterizing micro-crack formation, delamination, and other damage phenomena in both metallic and composite materials. In practice, however, analyzing AE signals involves confronting multiple complexities, including multi-modal wave propagation (with mode conversions and dispersive behaviors), non-stationary signal patterns, and ambient operational noise. These factors impede accurate identification of AE characteristics and risk obscuring nascent failure indicators. Recent innovations in deep learning frameworks, such as convolutional and recurrent neural architectures, offer transformative potential for SHM and NDE by automatically extracting salient features from high-dimensional, noisy signals.
This alleviates many longstanding challenges associated with the manual engineering of signal descriptors, especially when addressing rare or subtle fault conditions. Nonetheless, the limited availability of labeled experimental data continues to pose a formidable hurdle. In many critical infrastructures, it is often costly to gather a sufficient quantity of in situ measurements, and for rare or infrequent failure modes, such data may be virtually unattainable. Consequently, there has been a marked rise in approaches grounded in transfer learning, domain adaptation, data augmentation (e.g., Generative Adversarial Networks), and multi-fidelity modeling, where simplified finite element analyses are fused with high-fidelity experimental data to form robust, cost-effective detection models. Simultaneously, the "black box" perception of deep learning algorithms has hindered their deployment in safety-critical applications, such as aerospace or nuclear power generation. In these applications, it is critical not only to obtain accurate predictions but also to understand and explain the underlying mechanism. Explainable AI tools, such as Class Activation Mapping and its variants, enable visualization of the internal decision-making process, aligning model interpretations with physical wave phenomena (e.g., Lamb wave modes). Transparency in this process builds confidence and facilitates regulatory approval for AI-powered inspection systems. Ultimately, our research aims to elevate SHM from a reactive practice to a proactive, predictive methodology. By uniting AI-based algorithms with physics insights and domain-adaptive knowledge transfer, the present work seeks to advance damage detection, curtail false alarms, and strengthen interpretability across diverse applications.

1.2 Scope of Research

In this dissertation, we investigate data-driven approaches for enhancing the accuracy and interpretability of AE-based monitoring and other NDE methods by addressing two core challenges: limited experimental data and imbalanced label distributions. Such challenges commonly arise in real-world structural health monitoring scenarios, where rare crack initiation events, combined with demanding test conditions, often constrain the quantity and quality of available labeled datasets. The objective is to develop robust, physics-informed methods that enable more reliable damage detection, localization, and characterization across various materials, including metals, composites, and electromagnetically active structures. Chapter 1 reviews the motivation and objectives for next-generation SHM and previews the organizational structure of this dissertation. In particular, it introduces the fundamental issues related to data scarcity, class imbalance, and domain adaptation needs. Chapter 2 presents the Domain-Adaptive Framework, illustrating how domain adaptation and transfer learning can bridge discrepancies between simulated datasets (source domain) and limited experimental data (target domain) in AE applications. Through strategies such as fine-tuning pre-trained networks, minimizing domain mismatches, and implementing GAN-based data augmentation, the chapter demonstrates how reliance on extensive experimental data can be mitigated. Chapter 3 describes the Multi-Fidelity Surrogate Modeling approach for Motion-Induced Eddy Current Testing, integrating low-fidelity simulations with high-fidelity experimental measurements.
Radial Basis Function scaling, Gaussian Process Regression, and feature discretization are employed to enhance both forward signal predictions and inverse defect estimations, while ensuring computational efficiency suitable for real-time applications. Chapter 4 outlines a novel compensation method for Pulsed Eddy Current (PEC) Testing in environments where strong, spatially varying magnetic fields from high-current power lines adversely affect thickness readings. Leveraging surrogate modeling and finite element simulations, this correction algorithm significantly reduces false positives by compensating for electromagnetic interference, as evidenced by field validations in working pipeline segments. Chapter 5 introduces Physics-Guided Explainable Networks for AE signal classification, focusing on explainable AI (XAI) techniques such as Class Activation Mapping (CAM), Gradient-weighted CAM (Grad-CAM), and Dispersion-Compensated CAM (DCAM). By correlating these visualization methods with Lamb wave segmentation, the chapter demonstrates how time-frequency regions critical to model predictions can be clearly identified, thereby enhancing interpretability and fostering confidence among domain experts. Chapter 6 summarizes the conclusions and contributions while outlining future research directions. This chapter explores potential developments in domain-adaptive frameworks, multi-fidelity modeling and signal processing, and compensation methods. An appendix follows the bibliography. In summary, these chapters combine numerical simulations, laboratory experiments, advanced signal processing, and deep learning to expand the frontiers of AE-based and electromagnetic-based structural health monitoring. Through robust domain adaptation, multi-fidelity integration, compensation strategies, and interpretable neural networks, this dissertation aspires to provide actionable insights, reduce inspection costs, and increase diagnostic certainty in support of mission-critical industries such as aerospace, nuclear power, and civil infrastructure.

CHAPTER 2
DOMAIN ADAPTIVE FRAMEWORK FOR SIMULATION TO EXPERIMENT KNOWLEDGE TRANSFER

Acoustic Emission (AE) provides real-time capabilities for detecting and localizing structural damage Wevers and Lambrighs (2009). This monitoring technique employs surface-mounted sensors to detect stress waves propagating through the structure. Figure 1 illustrates how these captured signals are processed by data acquisition systems to identify both the source and characteristics of AE events. AE monitoring is distinguished by its exclusive sensitivity to active damage processes, enabling real-time detection of evolving defects. The continuous, real-time characteristics enable dynamic structural integrity assessment, providing crucial early warning of developing damage across multiple engineering applications Bouzid et al. (2015); Holford et al. (2009). This facilitates the detection of micro-cracking, fiber breakage, delamination, and other early damage mechanisms long before they become visually apparent Bhuiyan and Giurgiutiu (2017). This diagnostic capability aids in preventing catastrophic failures, enhancing maintenance strategies, and extending infrastructure service life Sen and Nagarajaiah (2018); Chadha et al. (2023); Malekzadeh et al. (2015). Technological advances in sensing hardware and signal acquisition have substantially improved AE detection sensitivity and reliability.
Recent research has focused on robust sensor placement, novel waveguide designs, and miniaturized wireless devices capable of operating under harsh environments Dong et al. (2018).

Figure 1 AE technique

For experimental validation, this thesis utilizes fiber-optic sensing technology, with the standardized pencil-lead break (PLB) test serving as the controlled acoustic emission source Sause (2011); De Almeida et al. (2015); Hashim et al. (2021), as shown in Figure 2. The fiber-optic sensor records the elastic waves generated by the event; such sensors have been extensively applied for measuring a variety of parameters including temperature, strain, pressure, vibration, and acoustics due to their high detection sensitivity and small size Zhu et al. (2020). AE source localization has advanced from traditional time-difference-of-arrival (TDOA) methods to identification frameworks, enabling localization with reduced sensor array complexity. Existing data analysis methods employ techniques such as time-domain, frequency-domain, and time-frequency analysis to extract key features from signals Takagi et al. (1998); Eaton et al. (2012). Advanced instrumentation and digital acquisition systems provide enhanced signal quality and timing precision, enabling detailed damage characterization across metals and composites Ebrahimkhanlou and Salamone (2017b,a, 2018c); Mahajan and Banerjee (2023).

Figure 2 PLB test as AE source and fiber coil sensor as sensor

Deep learning has revolutionized AE signal analysis, offering powerful tools for damage detection Ebrahimkhanlou and Salamone (2018c); Mahajan and Banerjee (2023). Deep learning architectures, particularly convolutional and recurrent neural networks, enable single-sensor source localization by automatic feature extraction from complex AE waveforms Ebrahimkhanlou and Salamone (2018a,b); Ebrahimkhanlou et al. (2019). Researchers have also explored data augmentation, physics-informed modeling, and noise reduction strategies to tackle the inherent challenges of AE signals, which are highly susceptible to noise and environmental factors Assarar et al. (2015); Ai et al. (2021). Such methods refine AE interpretation, characterizing impact-induced acoustic events Daugela et al. (2021a) and facilitating accurate detection of wave reflections Haile et al. (2020a). Efforts to integrate AE with complementary SHM modalities, leverage probabilistic frameworks for source identification, and adopt adaptive models have led to greater reliability and applicability across broad operational conditions Jung et al. (2019); Jones et al. (2022); Garrett et al. (2022); Verstrynge et al. (2021). In summary, AE-based SHM has evolved into a highly refined practice, combining advanced sensing, improved data acquisition, and cutting-edge signal interpretation methods. This has increased the technique's sensitivity and reliability, enabling it to detect subtle signs of material deformation, track damage progression in real time Wevers and Lambrighs (2009), and ultimately enhance the safety and longevity of critical engineering structures.

2.1 Domain-Adaptive Framework

A significant challenge in AE modeling is bridging the gap between simulation-trained models and real-world applications, where signal distributions, environmental noise, and wave propagation characteristics differ substantially Li et al. (2018).
To address this challenge, domain adaptation, transfer learning, and data augmentation techniques have emerged as critical strategies for improving robustness and generalization Sun (2020); Bengio (2012); Ismail-Fawaz et al. (2022). Domain adaptation focuses on aligning feature distributions between source (simulated) and target (experimental) domains, enabling models to perform well despite shifts in data characteristics Fawaz et al. (2018); Zhang et al. (2017); Weiss et al. (2016). Transfer learning leverages knowledge acquired from a related, abundant domain, often simulation-based, to expedite learning in a data-scarce target domain, thereby reducing the number of required experimental samples Ismail-Fawaz et al. (2022); Fawaz et al. (2018). Recent innovations have also highlighted the potential of generative adversarial networks (GANs) to overcome these domain disparities: GANs have emerged as a powerful tool for data augmentation, offering significant potential for enhancing time series signal analysis Ebrahimkhanlou et al. (2019). By producing synthetic AE signals that closely mimic real data characteristics Sun et al. (2022), GAN-based augmentation mitigates the imbalance and scarcity of experimental data, thereby improving model training. Furthermore, GANs can significantly expand training datasets Sun et al. (2022), an especially valuable feature when dealing with minority damage classes and irregular signal patterns Shao et al. (2019); Sorin et al. (2020); Wang et al. (2023); Jain et al. (2018). Such synthetic data generation has proven effective in related fields, including machine fault diagnosis and medical imaging, where GAN-augmented training sets improve performance and address data limitations Shao et al. (2019); Sorin et al. (2020). Nonetheless, challenges such as training instability and mode collapse Jain et al. (2018) require careful architectural refinements and training protocols. To address these domain adaptation challenges for AE source localization, researchers have explored approaches that combine feature extraction techniques Kats and Volkov (2020, 2019); Liu et al. (2017a,b), unsupervised adaptation strategies Wu and Huang (2022); Jiang et al. (2020); Xiao and Zhang (2021); Shi et al. (2022); Hsu et al. (2015), and transfer learning paradigms. For example, unsupervised domain adaptation with imbalanced cross-domain data is tackled by Hsu et al. (2015) through Closest Common Space Learning (CCSL), ensuring robust learning despite limited labeled targets. Similarly, reinforcement learning-based domain adaptation can learn robust domain-invariant representations Wu and Huang (2022), while sampling-based implicit alignment approaches help to mitigate the impact of class imbalance Jiang et al. (2020). By integrating these methodologies (domain-adaptive feature extraction, GAN-based augmentation, and transfer learning), recent studies have reported improved AE localization accuracy and reduced training data requirements Li et al. (2018); Sun (2020); Bengio (2012). Techniques such as fine-tuning on simulated data allow pre-trained models to generalize well to variations in acoustic emission signals Ismail-Fawaz et al. (2022), ultimately guiding the practical deployment of single-sensor monitoring systems and advancing structural health monitoring practices Bengio (2012).
To address this limitation, we propose an integrated domain-adaptive framework incorporating three key strategies, as shown in Figure 3: (1) pre-training on simulated AE data to establish robust feature representations, (2) domain adaptation techniques to align simulated and experimental feature distributions, and (3) generative adversarial networks (GANs) for synthetic data augmentation of limited experimental datasets. This proposed domain-adaptive framework significantly enhances the accuracy and reliability of AE source localization and characterization in practical applications, demonstrating effective knowledge transfer between the simulation and experiment domains.

Figure 3 Overview of Advanced Methods for Bridging Simulation-Based AE Data with Experimental Data

2.2 Experimental Setup

The experimental setup used in this research centers on a novel fiber-optic coil-based AE sensing system. As shown in Figure 4, the arrangement features a grid-marked aluminum plate and a fiber-optic AE sensor developed in Liu et al. (2020). This sensor includes a fiber coil with two identical Fiber Bragg Gratings (FBGs) forming a low-finesse Fabry–Perot Interferometer (FPI). With an 8 mm outer diameter and 6 mm inner diameter, the sensor can be flexibly mounted onto the sample surface, ensuring high ultrasound sensitivity while adapting to environmental perturbations. Ultrasound waves induce refractive index variations within the coiled fiber, causing shifts in the FPI's spectral fringes. By employing a modified phase-generated carrier method Karim et al. (2021), the AE signal can be extracted with good linearity and high sensitivity regardless of the laser wavelength alignment with the sensor fringes.

Figure 4 Schematic of the fiber-optic coil-based Acoustic Emission (AE) sensing and monitoring system

Light from a narrow-linewidth, tunable diode laser passes through a circulator and polarization controller before reaching the FBG-FPI sensor. Reflected light returns through the circulator to a photodetector (PD). The output from the PD is processed through dual parallel paths, each mixed, filtered, and amplified to enhance the signal-to-noise ratio (SNR) over the frequency band of interest (50–500 kHz). These steps enable effective isolation of the AE signals from background noise and interference.

2.2.1 PLB Test and Impact Test

To generate AE signals, the widely accepted Hsu–Nielsen pencil lead break (PLB) test Sause (2011) is employed. As depicted in Figure 5(a), the aluminum plate is divided into a well-defined grid layout with multiple test points. Each designated location undergoes ten repeated PLB tests using a 2H mechanical pencil with a 0.5 mm diameter lead, providing a consistent and reproducible dataset. In addition to the PLB tests, impact-like signals were gathered by dropping steel balls (4.7 mm diameter) from a height of 25 mm at the same AE sensor location, as illustrated in Figure 5(b). The equipment and settings for this experiment mirrored those utilized for the PLB tests. The diversity in signal types and locations allows for evaluation of AE source localization performance. The recorded signals were distinguished and examined for AE source identification and localization using these procedures.
The experimental setup facilitated the collection of precise and accurate data, thereby enabling the evaluation of the proposed method's efficacy.

Figure 5 Experimental test setups: (a) PLB Test (b) Impact Test

2.2.2 AE Data Labeling and Classification

Figure 6 (a) Tests conducted on an aluminum plate that is segregated into nine identified zones (b) manual crack propagation test on the sample with three zones labeled

For this study, PLB tests, the standard Hsu–Nielsen method for AE signal generation Sause (2011), were conducted on a 2.54 mm (1/10 inch) thick aluminum plate measuring 0.30 × 0.30 m. The grid-marked approach and repeated testing across numerous points yield a robust set of AE signals crucial for training and validating the deep learning models. The plate was partitioned into nine distinct locations, as delineated in Figure 6(a). Each of the nine representative points, denoted by a red dot, underwent the PLB test ten times, using a 2H mechanical pencil with a 0.5 mm diameter lead. To associate AE waveforms with crack propagation levels, a 1-inch pre-crack notch simulating crack initiation was introduced, as shown in Figure 6(b). A 2H lead pencil, angled at 45°, was fractured on the plate's surface in ten repetitions at the crack tip to mimic AE generated during crack propagation, aiding in the classification of different crack stages. The propagation step size, denoted as $\Delta L$, is set at 0.1 inches. The tests strive for uniformity by breaking nearly identical lengths of pencil lead at the same angle to the surface. We aim to demonstrate the ability to discern different cracking levels through parametric analysis of AE signals.

2.3 Simulation Setup

2.3.1 Analytical Excitation Function

We characterize AE bursts consistently with the approach in Cuadra et al. (2015). We pre-train deep learning models using AE data obtained from the Finite Element Method (FEM). These simulations include both impact-type and PLB tests to improve source localization in the specimen Hamstad (2007). In the setup, the PLB source was positioned in the out-of-plane direction at a predefined location on the plate, with the sensor located an inch from the right and upper edges of the plate. This approach enables the production of pre-training data for DL models, fostering more accurate and efficient localization of AE sources in practice. In essence, simulated AE signals serve as effective training surrogates, allowing us to bolster the accuracy and efficiency of real-world AE signal-processing algorithms.

Figure 7 (a) Excitation function to simulate PLB test (b) Excitation function to simulate impact test

Two excitation functions were employed in the simulations: one for the PLB test shown in Figure 7(a) and another for the impact test shown in Figure 7(b). For the PLB test, the excitation signal $F_1(t)$ was chosen due to its gradual increase in amplitude, representing mechanical loading. The function is defined as:

$$F_1(t) = \begin{cases} -2t/t_1, & 0 < t < t_1 \\ -\cos\left(\pi(t - t_1)\right) - 1, & t_1 < t < t_2 \\ 0, & t_2 < t \end{cases} \tag{2.1}$$

In this expression, $t_1$ and $t_2$ define specific time intervals during which the signal ramps up and then decreases to zero, effectively simulating the mechanical loading process. For the impact test, the excitation signal $F_2(t)$ is expressed as:

$$F_2(t) = C e^{-\gamma t / t_0} \sin\left(\frac{4\pi}{1 + t_0/t}\right) \tag{2.2}$$

Here, $C$ denotes the initial amplitude, $\gamma$ is a damping factor, and $t_0$ represents a characteristic time, with the values shown in Table 2.1. This damped sinusoidal wave is commonly observed in impact tests, capturing the material response to mechanical loading. $F_1(t)$ and $F_2(t)$, coupled with FEM simulations, yield waveforms that closely approximate real AE signals in both PLB and impact scenarios; a numerical sketch of both excitation functions is given below.
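The following is a minimal numpy sketch of Eqs. (2.1)–(2.2) with the Table 2.1 values. Two details are assumptions on our part: the amplitude $C$ is set to 1, and the cosine argument in Eq. (2.1) is normalized by $(t_2 - t_1)$ so the pulse is continuous at $t_1$ and decays to zero at $t_2$, since the equation leaves its time scaling implicit.

```python
import numpy as np

# Parameter values from Table 2.1 (converted to seconds); C = 1 is an
# assumed amplitude for illustration.
T0, T1, T2, GAMMA, C = 5e-6, 6.5e-6, 7.5e-6, 1.85, 1.0

def f1(t):
    """PLB excitation, Eq. (2.1): linear ramp, cosine roll-off, then zero.
    The (t - T1)/(T2 - T1) normalization is an assumption that makes the
    pulse continuous at T1 and exactly zero at T2."""
    return np.piecewise(
        t,
        [t < T1, (t >= T1) & (t < T2), t >= T2],
        [lambda t: -2.0 * t / T1,
         lambda t: -np.cos(np.pi * (t - T1) / (T2 - T1)) - 1.0,
         0.0])

def f2(t):
    """Impact excitation, Eq. (2.2): exponentially damped sinusoid."""
    return C * np.exp(-GAMMA * t / T0) * np.sin(4.0 * np.pi / (1.0 + T0 / t))

t = np.linspace(1e-9, 20e-6, 2000)   # start slightly above 0: F2 divides by t
plb_pulse, impact_pulse = f1(t), f2(t)
```

Pulses of this form would then be applied as out-of-plane point loads in the FEM model to generate the simulated AE waveforms.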
Table 2.1 Analytical function parameter specification

Parameter        | Value (unit)
Young's modulus  | 206 (GPa)
Poisson's ratio  | 0.3
Density          | 2710 (kg/m³)
t0               | 5 (μs)
t1               | 6.5 (μs)
t2               | 7.5 (μs)
Decay rate γ     | 1.85

2.3.2 Wave Propagation

By applying external loads, tensile stresses develop, especially at the extremities. The resulting stress intensity factor, $K$, gauges the intensity of the stress field near the crack tip. In this research, we model crack growth using a step-by-step approach where the crack length increases by predetermined amounts. The rate and path of propagation are governed by factors such as the stress intensity range, maximum stress intensity, fracture toughness $K_{IC}$, and local stresses at the crack tip. The stress ratio $R = \sigma_{min}/\sigma_{max}$ and the differential stress intensity factor $\Delta K_I = (1 - R)K_I$ characterize the cyclic loading conditions associated with time-dependent energy dispersal from the crack. Table 2.2 details the relevant material properties. When the predicted growth rate aligns with the critical fracture toughness $K_{IC}$, a catastrophic fracture event is anticipated. Because direct measurement of $K$ is challenging due to the inherent stress singularity, alternative energy-based methods such as the J-integral and energy release rate become indispensable. These methods facilitate the stress field assessment near the crack tip, improving accuracy in evaluating fracture mechanics.

Table 2.2 Material Property

Name                       | Expression | Quantity
Young's modulus            | E          | 206 GPa
Poisson's ratio            | ν          | 0.3
Coefficient in Paris' Law  | A          | 1.4×10⁻¹¹
Exponent in Paris' Law     | m          | 3.1

Figure 8 Snapshots of Von Mises stress distribution in MPa obtained from fatigue loading FEM simulation as the crack propagates: (a) crack length of 0.1 inch (b) crack length of 1.5 inch (c) crack length of 3 inch

Snapshots of the stress distribution as the crack propagates to different lengths (0.1 inch, 1.5 inches, and 3 inches) are illustrated in Figure 8. These highlight how stress concentrations develop with increasing crack size. To further investigate the transient dynamics, time-domain simulations capture wave propagation, illustrating how AE waveforms are generated and how their characteristics change as the fracture progresses. Such simulations offer profound insights into crack behavior, underscoring the utility of FEM analyses in nondestructive evaluation and structural health monitoring.

2.4 Imbalanced and Limited Training Data

We rely on simulation data to develop training sets and enhance deep learning models, yet real-world fatigue testing continues to play a critical role. In this research, crack lengths are categorized into three stages, as shown in Figure 9: initiation (0–1 inch), stable propagation (1–2 inches), and unstable propagation (2–3 inches), to classify AE signals more effectively. However, data imbalance, particularly in the 2–3 inch range where rapid failure occurs, presents a common machine learning challenge. While we do not show it here, earlier analyses highlighted the need for balanced datasets or data augmentation techniques to ensure that DL models effectively learn from all stages of crack evolution; a short sketch of this three-stage labeling and last-class reduction follows.
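To make the class construction concrete, the sketch below bins crack lengths into the three stages and thins the last class by a reduction factor, mirroring the imbalanced setup of Figure 9(b). The helper name and the synthetic crack-length array are hypothetical; only the stage boundaries and reduction factors come from the text.

```python
import numpy as np

# Hypothetical crack lengths (inches) attached to AE events; in practice
# these come from the fatigue-loading FEM simulations and experiments.
rng = np.random.default_rng(0)
crack_lengths = rng.uniform(0.0, 3.0, size=600)

# Three-stage labeling used in Section 2.4:
#   0: initiation (0-1 in), 1: stable (1-2 in), 2: unstable (2-3 in)
labels = np.digitize(crack_lengths, bins=[1.0, 2.0])

def reduce_last_class(labels, reduction_factor, rng):
    """Drop a fraction of the last-class samples to emulate the imbalance
    of Figure 9(b) (reduction factors of 0.5 and 0.9 on the last class)."""
    last = labels.max()
    idx_last = np.flatnonzero(labels == last)
    drop = rng.choice(idx_last, size=int(reduction_factor * idx_last.size),
                      replace=False)
    return np.setdiff1d(np.arange(labels.size), drop)   # indices to keep

keep_idx = reduce_last_class(labels, reduction_factor=0.9, rng=rng)
print(np.bincount(labels[keep_idx]))   # roughly [200, 200, 20]
```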
By coupling FEM simulations with controlled experiments, we have developed a coherent framework that not only enhances AE source localization through DL pre-training but also provides deeper insights into fracture mechanics. This approach underscores the potential of simulation-based pre-training. Ultimately, the integration of FEM simulations and experimental validation paves the way for improved structural health monitoring, fostering more reliable, data-driven fracture assessments in complex materials and components.

Figure 9 (a) Crack lengths divided into three categories (b) Imbalanced dataset with reduction factors of 0.5 and 0.9 on the last class

2.5 Transfer Learning and Pre-Training on Simulation Data

In the context of AE signal classification via deep learning, CNN filters act as universal pattern detectors in AE waveforms, similar to how shapelets identify distinctive acoustic signatures. AE waveforms show patterns such as burst emissions, reflected waves, and attenuative phenomena. Convolutional filters, initially optimized for micro-crack detection, can serve as pretrained feature extractors for related AE signal classification tasks. These transferable features include characteristic wave shapes, frequency components, and amplitude variations that are common across different types of material failure mechanisms. This transfer learning approach enables parameter optimization in similar detection tasks by applying the insights obtained from pre-trained models to new data. A variety of deep learning (DL) models, including Convolutional Neural Network (CNN), Fully Convolutional Network (FCN), Encoder, Residual Network (ResNet), Inception, and Multi-Layer Perceptron (MLP), were assessed for their ability to analyze simulated datasets and to extract underlying features using a layer-wise fine-tuning strategy. The employed methodology entailed signal acquisition from the simulated datasets, followed by data preprocessing, feature extraction via fine-tuned DL models, and finally classification based on Acoustic Emission (AE) source location. To analyze the impact and PLB test simulations, six DL models with distinct architectures and capabilities were evaluated. This strategy leads to a broader comprehension of the data, permitting the recognition of patterns and features that would be overlooked when using a single model. The CNN architecture consists of two sequential convolutional blocks, each containing a 1D convolution layer followed by instance normalization, dropout, and max pooling. The extracted hierarchical features are flattened before passing through a SoftMax classifier, with the network trained using categorical cross-entropy loss and the Adam optimizer Simonyan and Zisserman (2014). FCN shares similarities with CNN but substitutes max pooling with global average pooling to minimize spatial information loss, compacting spatial information into a 1D vector before passing to the SoftMax classifier Zhang et al. (2017). ResNet employs residual blocks to address the vanishing gradient issue by adding the input directly to stacked convolutional layers, utilizing batch normalization and L2 regularization, with each residual block containing two Conv1D layers followed by batch normalization and activation, before average pooling and transmission to the SoftMax classifier He et al. (2016).
Encoder mirrors CNN's convolutional blocks but implements ReLU activation and instance normalization, followed by an attention mechanism that weights feature maps to focus on relevant features before flattening and classification Vincent et al. (2008). MLP replaces convolutional layers with two dense layers incorporating dropout for regularization, flattening the input time series before processing and utilizing SoftMax activation for classification Delashmit and Manry (2005). Inception employs an inception module featuring parallel branches of 1x1, 3x3, and 5x5 convolutions and max pooling, concatenating their outputs to form the inception module, implementing batch normalization and dropout post-module before flattening and transmitting to the SoftMax classifier Zhang et al. (2022).

Figure 10 Schematic and structure of knowledge transfer via deep transfer learning

We demonstrate the method's effectiveness by enhancing DNN performance in acoustic emission (AE) source localization in Huang et al. (2023). The approach begins with pretraining a DNN on extensive simulated AE data to establish baseline signal feature recognition. Following this, we fine-tune the DNN on a smaller experimental dataset, a process that facilitates the network's learning of features specific to the experimental data. Our process involves first transferring layers from a pretrained model and subsequently freezing their parameters. As new AE monitoring data is processed, it passes through these frozen layers before progressing through the trainable layers, allowing us to localize the AE source. Owing to the intrinsic connection between simulation and experimental data, the feature extractor can be applied to the latter, incorporating it as a non-adjustable layer in our model. We designate the high-level features extracted from these layers as "bottleneck" features due to their condensed representation and their position at the constriction point preceding the classifier (as illustrated in Figure 10). The applied deep learning architecture comprises one of six classifiers, each consisting of multiple fully connected layers following global pooling. This design enables nonlinear mapping of bottleneck features to AE source localization. Additionally, a fusion layer is utilized to amalgamate extracted features, and an extra layer is employed to link bottleneck features to location predictions. During fine-tuning, the pretrained model's weights serve as the initial values, and the model undergoes further training with the available target domain data. As a consequence, the fine-tuned model can adapt to the target domain's unique characteristics, offering superior performance to a model trained from scratch. A minimal sketch of this freeze-and-fine-tune procedure follows.
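To make the recipe concrete, here is a minimal PyTorch sketch. The backbone follows the CNN description in Section 2.5 (two 1D conv blocks with instance normalization, dropout, and max pooling); the layer widths, checkpoint file name, and nine-class head are illustrative assumptions rather than the exact configuration used in the dissertation.

```python
import torch
import torch.nn as nn

# Backbone mirroring the CNN described above; widths are assumed values.
backbone = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=7), nn.InstanceNorm1d(16), nn.ReLU(),
    nn.Dropout(0.2), nn.MaxPool1d(2),
    nn.Conv1d(16, 32, kernel_size=5), nn.InstanceNorm1d(32), nn.ReLU(),
    nn.Dropout(0.2), nn.MaxPool1d(2),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),   # -> 32-dim "bottleneck" features
)
# backbone.load_state_dict(torch.load("sim_pretrained.pt"))  # FEM-pretrained

for p in backbone.parameters():              # freeze the transferred layers
    p.requires_grad = False

head = nn.Linear(32, 9)                      # trainable head: 9 source zones
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def fine_tune_step(x_exp, y_exp):
    """One fine-tuning step on (scarce) experimental AE waveforms."""
    with torch.no_grad():                    # frozen bottleneck extraction
        feats = backbone(x_exp)
    loss = loss_fn(head(feats), y_exp)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```

Because only the head is trained, each step touches a few hundred parameters rather than the full network, which is what makes fine-tuning viable on small experimental datasets.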
2.6 Domain Adaptation

The goal of domain adaptation is to learn features that are invariant to both the source and unannotated target domains, thus bridging the distribution gap between them. This strategy enables precise localization without dependence on labeled training data, resulting in beneficial data augmentation. The proposed approach efficiently achieves unsupervised domain adaptation of time series by utilizing a substantial amount of unlabeled data for training. In this section, we present a domain adaptation approach, introduced in Huang et al. (2024a), consisting of three main components: a feature extractor $f$, a source classifier $g$, and a target domain classifier $h$. The feature extractor functions as a four-layer neural network that transforms AE waveforms into 64-dimensional features, utilizing ReLU activation, batch normalization, and dropout for optimal regularization. The source classifier processes these extracted features through a network with one hidden layer to predict class probabilities across 9 categories using softmax activation, while the domain classifier employs a similar structure with sigmoid activation to determine whether inputs belong to the target domain. These components are integrated and trained using a combined loss function that balances classification accuracy and domain adaptation objectives, where the feature extractor is shared between the source classifier and domain classifier to enable simultaneous optimization. The training workflow progresses from dataset preparation through fold division, followed by concurrent training of the source classifier and feature extractor, and culminates in the training of a target classifier on combined source and target data to achieve domain invariance. During this workflow, we begin by preparing the dataset and dividing it into training and testing folds. Next, we design and train a source classifier $g$ alongside a feature extractor $f$ using a combined loss function that balances classification and domain adaptation goals. Finally, we initialize and train the target classifier $h$ on combined source and target data, tracking its performance over multiple epochs to assess how well domain invariance is achieved.

Figure 11 High-Level Overview of Domain Adaptation Workflow: From Data Preparation to Evaluation and Implementation

Figure 11 outlines the workflow, which unfolds in three distinct phases: Data Preparation and Cross-Validation, Model Initialization and Training, and Evaluation and Implementation. Initially, data is prepared and processed through 10-fold cross-validation. The model architecture is then designed by selecting $i$ relevant features. Training of the source classifier $g$ begins, optimizing a combined loss function of Maximum Mean Discrepancy (MMD) and classification loss to enable domain adaptation. This training proceeds until the loss is sufficiently minimized. Subsequently, the target classifier $h$ is initialized, trained on combined source and target data, and its performance is methodically recorded via test accuracy and confusion matrices after each epoch, facilitating a comprehensive assessment of the model's domain adaptation efficacy. We employ 10-fold cross-validation, which includes two key preprocessing steps: feature normalization to standardize measurement scales, and feature selection to identify key predictors. Each fold maintains balanced class distributions and representative sampling, ensuring reliable model evaluation and creating a solid foundation for subsequent training phases. The initial training phase focuses on the source domain dataset $\{X_{source}, Y_{source}\}$, where we simultaneously optimize the feature extractor $f$ and classifier $g$. The training process minimizes the classification loss $L_{cls}$ through gradient descent, updating the parameters $\theta_f$ and $\theta_g$ of both networks according to:

$$\theta_f \leftarrow \theta_f - \eta \cdot \nabla_{\theta_f} L_{cls}, \qquad \theta_g \leftarrow \theta_g - \eta \cdot \nabla_{\theta_g} L_{cls} \tag{2.3}$$

To align the feature distributions of the source and target domains, we incorporate the Maximum Mean Discrepancy (MMD) metric. Conceptually, MMD measures the difference between the mean embeddings of source and target samples in a Reproducing Kernel Hilbert Space (RKHS), formulated as:

$$\mathrm{MMD} = \left\| \frac{1}{n} \sum_{i=1}^{n} \phi(x_i) - \frac{1}{m} \sum_{j=1}^{m} \phi(y_j) \right\|_{\mathcal{H}}^{2} \tag{2.4}$$

It is designed to reduce the difference between the distributions, facilitating domain adaptation. Here, $x_i$ and $y_j$ are samples from the source and target distributions, respectively; $\phi$ represents the feature map into the RKHS, transforming the samples into a high-dimensional space where the mean difference is computed; $n$ and $m$ denote the numbers of samples in the source and target datasets, respectively; and $\|\cdot\|_{\mathcal{H}}^{2}$ denotes the squared RKHS norm, measuring the distance between the mean embeddings of the two distributions.
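In practice, the RKHS norm in Eq. (2.4) is expanded with the kernel trick so that only kernel evaluations are needed. The sketch below uses a Gaussian (RBF) kernel, a common choice; the dissertation does not specify its kernel or bandwidth, so both are assumptions here.

```python
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    """Gaussian kernel k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(xs, xt, sigma=1.0):
    """Biased estimate of Eq. (2.4): squared MMD between source features xs
    (n x d) and target features xt (m x d), expanded with the kernel trick:
    E[k(xs, xs')] - 2 E[k(xs, xt)] + E[k(xt, xt')]."""
    return (rbf_kernel(xs, xs, sigma).mean()
            - 2.0 * rbf_kernel(xs, xt, sigma).mean()
            + rbf_kernel(xt, xt, sigma).mean())

# Toy check: identical distributions give MMD^2 near 0, shifted ones do not.
rng = np.random.default_rng(0)
xs = rng.normal(size=(200, 64))                    # 64-dim extractor features
print(mmd2(xs, rng.normal(size=(200, 64))))        # ~0
print(mmd2(xs, rng.normal(1.0, 1.0, (200, 64))))   # clearly > 0
```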
Moreover, the model undergoes adjustments to minimize both the classification loss on the source domain and the domain discrepancy, striving for a balance between classification accuracy and domain invariance through a joint parameter update. The investigation focuses on this trade-off by fixing $\lambda = 100$ and considering the number of epochs $M$ as a variable. The optimization of the feature extractor's parameters considers the effects of both $L_{cls}$ and MMD. The combined loss function guiding this process is:

$$L_{total} = L_{cls} + \lambda \cdot \mathrm{MMD} \tag{2.5}$$

$$\theta_f, \theta_g \leftarrow \text{update based on } \nabla L_{total} \tag{2.6}$$

When training our domain adaptation model, we balance minimizing the classification loss ($L_{cls}$) against the Maximum Mean Discrepancy loss (MMD). Initially, the model prioritizes $L_{cls}$ reduction, which may not immediately impact MMD. As training progresses and domain features begin to align, reducing MMD, there is a risk of the model becoming overly generalized, potentially diminishing its classification sharpness. The model strives to learn domain-invariant features through MMD minimization while preserving its performance on source data. Striking the right balance is critical; focusing too heavily on one aspect can lead to overfitting or underfitting. Figure 12 illustrates the training dynamics of our model, showcasing three metrics: average total loss, average MMD, and average classification loss over 100 epochs, using 10-fold cross-validation. These metrics reveal the delicate interplay between the competing goals of domain adaptation. The shaded regions in the graphs indicate variance across cross-validation folds, highlighting the inherent uncertainties in model training with diverse data subsets. Early in training, the model primarily reduces classification loss on the source domain, which may not affect MMD immediately. As training progresses, MMD minimization improves alignment of the source and target feature distributions, but an excessive focus on MMD can reduce classification specificity.

Figure 12 Training Dynamics of a Domain Adaptation Model: (a) Average Total Loss, (b) Average Maximum Mean Discrepancy (MMD), and (c) Average Classification Loss over 100 Epochs with 10-fold cross-validation

Once the feature extractor $f$ is sufficiently trained to learn domain-invariant features, we initialize a separate target classifier $h$, parameterized by $\theta_h$. The feature extractor $f$ is then used to transform the target data $X_{target}$. The classifier $h$ is trained by optimizing $\theta_h$ to minimize the combined loss function $L_{combined}$, ensuring predictions on transformed target data align with the true labels:

$$\theta_h^{(t+1)} = \theta_h^{(t)} - \eta \cdot \nabla_{\theta_h} L_{combined}\big(\theta_h^{(t)}\big) \tag{2.7}$$

where $\eta$ is the learning rate. The loss compares predictions from $h(f(\cdot))$ against the true labels:

$$L_{combined}(\theta_h) = \mathrm{loss}\big(h(f(X_{source}, X_{target})), (Y_{source}, Y_{target})\big) \tag{2.8}$$

An L2-regularization term $\frac{\varepsilon}{2}\|\theta_h\|^2$ is added to combat overfitting. This ensures target classification improves without overfitting to any peculiarities in the target set. Overall, this domain adaptation framework integrates data processing, source-target alignment, and target classifier refinement to produce a versatile, robust model capable of handling distributional shifts across domains. A condensed sketch of the two-stage training procedure follows.
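The following is a condensed PyTorch sketch of the two-stage procedure in Eqs. (2.5)–(2.8). The shapes mirror the text (64-dimensional features, 9 classes, $\lambda = 100$); the input width, the optimizer settings, and the linear-kernel stand-in for the MMD term are assumptions of this sketch, not the exact implementation.

```python
import torch
import torch.nn as nn

# Condensed networks; the text describes a four-layer extractor, so these
# widths (and the 1024-sample input) are illustrative assumptions.
f = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.BatchNorm1d(256),
                  nn.Dropout(0.3), nn.Linear(256, 64))   # feature extractor f
g = nn.Linear(64, 9)                                     # source classifier g
h = nn.Linear(64, 9)                                     # target classifier h
ce = nn.CrossEntropyLoss()
lam = 100.0                                              # lambda fixed at 100

def mmd2_linear(a, b):
    """Linear-kernel stand-in for Eq. (2.4): squared distance between
    feature means (the kernel estimator sketched earlier also works)."""
    return ((a.mean(0) - b.mean(0)) ** 2).sum()

# Stage 1: jointly update f and g with L_total = L_cls + lambda*MMD
# (Eqs. 2.5-2.6), using labeled source and unlabeled target batches.
opt_fg = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)
def stage1_step(x_src, y_src, x_tgt):
    zs, zt = f(x_src), f(x_tgt)
    loss = ce(g(zs), y_src) + lam * mmd2_linear(zs, zt)
    opt_fg.zero_grad(); loss.backward(); opt_fg.step()
    return loss.item()

# Stage 2: train h on features from the (now domain-invariant) extractor;
# weight_decay supplies the (epsilon/2)||theta_h||^2 term, and batches
# combine source and target data as in Eq. (2.8).
opt_h = torch.optim.Adam(h.parameters(), lr=1e-3, weight_decay=1e-4)
def stage2_step(x, y):
    with torch.no_grad():
        z = f(x)                       # f is kept fixed in this stage
    loss = ce(h(z), y)
    opt_h.zero_grad(); loss.backward(); opt_h.step()
    return loss.item()
```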
Figure 13 Architecture of the Generator and Discriminator networks in the GAN for AE signal augmentation

2.7 GAN-Based Data Augmentation

The GAN architecture for data augmentation, illustrated in Figure 13, involves a generator and a discriminator, both of which are sequential models. The problem can be stated as training a GAN to generate realistic data samples from an input latent space. Mathematically, the GAN consists of a generator $G$ and a discriminator $D$. The generator is a function that maps points in the latent space $z \sim p_z(z)$ to candidate data points $G(z)$, while the discriminator is a function that estimates the probability that a given data point is real (as opposed to generated). The generator and discriminator compete through alternating training steps, each optimizing their respective loss functions. The discriminator learns to maximize its ability to distinguish between real and generated samples, while the generator aims to produce samples that can fool the discriminator. This adversarial training process continues until the generator creates samples so convincing that the discriminator can no longer effectively differentiate between real and synthetic data. The discriminator is updated by maximizing

$$L_D = \mathbb{E}[\log D(x_{real})] + \mathbb{E}[\log(1 - D(G(z)))] \tag{2.9}$$

and the generator is then updated by minimizing

$$L_G = \mathbb{E}[\log(1 - D(G(z)))] \tag{2.10}$$

Moreover, to prevent mode collapse, we implement a collapse-detection threshold $\tau$: if $|L_D - L_G| > \tau$ for a certain number of consecutive iterations, we reinitialize the models. Hyperparameter tuning was conducted using grid search to optimize the learning rates for the discriminator and generator, respectively. The overall framework is a hybrid method: our study combines advanced data augmentation techniques with an adapted Inception architecture to enhance the accuracy and robustness of AE source localization in complex structures. The methodology integrates custom-designed GANs for data augmentation with an Inception network specifically adapted for regression tasks. Figure 14 illustrates the overall process flow. It commences with the collection of AE signals, each labeled with coordinates in the form of $(d_i, \theta_i)$ for 35 distinct positions. These signals undergo a training/test split. The training data is then augmented using four different GAN architectures (GAN, DCGAN, WGAN, and TSAGAN) to address dataset imbalance and scarcity issues. This multi-GAN augmentation approach is crucial as it generates synthetic AE signals that closely emulate the characteristics of real data, effectively expanding the dataset and improving the model's ability to generalize across various AE source locations. Each GAN variant offers unique strengths in data synthesis, allowing for a comprehensive augmentation strategy; a compact sketch of the adversarial updates appears below.
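To tie Eqs. (2.9)–(2.10) and the collapse threshold $\tau$ together, here is a compact PyTorch sketch of one alternating update. The fully connected shapes follow Table 2.3; $\tau$, the patience count, and the 1024-sample waveform length are assumed values, and the generator step uses the standard non-saturating surrogate commonly substituted for Eq. (2.10). Minimizing the binary cross-entropy terms below is equivalent to maximizing Eq. (2.9) for the discriminator.

```python
import torch
import torch.nn as nn

# Generator/discriminator widths follow Table 2.3; waveform length of
# 1024 samples, tau, and patience are assumptions of this sketch.
G = nn.Sequential(nn.Linear(100, 128), nn.LeakyReLU(0.2),
                  nn.Linear(128, 512), nn.LeakyReLU(0.2),
                  nn.Linear(512, 1024), nn.Tanh())
D = nn.Sequential(nn.Linear(1024, 512), nn.LeakyReLU(0.2),
                  nn.Linear(512, 64), nn.LeakyReLU(0.2),
                  nn.Linear(64, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()
tau, patience, bad_iters = 5.0, 20, 0

def gan_step(x_real):
    global bad_iters
    n = x_real.size(0)
    z = torch.randn(n, 100)
    # Discriminator: minimizing these BCE terms maximizes Eq. (2.9)
    d_loss = bce(D(x_real), torch.ones(n, 1)) + \
             bce(D(G(z).detach()), torch.zeros(n, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: non-saturating surrogate for minimizing Eq. (2.10)
    g_loss = bce(D(G(z)), torch.ones(n, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    # Collapse monitoring: reinitialize if the losses diverge for too long
    bad_iters = bad_iters + 1 if abs(d_loss.item() - g_loss.item()) > tau else 0
    if bad_iters >= patience:
        for net in (G, D):
            for m in net.modules():
                if isinstance(m, nn.Linear):
                    m.reset_parameters()
        bad_iters = 0
```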
The augmented training data from each GAN feeds into an Inception network specifically adapted for regression tasks, forming the core of our hybrid approach. This adaptation of the Inception network, originally designed for image classification, enables effective processing of AE signals across multiple scales, capturing both local and global features crucial for accurate localization. This trained network is then utilized to predict AE source locations from the test data. The architecture of the Inception network used for regression is illustrated in Figure 15 Huang et al. (2024b). It initiates with an input layer, followed by an Inception module that processes data through multiple parallel pathways. The Inception module is particularly adept at AE signal processing as it can simultaneously extract features at different scales, which is essential given the complex nature of AE waveforms. The outputs are concatenated and passed through batch normalization. A ReLU activation function is then applied, followed by Global Average Pooling to reduce spatial dimensions. Global Average Pooling is employed instead of traditional fully connected layers to minimize the number of parameters, mitigate overfitting, and maintain spatial information. Finally, a dense layer produces the output, predicting the AE source location as continuous coordinates.

Figure 14 Workflow of the hybrid network for AE source localization

Figure 15 Architecture of the Inception network for regression

Table 2.3 Specification for GAN variants

Feature                    | GAN                                               | DCGAN                                    | TSAGAN                                            | WGAN
Generator architecture     | Fully connected layers (128, 512, 1024)           | 1D transposed convolutions, dense layers | Fully connected layers (128, 512, 1024)           | Fully connected layers (128, 512, 1024)
Discriminator architecture | Fully connected layers (1024, 512, 64), LeakyReLU | 1D convolutions, dense layers, LeakyReLU | Fully connected layers (1024, 512, 64), LeakyReLU | 1D convolutions, dense layers, LeakyReLU
Loss function              | Binary cross-entropy                              | Binary cross-entropy                     | Binary cross-entropy                              | Wasserstein loss
Special features           | Model collapse monitoring                         | Model collapse monitoring                | Model collapse monitoring                         | Multiple critic updates, model collapse monitoring
Input (all variants)       | Noise vector
Output layer (all)         | Dense layer with tanh
Optimizer (all)            | Adam (learning rate = 0.0002, beta = 0.001)
Batch size (all)           | 64
Epochs (all)               | 2000

The four GAN architectures (GAN, DCGAN, TSAGAN, and WGAN) share a common foundation but differ in key aspects of their design and training approach, as shown in Table 2.3. The original GAN uses fully connected layers in both the generator and discriminator, with a structure of (128, 512, 1024) neurons for the generator and (1024, 512, 64) for the discriminator. It employs LeakyReLU activations throughout, with batch normalization (momentum 0.8) in both networks. DCGAN introduces convolutional layers, specifically using 1D transposed convolutions in the generator and 1D convolutions in the discriminator, which are particularly effective for capturing spatial or temporal patterns in the data. It typically uses ReLU activations in the generator and LeakyReLU in the discriminator, with a tanh activation in the final generator layer. TSAGAN, tailored for time series data, reverts to a fully connected architecture similar to the original GAN but is optimized for sequential data.
The key innovation of WGAN lies not in its architecture, which is similar to DCGAN with convolutional layers, but in its use of the Wasserstein loss function and weight clipping in the discriminator (now called a critic). This change allows for more stable training and potentially better-quality results. WGAN also typically includes dropout in the discriminator, a feature that is not present in the other architectures. All four models use the Adam optimizer with learning rate 0.0002 and beta 0.001, a batch size of 64, and are trained for 2000 epochs. They all incorporate model collapse monitoring, but WGAN stands out with its multiple critic updates per generator update. These architectural differences make each variant suitable for different types of data and training scenarios, with DCGAN and WGAN often performing well on complex, high-dimensional data. Wasserstein Distance is a powerful metric for comparing probability distributions because it captures how much "effort" it would take to move one distribution's mass to match another's. Unlike traditional measures such as Kullback–Leibler (KL) divergence and Jensen–Shannon (JS) divergence, Wasserstein Distance provides a well-defined gradient even when the two distributions do not overlap. This property is particularly beneficial in training GANs, where the generator aims to align its synthetic distribution with the real data distribution. By offering a continuous and smooth gradient signal, Wasserstein Distance helps mitigate common GAN training issues such as mode collapse and vanishing gradients. Essentially, it reflects the intuitive notion of how similar two datasets truly are by quantifying the minimum "transport cost" to convert one distribution into the other. As a result, it serves as a more stable and interpretable convergence metric, making it well-suited for GAN training and synthetic data quality assessment.

Figure 16 Comparison of Wasserstein Distance convergence across epochs for four GAN variants (GAN, TSAGAN, WGAN, and DCGAN)

Figure 16 shows the Wasserstein Distance between the original and synthetic datasets across different epochs for the four GAN variants, serving as a metric to quantify the similarity between the two distributions; lower values indicate higher similarity. The WGAN shows the most rapid convergence, achieving the lowest Wasserstein Distance of about 2.5 by epoch 100 and maintaining this level throughout training. The DCGAN and TSAGAN demonstrate similar convergence patterns, starting with high distances but steadily decreasing to around 3 by epoch 2000. The standard GAN, interestingly, shows the most volatile behavior, with an initial decrease followed by a spike at epoch 500, before eventually converging to a distance similar to DCGAN and TSAGAN by epoch 2000. This comparison reveals that while all GAN variants eventually achieve similar levels of data similarity, they differ significantly in their convergence paths. The WGAN's rapid and stable convergence suggests it may be the most efficient in generating high-quality synthetic data, despite its underperformance in the final localization task. The standard GAN's volatility indicates a need for careful monitoring during training, although it ultimately achieves competitive results. Overall, the results confirm that with adequate training, the GAN-based augmentation technique significantly enhances the dataset, providing a balanced and high-quality training set for deep learning models.
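The dissertation does not state how the distance in Figure 16 is computed; one lightweight possibility for monitoring real-versus-synthetic similarity is SciPy's one-dimensional wasserstein_distance applied to pooled signal amplitudes, sketched below purely as an illustrative stand-in.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# One simple way to track real-vs-synthetic similarity during GAN training,
# in the spirit of Figure 16. Flattening each waveform set into a single
# 1-D empirical amplitude distribution is an assumption of this sketch.
def dataset_distance(real_waveforms, fake_waveforms):
    return wasserstein_distance(np.ravel(real_waveforms),
                                np.ravel(fake_waveforms))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(100, 1024))   # stand-in AE waveforms
fake = rng.normal(0.2, 1.1, size=(100, 1024))   # stand-in GAN output
print(dataset_distance(real, fake))   # lower values = distributions closer
```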
2.8 Results and Discussion

In this chapter, we demonstrate the effectiveness of domain adaptation and transfer learning in addressing data-imbalance challenges in acoustic emission monitoring. Through the evaluation of multiple deep learning architectures and GAN-based data augmentation methods, we achieved significant improvements in source localization accuracy, with the hybrid GAN-Inception approach reducing the median localization error compared to baseline methods. These findings establish a robust framework for single-sensor AE monitoring systems, while highlighting the critical balance among feature selection, model architecture, and data augmentation strategies for optimal performance in structural health monitoring applications.

Regarding the effect of data imbalance and domain adaptation, Figure 17 shows F1-score performance under varying feature counts (10 to 100) and different imbalance levels in the final damage class (reduction factors of 0.1 to 0.9). DA models were trained for 30 epochs and compared to a baseline CNN without DA (dashed line). Models with 10-40 features show relative robustness against imbalanced data, maintaining higher F1 scores despite increasing reduction factors. Larger feature sets (50-100) exhibit sharper performance drops, indicating potential overfitting rather than improved adaptability.

Figure 17 F1 score for feature counts of 10 to 100 under data imbalance (reduction factor of 0.1 to 0.9 on the last class), domain adaptation versus CNN

The comparison with the baseline CNN is instructive: the CNN without domain adaptation deteriorates significantly under heavy imbalance, while DA models maintain higher F1 scores. Intermediate feature counts of 40 to 60 exhibit non-linear trends, suggesting an optimal range where the number of features aligns well with specific imbalance levels. These findings underscore the importance of calibrating feature dimensionality in DA models to handle class imbalance effectively.

For classifier performance with transfer learning, Figures 18 and 19 compare six classifiers (CNN, MLP, FCN, ResNet, Inception, Encoder), with and without transfer learning, on the Impact and PLB datasets. On the Impact dataset, CNN and MLP achieve accuracy above 0.8, and transfer learning yields small but consistent gains in precision and recall. FCN remains at ∼0.2 accuracy and sees minimal improvement from transfer learning. ResNet benefits the most, showing increased variability but higher scores with transfer learning. Inception sees minor gains, while Encoder performance is largely unchanged except for a slight precision boost.

Figure 18 For the Impact dataset, the distribution of (a) accuracy, (b) precision, and (c) recall from a 10-fold cross-validation for six classifiers on the Impact test dataset. Models without transfer learning are indicated by red bars, while those with transfer learning are shown in blue bars

On the PLB dataset, CNN and MLP again improve modestly with transfer learning. FCN's performance (<0.1 accuracy) degrades further when transfer learning is applied. ResNet shows significant gains, Inception drops slightly under transfer learning, and Encoder benefits marginally in all metrics. For the task of AE source localization with GAN-based data augmentation, Figure 20 compares two AE localization approaches on a 24″ × 24″ × 0.1″ aluminum plate: a hybrid deep learning model combining GAN-based augmentation with an Inception network, and an Inception network alone without augmentation.
The axes show dimensions in inches. In Figure 20(a), square markers (true locations) closely match star markers (predicted locations), indicating minimal deviations and robust performance across the plate. In Figure 20(b), errors grow toward the plate edges, indicating poorer generalization where training examples are fewer.

Figure 20 Comparison of Acoustic Emission (AE) source localization performance. (a) Results from the hybrid deep learning model with GAN-based data augmentation and Inception network. (b) Results from the Inception network alone without GAN-based augmentation

Figure 19 For the PLB dataset, the distribution of (a) accuracy, (b) precision, and (c) recall from a 10-fold cross-validation for six classifiers on the PLB test dataset. Models without transfer learning are indicated by red bars, while those with transfer learning are shown in blue

Table 2.4 Mean and variance of prediction errors for the raw dataset and augmentation approaches

Prediction error   Mean   Variance
Raw data           6.08   0.42
Noise              5.18   0.23
GAN                2.91   0.29
DCGAN              3.43   0.14
TSAGAN             3.64   0.27
WGAN               5.65   0.19

Table 2.4 compares the prediction error mean and variance for the raw dataset and five augmentation strategies: noise-based, GAN, DCGAN, TSAGAN, and WGAN. The raw dataset registers the highest mean error at 6.08 with a variance of 0.42, indicating relatively large and inconsistent deviations in predictions. Introducing simple noise-based augmentation reduces the mean error to 5.18 and lowers the variance to 0.23, reflecting an incremental but limited improvement in prediction accuracy. By contrast, all GAN-based methods yield more substantial benefits. Notably, the standard GAN attains the lowest mean error of 2.91, underscoring its efficacy in generating synthetic data that closely aligns with the real distribution. DCGAN, while not matching the standard GAN's overall mean error, exhibits the smallest variance of 0.14, suggesting greater consistency in predictions across diverse samples. TSAGAN and WGAN show intermediate performance, with TSAGAN recording a mean error of 3.64 (variance 0.27) and WGAN yielding 5.65 (variance 0.19). Although WGAN converges steadily during training, its final prediction accuracy here remains comparatively lower.

Finally, Figure 21 compares prediction errors for two principal scenarios: a baseline approach without GAN-based augmentation versus an enhanced approach incorporating GAN-generated synthetic data.

Figure 21 Comparison of prediction error distributions for the baseline model without GAN augmentation versus the GAN-augmented model

Without GAN augmentation, the distribution of errors exhibits larger variance, indicating higher uncertainty and less accurate overall predictions. The GAN-augmented model shows a significant decrease in the average prediction error and a more compact distribution, reflecting not only lower bias but also reduced variance. These outcomes strongly suggest that the synthetic samples produced by the GAN effectively enrich the training set, enabling the model to learn more discriminative features and improve localization performance. Consequently, the difference between these two scenarios underscores the utility of GAN-based data augmentation in achieving more reliable and precise predictions in the tested environment.
CHAPTER 3
MULTI-FIDELITY SURROGATE MODELING FOR EFFICIENT SIMULATION EXPERIMENT INTEGRATION

3.1 Introduction

Motion-Induced Eddy Current (MIEC) testing has emerged as a valuable nondestructive evaluation (NDE) modality for detecting surface and near-surface defects in high-speed metallic components. While the induction of eddy currents in conductive materials is traditionally associated with alternating magnetic fields, it can also occur through relative motion between a conductor and a magnetic field source. By definition, MIEC arises when a magnetic source moves relative to a conductive material, thereby generating velocity-dependent eddy currents Wang et al. (2020). These induced currents significantly alter the local magnetic field near defects, complicating signal interpretation, particularly at higher inspection speeds Wang et al. (2020); Li et al. (2006); Piao et al. (2020); Park and Park (2004). This phenomenon not only has practical applications in NDT but also holds fundamental theoretical significance in the broader field of electromagnetism. As Figure 22 conceptually illustrates, a sample traveling at velocity $\mathbf{v}$ in an applied field $\mathbf{B}_A$ experiences an electromotive force (emf) per unit charge given by

$$\boldsymbol{\varepsilon} = \mathbf{v} \times \mathbf{B}_A \quad (3.1)$$

The resulting conduction currents create secondary magnetic fields that oppose changes in the local flux, as per Lenz's law, and the induced current density is expressed as

$$\mathbf{J} = \sigma(\mathbf{v} \times \mathbf{B}) \quad (3.2)$$

Initially, MFL signals are governed primarily by the applied field $\mathbf{B}_A$ and the defect geometry. However, as inspection speed increases, the induced currents become stronger, distorting $\mathbf{B}_{MFL}$. In other words, $\mathbf{B}_{MFL}$ is no longer solely a function of the defect's geometry and $\mathbf{B}_A$ but also depends critically on velocity-induced eddy currents. Accurate defect characterization at higher speeds thus necessitates compensating for MIEC effects Li et al. (2006). For instance, Piao et al. (2020) reported that higher speeds and increased wall thickness reduce steel-pipe magnetization, resulting in more pronounced MFL signal distortions. To address such challenges, researchers have pursued compensation schemes or developed innovative data-integration methodologies aimed at improving defect detection under demanding operational conditions Wilson and Tian (2006); Zhang et al. (2019); Antipov and Markov (2018).

Figure 22 Illustration of the leakage field under the presence of velocity

Despite these challenges, MIECT has shown promise in high-speed inspection scenarios. Comparative studies with established methods like MFL have illustrated opportunities for enhanced defect detection, even at increased velocities Zhiye et al. (2005); Li et al. (2009). Numerical simulations employing techniques such as the Finite Element Method (FEM) have played a pivotal role in interpreting these interactions and guiding the development of improved detection strategies Han et al. (2014). These advances have been demonstrated in diverse applications, from pipelines and ferrite metals to high-speed rail systems, where speed-sensitive inspection solutions are urgently needed to maintain safety and operational efficiency Zhao et al. (2023); Bao (2023); Zaini et al. (2021); Li et al. (2022); Liu et al. (2023). In summary, MIEC testing has evolved into a complementary and increasingly indispensable technique in the NDE toolbox. Through a growing body of research, we now better understand how velocity-induced eddy currents shape defect signals.
Ongoing efforts to integrate advanced modeling, simulation methods, and hybrid datasets promise to further refine MIEC-based inspection strategies, ensuring safer, more reliable assessments of critical infrastructure at ever-increasing inspection speeds.

3.2 Motivation and objectives

Accurate defect characterization in high-speed transport systems is essential for safety and efficiency as inspection speeds increase. While field tests provide real-world accuracy, they are often limited by high costs, physical constraints, and safety considerations, resulting in small datasets. Simulations offer broader parameter exploration at lower cost but rely on idealized assumptions that may not fully capture real-world complexities. A significant concern in purely data-driven approaches is the accuracy of approximations under limited training data, particularly for complex physics-based processes. A promising solution lies in combining lower-fidelity (LF) simulation data with higher-fidelity (HF) experimental data. In this context, LF models might employ simplified physics or coarser finite element meshes, trading accuracy for computational efficiency. While these models are less accurate than the HF model, they are significantly cheaper to compute and can effectively sample parameter spaces where expensive experimental measurements are unavailable. This integrated approach leverages the comprehensive coverage of simulations while using experimental data to enhance prediction accuracy. However, the combination presents its own challenges. While LF data are abundant and easily obtainable, their lower accuracy could potentially compromise the model's generalization ability. The key lies in developing methods that effectively balance the quantity advantage of LF data against the accuracy of limited HF experimental data. To address this, we propose a multi-fidelity surrogate modeling framework that combines LF simulation and HF experimental data. This framework maximizes the advantages of both data sources, utilizing the broad parameter coverage of simulations while maintaining experimental realism. The proposed approach yields consistent predictions in both forward and inverse directions across different operating conditions, making it valuable for engineering applications where extensive testing data are difficult to collect.

3.3 Forward and Inverse Problem Formulations

In this section, we define the two core mathematical problems, namely the forward and inverse problems, central to defect characterization under varying operational conditions in Motion-Induced Eddy Current Testing (MIECT). Given the defect geometry and the inspection velocity, the forward problem seeks to predict the differential peak-to-peak amplitude $\Delta V_{pp}$ as the sensor response. Here, we define a function

$$f : \mathbb{R}^3 \to \mathbb{R}, \quad (w, d, v) \mapsto \Delta V_{pp} \quad (3.3)$$

where $w$, $d$, and $v$ represent defect width, defect depth, and velocity. In practice, $f$ could reflect multiphysics coupling involving motion-induced eddy currents, magnetic flux leakage, material nonlinearities, and sensor-specific transfer functions. While forward predictions are crucial for predicting the response signal so that sensor design can be optimized, another important task is to derive defect parameters from measured data. This motivates the inverse problem, in which we infer the defect geometry $(w, d)$ from $\Delta V_{pp}$ under given velocity conditions. Mathematically, we define an inverse mapping

$$g : \mathbb{R}^2 \to \mathbb{R}^2, \quad (\Delta V_{pp}, v) \mapsto (w, d) \quad (3.4)$$
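As a generic illustration of how these two mappings relate (and not the thesis's GPR-based inverse method, which is developed in Section 3.4), a forward surrogate can be inverted numerically by a least-squares search over defect geometry at a known velocity. The sketch below uses illustrative names and a hypothetical forward surrogate f_hat.

```python
# Hedged sketch: inverting a forward surrogate f_hat(w, d, v) -> dVpp by least
# squares; the chapter's actual inverse model is GPR-MFS-FD, not this search.
import numpy as np
from scipy.optimize import minimize

def invert_defect(f_hat, dvpp_measured, v, w0=1.0, d0=1.0):
    """Estimate defect (width, depth) whose predicted response matches dvpp_measured."""
    def residual(p):
        w, d = p
        return (f_hat(w, d, v) - dvpp_measured) ** 2
    result = minimize(residual, x0=np.array([w0, d0]), method="Nelder-Mead")
    return result.x  # estimated (w, d)
```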
In this study, surrogate models are utilized for both the forward and inverse problems to improve surface defect characterization in high-speed MIECT. The main objective is to develop a multi-fidelity surrogate framework that integrates high-fidelity experimental data from a rotational disc setup with low-fidelity finite element simulations to improve prediction.

3.4 Multi-Fidelity Surrogate Framework

3.4.1 Overview of Multi-Fidelity Concepts

High-fidelity (HF) models and data sources capture a broad range of real-world complexities, including multi-physics interactions, complex boundary conditions, and actual operating conditions Fernández-Godino et al. (2019). While HF models typically provide higher accuracy, they are expensive and often limited in number; obtaining high-fidelity data is challenging, resulting in small datasets. In contrast, low-fidelity (LF) data, typically from simulations, are abundant but less accurate due to simplifications such as dimensionality reduction, linearization, simplified physics models, coarser computational domains, and partially converged results, as shown in Figure 23. These simplifications make LF data easier and cheaper to obtain, though at the cost of accuracy.

Figure 23 Connection between HFM and LFM

Multi-fidelity models (MFMs) aim to bridge the gap between rapid computation and high accuracy Fernández-Godino et al. (2016). By introducing low-fidelity data into the training database, the issue of insufficient training samples can be addressed. However, the low accuracy of LF data may compromise the model's accuracy and generalization ability. The classification of data or models as HF or LF is determined by their ability to capture the underlying physical process rather than by their source (experimental tests, analytical models, or numerical simulations), and can only be determined relative to one another. A surrogate's accuracy depends on function complexity, experimental design, domain size, simulation accuracy, and sample availability. Field tests with properly calibrated sensors and rigorous testing protocols provide highly reliable information, achieving an authenticity that simulation models strive to replicate. While fidelity is not inherently tied to whether data come from experiments or simulations, treating experimental data as high fidelity and simulation data as low fidelity remains practically advantageous. In the context of MIECT data analysis, this framework provides a consistent approach for integrating simulation and experimental sources efficiently and effectively.

3.4.2 Radial Basis Function-based Multi-Fidelity Scaling

The Radial Basis Function-based Multi-Fidelity Scaling (RBF-MFS) approach is a powerful method for combining low-fidelity (LF) and high-fidelity (HF) data to construct an accurate surrogate model. It leverages the computational efficiency of LF simulations to explore a broad parameter space and then refines the approximation using sparse HF data. In this approach, a set of LF simulations is performed over a parameter grid. These simulations are computationally efficient, allowing extensive sampling of the parameter space, and each yields an approximate system response, capturing a wide range of defect scenarios and operating conditions. A baseline approximation $\hat{f}_{LF}$ is first constructed from the LF data using Radial Basis Functions (RBFs).
The LF approximation can be expressed as

$$\hat{f}_{LF}(x) = \sum_{i=1}^{N_{LF}} \lambda_i \, \phi\left(\left| x - x_i^{LF} \right|\right) \quad (3.5)$$

where the $\lambda_i$ are coefficients determined from the LF data, $\phi(\cdot)$ is the chosen RBF kernel, and the $x_i^{LF}$ are the LF training points. Since the LF model is approximate, a discrepancy function $d(x)$ is introduced to correct the differences between the LF predictions and the sparse HF data. This function is also modeled using RBFs. The final high-fidelity prediction $\hat{f}_{HF}(x)$ is obtained by combining the LF model and the discrepancy function:

$$\hat{f}_{HF}(x) = \hat{f}_{LF}(x) + d(x) \quad (3.6)$$

RBF-MFS is ideal for situations where HF data are expensive to obtain, a large parameter space must be explored, and the system response is complex or nonlinear. In summary, RBF-MFS is a powerful tool for multi-fidelity modeling: it combines the efficiency of LF simulations with the accuracy of HF data, using RBFs to create a refined surrogate model. This approach is widely applicable in engineering, physics, and other fields where computational cost and accuracy are both critical.

3.4.3 Gaussian Process Regression with Multi-Fidelity Scaling and Feature Discretization

The proposed GPR-MFS-FD method Huang et al. (2025) leverages Gaussian Process Regression (GPR) with multi-fidelity data integration. Given the superior performance of the RBF-MFS model in the forward problem, we expect GPR with an RBF kernel to be similarly beneficial in the inverse problem. The RBF kernel, a key component of GPR, assesses the similarity between inputs from simulations and field tests while smoothly handling the discretized feature input. The core of the proposed method is the Gaussian Process (GP) model, characterized by a mean function $m(x)$ and a covariance function $k(x, x')$. Integral to the multi-fidelity model is a scaling function $\alpha(x)$, designed to project LF data onto the HF model and thereby enhance the accuracy of high-fidelity predictions:

$$k_{MF}\left((x, f_i), (x', f_j)\right) = \alpha(x)\, k(x, x')\, \alpha(x') \quad (3.7)$$

where $f_i, f_j \in \{f^L, f^H\}$ denote the fidelity levels. Velocity features are modified by computing the mean of each velocity range, transforming continuous velocity inputs into discretized features:

$$\bar{v} = \frac{v_{min} + v_{max}}{2} \quad (3.8)$$

where $v_{min}$ and $v_{max}$ are the minimum and maximum velocities in the range, respectively. This modification captures the influence of velocity variations more effectively. With the model fine-tuned, predictions for an unknown defect size are derived by calculating the posterior predictive distribution for a new input $x_*$:

$$y_* \mid x_*, D \sim \mathcal{N}\left( k_* (K_{MF} + \sigma_n^2 I)^{-1} y_H,\; k(x_*, x_*) - k_* (K_{MF} + \sigma_n^2 I)^{-1} k_*^{T} \right) \quad (3.9)$$

The predicted output $y_*$, given $x_*$ and the data $D$, follows a normal distribution $\mathcal{N}$. The mean is computed by multiplying the covariance vector $k_*$ with the inverse of the modified kernel matrix $K_{MF} + \sigma_n^2 I$, where $K_{MF}$ contains the covariances among all pairs of training inputs adjusted for their fidelity levels; this product is then multiplied by the high-fidelity output data $y_H$. The variance, $k(x_*, x_*) - k_* (K_{MF} + \sigma_n^2 I)^{-1} k_*^{T}$, estimates the prediction uncertainty at $x_*$: the model's prior uncertainty $k(x_*, x_*)$ is reduced by a term that encodes the information in the HF training data. Here $k_*(\cdot)$ denotes the covariance vector between the test inputs and the training inputs.
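To make the discrepancy-correction idea of Eqs. (3.5)-(3.6) concrete, the sketch below assembles an RBF-MFS-style surrogate with scipy's RBFInterpolator; the kernel choice, epsilon value, and variable names are illustrative assumptions rather than the thesis's exact configuration.

```python
# A minimal sketch of the RBF-MFS correction of Eqs. (3.5)-(3.6) using scipy;
# kernel and epsilon are illustrative, not the thesis's tuned values.
import numpy as np
from scipy.interpolate import RBFInterpolator

def fit_rbf_mfs(X_lf, y_lf, X_hf, y_hf, kernel="multiquadric", epsilon=1.0):
    """Return a callable multi-fidelity surrogate f_hat_HF(X)."""
    # Step 1: baseline LF approximation f_hat_LF built from abundant simulations
    f_lf = RBFInterpolator(X_lf, y_lf, kernel=kernel, epsilon=epsilon)
    # Step 2: discrepancy d(x) fitted to HF residuals at the sparse HF points
    d = RBFInterpolator(X_hf, y_hf - f_lf(X_hf), kernel=kernel, epsilon=epsilon)
    # Step 3: corrected prediction f_hat_HF(x) = f_hat_LF(x) + d(x)
    return lambda X: f_lf(X) + d(X)

# Usage: columns of X are (width, depth, velocity); y is dVpp
# surrogate = fit_rbf_mfs(X_sim, y_sim, X_exp, y_exp)
# y_pred = surrogate(X_test)
```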
We denote by $n$ the number of data points in the low-fidelity (LF) dataset and by $m$ the size of the high-fidelity (HF) dataset used in multi-fidelity (MF) approaches.

Table 3.1 Computational complexity comparison

Single-fidelity models (forward problem):
- PRS (Polynomial Response Surface) Myers et al. (2016): training O(n³), prediction O(n)
- RBF-MQ (Radial Basis Function with Multiquadric kernel) Hardy (1971): training O(n³), prediction O(n)
- KRG (Kriging) Stein (1999): training O(n³), prediction O(n)

Multi-fidelity models (forward problem):
- CoRBF (Composite Radial Basis Function) Park and Sandberg (1991): training O(m³ + n³), prediction O(n)
- LR-MFS (Linear Regression-based Multi-Fidelity Surrogate) Kennedy and O'Hagan (2000): training O(n³), prediction O(n)
- CoKRG (Co-Kriging) Forrester et al. (2007): training O(m³ + n³), prediction O(n)
- RBF-MFS (Radial Basis Function-based Multi-Fidelity Surrogate) Kumar et al. (2018): training O(m³ + n³), prediction O(n²)

Multi-fidelity models (inverse problem):
- MF-DGP-EM (Multi-Fidelity Deep Gaussian Process with Embedded Mapping): training > O(n³), prediction O(n²) to O(n³)
- GPR-MFS-FD (Gaussian Process Regression with Multi-Fidelity Surrogate and Feature Discretization): training O(m³ + n³), prediction O(n²)

For single-fidelity models such as PRS, RBF-MQ, and KRG, training involves solving an $n \times n$ system, where $n$ denotes the number of training points; this leads to O(n³) training complexity. Once trained, predictions can be made more efficiently, on the order of O(n) per prediction, owing to simpler operations such as evaluating kernels or polynomials. In multi-fidelity approaches, the total cost is split between the LF and HF datasets, since each fidelity level requires building its own covariance or kernel matrix. Models such as CoKRG, CoRBF, and RBF-MFS handle two datasets simultaneously, leading to O(m³ + n³) complexity. Typically, the HF dataset size $m$ is much smaller (due to cost or physical constraints), while the LF dataset size $n$ is much larger. Each method's complexity dictates practical constraints on dataset size and modeling fidelity. Single-fidelity methods are straightforward but can underutilize HF information or broad LF coverage. Multi-fidelity methods, though more computationally expensive at O(m³ + n³), offer superior predictive performance, which is critical for accurate forward and inverse analyses in Motion-Induced Eddy Current Testing (MIECT).

3.6 Results and Discussion

3.6.1 Forward Problem

The proposed framework demonstrates state-of-the-art performance in both the forward and inverse problem formulations. In forward modeling, RBF-MFS achieves consistently lower Mean Squared Error (MSE) values than single-fidelity models such as PRS, RBF-MQ, and KRG. It also outperforms more sophisticated multi-fidelity models such as Co-Kriging and CoRBF. The error distributions and violin plots illustrate that RBF-MFS maintains remarkably stable and accurate predictions even as test set sizes increase, ensuring minimal variability and robust performance in challenging conditions. Co-Kriging outperforms Kriging, as shown in Figure 24(a): at a 50% test size, Kriging's MSE reaches 0.0007 while Co-Kriging maintains 0.00033. Co-Kriging achieves both lower median error and a tighter error distribution across test sizes, demonstrating effective use of LF data when HF data are limited.
In Figure 24(b), CoRBF and RBF-MFS maintain MSE below 2.5 × 10⁻⁵ across all test sizes. RBF-MFS delivers the lowest median errors with minimal variance. PRS performs competitively at select test sizes despite a higher average MSE. Figure 24(c) confirms the superior performance of RBF-MFS.

Figure 24 MSE as a function of test size for (a) Co-Kriging versus Kriging, (b) comparisons among Composite RBF, the Radial Basis Function-based Multi-Fidelity Surrogate, and Polynomial Response Surfaces, and (c) the Radial Basis Function-based Multi-Fidelity Surrogate versus RBF with Multiquadric kernel

3.6.2 Inverse Problem

For the inverse problem, GPR-MFS-FD clearly surpasses the MF-DGP-EM model, delivering more accurate defect size estimations across a range of velocities.

Figure 25 Box plot of MSE for GPR-MFS-FD and MF-DGP-EM

3.6.3 Observations and Limitations

This performance gap can be attributed to the integration of the feature discretization technique with multi-fidelity scaling, which transforms continuous velocity ranges into a unified representation and enhances the capture of velocity-defect relationships. The multi-fidelity scaling function effectively projects low-fidelity data onto the high-fidelity model, improving prediction accuracy. Moreover, the posterior predictive distribution for unknown defect sizes yields predictions with reduced uncertainty by incorporating high-fidelity training data. In contrast, MF-DGP-EM relies on a hierarchical arrangement of Gaussian processes and employs an embedded mapping function represented by a neural network to align data from different fidelities. While this approach aims to combine deep learning and Gaussian processes, it may struggle to effectively capture the complex interactions between velocity variations and defect characteristics, potentially introducing additional complexity and sources of error. Furthermore, MF-DGP-EM's optimization process involves maximizing the marginal likelihood of the observed data through gradient-based techniques, which may be more susceptible to local minima and require careful hyperparameter tuning. In comparison, we employ a more straightforward optimization approach based on the conjugate gradient method, which may contribute to the superior performance. In summary, the performance gap can be attributed to the feature discretization technique, which enables a more effective representation of velocity variations and their influence on defect detection. The discretization of the velocity feature, combined with a streamlined optimization process, contributes to superior accuracy and robustness across different velocity ranges. This highlights the importance of developing targeted methodological optimizations that address the specific challenges posed by MIEC data, as demonstrated by the success of the proposed approach.

3.7 Conclusion

In this chapter, we demonstrated how merging low-fidelity (LF) simulation data with high-fidelity (HF) experimental data can effectively address both forward and inverse problems in Motion-Induced Eddy Current Testing (MIECT).
By leveraging physics-based finite element simulations for broad parametric coverage and complementing them with targeted laboratory measurements, we increased predictive accuracy while controlling costs. Specifically, the synergy of discrepancy-corrected models and multi-fidelity scaling produced clear improvements in capturing real-world defect signals, especially under varying velocities. Feature discretization further mitigated operational uncertainties (e.g., inconsistent inspection speeds), ensuring robust performance for both forward (signal prediction) and inverse (defect estimation) tasks. This multi-fidelity approach thus forms the central theme of the chapter: effectively combining abundant, less accurate data with sparse, more reliable measurements to enhance modeling fidelity and reliability. The proposed methodology holds broad applicability beyond its immediate use in surface defect detection for high-speed rail inspections. The principles of multi-fidelity data integration, feature discretization, and surrogate modeling can be adapted to a wide range of nondestructive evaluation (NDE) and structural health monitoring (SHM) scenarios, such as corrosion detection in pipelines, crack characterization in aerospace components, and fatigue analysis in civil structures. Any domain where engineering simulations and physical measurements coexist can benefit from the cost-effective accuracy gains offered by multi-fidelity frameworks.

CHAPTER 4
COMPENSATION IN PULSED EDDY CURRENT TESTING VIA SURROGATE MODELING

4.1 Introduction

Pulsed Eddy Current (PEC) testing is a key non-destructive evaluation (NDE) technique that employs an excitation coil to generate a time-varying magnetic field, which induces eddy currents within an electrically conductive material Sophian et al. (2017). By analyzing the response signal, PEC enables accurate characterization of material properties such as wall thickness, electrical conductivity, and magnetic permeability Sun et al. (2021); Liu et al. (2022); Majidnia et al. (2014), providing insights for industries ranging from oil and gas to power generation Bieri et al. (2005); Fu et al. (2021). Its high penetration depth and non-contact nature make it uniquely suited for detecting subsurface corrosion, wall thinning, and cracks without requiring direct contact Wang et al. (2021); Chen and Liu (2021), particularly when dealing with coated or insulated components Rifai et al. (2016); Zhang et al. (2017). However, several challenges limit PEC's accuracy. Liftoff variation can reduce sensitivity and increase measurement uncertainty Wang et al. (2021); Rao et al. (2017). Likewise, coatings and insulation layers alter the electromagnetic environment, complicating the interpretation of PEC signals and potentially masking underlying material degradation Chen and Liu (2021). Beyond these geometric and material complexities, electromagnetic interference from external magnetic fields significantly challenges PEC by distorting signals and reducing defect detection reliability Cortês et al. (2023). In particular, high-voltage feeder lines inside pipes generate strong, time-varying magnetic fields that produce spatially varying permeability, violating the assumption of uniform magnetic properties Chen and Lei (2015) and thus compromising data interpretation methods Li et al. (2021). To overcome these issues, researchers have proposed methods such as magnetic field calibration systems Janošek et al.
(2019), adaptive interference suppression algorithms Ponikvar et al. (2023), and enhanced electromagnetic field modeling Sereda and Korol (2022); Zhang et al. (2017) to reduce the interference. Some approaches reduced error standard deviations from thousands of nT to less than 20 nT Zhang et al. (2017). This progress, coupled with improvements in probe design Wang et al. (2021); Rao et al. (2017); Shu et al. (2007); Rifai et al. (2016) and advanced simulation tools, has significantly enhanced PEC's robustness. The primary objective of this chapter is to present a new compensation strategy that accounts for the spatially varying magnetic permeability induced by internal power lines in high-voltage feeder pipes. By explicitly modeling the position-dependent magnetic fields, this approach seeks to correct signal distortions and restore accurate wall-thickness estimates. To accomplish this, we integrate two complementary numerical models with surrogate modeling. These tools enable rapid, reliable estimates of how variations in magnetic permeability affect the transient PEC signal. In turn, they form the basis of an efficient, physics-based correction scheme that significantly enhances the reliability of PEC measurements in challenging high-voltage environments.

4.2 Challenges

High-voltage feeder lines enclosed within pipes pose significant challenges for PEC inspections due to the strong electromagnetic fields generated by the internal cables and the nonlinear magnetic behavior of ferromagnetic materials. These effects can produce spatially varying permeability in the pipe wall, causing inconsistent PEC signals and leading to inaccurate wall-thickness estimates or false-positive flaw detections. The three figures presented here illustrate the key hurdles in inspecting such pipes and underscore why careful compensation and advanced modeling are critical. In Figure 26, a schematic cross-section highlights the basic configuration of a feeder pipe with an offset internal cable and an external PEC sensor. The cable carries high currents (on the order of 10 kA at 60 Hz), generating a strong magnetic field that interacts with the ferromagnetic pipe. When the cable is not centered, the resulting magnetic field distribution is asymmetrical around the circumference. This asymmetry, compounded by the pipe's ferromagnetic properties, leads to locally different levels of magnetic flux density and, consequently, different relative permeability at different angular positions of the pipe. Because conventional PEC algorithms generally assume constant permeability, the varying field intensities at different angular positions introduce systematic errors into the thickness measurements.

Figure 26 Cross-sectional schematic of a ferromagnetic pipe with an internal offset cable and an external PEC sensor

The representative B-H curve and the corresponding relative magnetic permeability $\mu_r$ for ferromagnetic carbon steel are shown in Figure 27. The B-H curve illustrates the nonlinear relationship between the applied magnetic field intensity $H$ and the resulting flux density $B$. At larger field strengths, the material approaches magnetic saturation, causing $\mu_r$ to drop steeply. In the context of high-voltage feeder lines, certain pipe regions near the cable may experience magnetic fields well into the saturation zone, dramatically reducing $\mu_r$. This spatial variation in $\mu_r$ complicates signal interpretation because the PEC decay rate no longer correlates with wall thickness in a straightforward manner.
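To illustrate this nonlinearity numerically, the short sketch below evaluates the relation later formalized in Eq. (4.3), mu_r = B/(mu_0 H), on a digitized B-H curve. The sample points are invented for demonstration only and do not correspond to the curve in Figure 27.

```python
# Illustrative sketch: relative permeability mu_r(H) = B(H) / (mu_0 * H) from a
# digitized B-H curve. The sample points below are made up for demonstration.
import numpy as np

MU_0 = 4e-7 * np.pi  # vacuum permeability (H/m)

# Hypothetical digitized B-H samples for a carbon steel (H in A/m, B in T)
H = np.array([50.0, 100.0, 500.0, 1e3, 5e3, 1e4, 5e4, 1e5])
B = np.array([0.10, 0.25, 1.00, 1.30, 1.60, 1.70, 1.90, 2.00])

mu_r = B / (MU_0 * H)  # drops steeply as the material approaches saturation
for h, m in zip(H, mu_r):
    print(f"H = {h:>8.0f} A/m  ->  mu_r ~ {m:8.1f}")
```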
The impact is evident in Figure 28, which plots PEC voltage decay curves at two opposite circumferential locations (0° and 180°) for the same nominal wall thickness (100%). Despite no actual thickness difference, the curve measured at 180° decays more rapidly than the one at 0° (black line). This discrepancy arises mainly from the stronger magnetic field and partial saturation at the angular position closer to the cable. In practice, an inspector might mistakenly interpret the faster decay as thinner material or increased corrosion. Without compensating for the local magnetic permeability changes, such signals can lead to false-positive flaw indications or significantly overestimated metal loss. In summary, we show that existing PEC data analysis is insufficient for high-voltage feeder applications. The offset internal cable and large current intensities alter the magnetic conditions inside the pipe wall, resulting in spatially varying permeability and distorted PEC signals. Overcoming these effects requires robust compensation strategies, incorporating detailed multi-physics modeling, spatially resolved permeability estimates, and calibration protocols, to ensure accurate, reliable wall-thickness measurements.

Figure 27 Representative B-H curve and corresponding relative permeability illustrating the nonlinear magnetic response of carbon steel

Figure 28 Comparison of simulated voltage decay curves at two opposite angular locations (0° and 180°) showing signal discrepancies due to local magnetic saturation effects

4.3 Analytical Model

We start with a simplified analytical model correlating PEC decay signatures with material electromagnetic characteristics. The transient response of the PEC signal, described by its dominant time constant $\tau$, is fundamentally related to the wall thickness $d$, relative magnetic permeability $\mu_r$, and electrical conductivity $\sigma$. For a uniformly magnetized, defect-free region, the dominant time constant obeys the proportionality $\tau \propto \mu_r \sigma d^2$. However, in the presence of internal conductors and non-uniform fields, $\mu_r$ becomes spatially dependent, as shown in Figure 29.

Figure 29 Cross-section view of the pipe with internal cable

If we assume a constant permeability $\hat{\mu}_r$, the discrepancy between the estimated time constant $\hat{\tau}$ and the actual $\tau(\theta)$ produces an erroneous estimation. Incorporating liftoff into our mathematical model, similar to the models proposed in Ulapane et al. (2018); Nafiah et al. (2020), the induced voltage $V(t, l)$ in the receiver coil can be expressed as

$$V(t, l) = b_1 e^{-\frac{\pi^2 t}{\mu_r(\theta) \sigma d^2}} e^{-kl} + \sum_{i=2}^{\infty} b_i e^{-c_i t} \quad (4.1)$$

where $b_1$ and the $b_i$ are coefficients dependent on system parameters related to the sensor configuration, the $c_i$ are higher-order time constants, and $k$ is a constant that accounts for the exponential attenuation due to liftoff. The state-of-the-art method estimates thickness from the time derivative of the log-voltage under the assumed constant $\hat{\mu}_r$, therefore causing the prediction error

$$\text{Prediction error}(\theta) = \left| \frac{\pi}{\sqrt{\sigma \, \frac{d}{dt} \ln[V(t, l)]}} \left( \sqrt{\frac{1}{\mu_r(\theta)}} - \sqrt{\frac{1}{\hat{\mu}_r}} \right) \right| \quad (4.2)$$

While this discrepancy is computed under the assumption of constant wall thickness, it becomes inadequate when wall thinning occurs, because the wall thickness then becomes a function of the angular position $\theta$, i.e., $d = d(\theta)$, leading to an interdependency between $\mu_r(\theta)$ and $d(\theta)$.
The relative permeability then depends on both $\theta$ and $d(\theta)$ and is expressed as

$$\mu_r(\theta, d(\theta)) = \frac{f_{BH}\left(H(\theta, d(\theta))\right)}{\mu_0 \, H(\theta, d(\theta))} \quad (4.3)$$

where $f_{BH}$ represents the B-H curve function. Determining $\mu_r(\theta, d(\theta))$ analytically is complex due to its dependence on the pipe geometry and wall thickness variations.

4.4 Compensation Method via Surrogate Model

The proposed compensation algorithm combines multi-physics simulations with data-driven modeling, as shown in Figure 30. The Cable-Pipe Model simulates the magnetic flux distribution, generating a permeability map based on the cable layout. The PEC-Pipe Model then uses these permeability distributions to simulate the transient PEC response, revealing how spatial variations affect signal decay rates and thickness measurements.

Figure 30 Numerical modeling of the PEC probe setup on wax-coated pipes: (a) Cable-Pipe Model, (b) PEC-Pipe Model

The process begins by modeling the effects of the internal power lines on the pipe's magnetic permeability, then uses these permeability distributions within a pulsed eddy current (PEC) simulation to predict the voltage decay signal. Finally, surrogate models and a calibration procedure are applied to correct real-world PEC measurements in the presence of spatially varying magnetic properties. To reduce the computational complexity associated with varying pipe parameters, which is especially crucial for real-time applications in field inspections, we have developed surrogate models that approximate the FEM simulation results and allow rapid predictions of decay time constants without extensive computation. The surrogate models are constructed using Gaussian Process Regression (GPR), a non-parametric Bayesian technique well suited to modeling complex, nonlinear relationships with quantified uncertainties. In GPR, the underlying assumption is that the outputs can be represented as realizations of a Gaussian process governed by a mean function and a covariance function (kernel). Specifically, GPR models the relationship between the inputs $\mathbf{X}$ and outputs $\mathbf{y}$ as

$$\mathbf{y} = f(\mathbf{X}) + \epsilon \quad (4.4)$$

where $f(\mathbf{X})$ is an unknown latent function sampled from a Gaussian process, and $\epsilon \sim \mathcal{N}(0, \sigma_n^2 I)$ represents independent and identically distributed Gaussian noise with variance $\sigma_n^2$. The latent function $f(\mathbf{X})$ is characterized by a Gaussian process:

$$f(\mathbf{X}) \sim \mathcal{GP}\left(m(\mathbf{X}), k(\mathbf{X}, \mathbf{X}')\right) \quad (4.5)$$

where $m(\mathbf{X})$ is the mean function, often set to zero without loss of generality, and $k(\mathbf{X}, \mathbf{X}')$ is the covariance function, or kernel, that encodes the relationship between data points. For the surrogate models, the Radial Basis Function (RBF) kernel, also known as the Gaussian kernel, is chosen for its smoothness and infinite differentiability, which are suitable for modeling the underlying physics of the problem. The RBF kernel is defined as

$$k(x_i, x_j) = \sigma_f^2 \exp\left(-\frac{1}{2}(x_i - x_j)^T \Lambda^{-1} (x_i - x_j)\right) \quad (4.6)$$

where $\sigma_f^2$ is the signal variance, controlling the vertical scale of the function variations, and $\Lambda = \mathrm{diag}(l_1^2, l_2^2, \ldots, l_p^2)$ is a diagonal matrix of squared length-scale parameters. The input vectors $x_i, x_j$ correspond to different observations. The RBF kernel measures the similarity between input points, with larger values indicating more closely related inputs that lead to similar outputs. The surrogate models are trained using datasets generated from FEM simulations. The training process involves optimizing the hyperparameters $\sigma_f^2$, $\Lambda$, and $\sigma_n^2$ of the GPR model.
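As a rough illustration of this training pipeline (not the thesis code), the sketch below fits such a GPR surrogate with an ARD RBF kernel in scikit-learn, which optimizes the hyperparameters internally by maximizing the log-marginal likelihood discussed next, and computes the least-squares scaling factor for the tau-alignment step formalized in Eq. (4.8) below. Function and variable names are illustrative assumptions.

```python
# Hedged sketch: a GPR surrogate for decay time constants with an ARD RBF
# kernel matching Eq. (4.6); scikit-learn maximizes the log-marginal
# likelihood of Eq. (4.7) internally when fit() is called.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

def fit_tau_surrogate(X_fem: np.ndarray, tau_fem: np.ndarray):
    """X_fem: FEM inputs (e.g. angle, wall thickness); tau_fem: decay constants."""
    kernel = (ConstantKernel(1.0)                          # sigma_f^2
              * RBF(length_scale=np.ones(X_fem.shape[1]))  # per-dimension Lambda
              + WhiteKernel(1e-3))                         # sigma_n^2
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    return gpr.fit(X_fem, tau_fem)

def tau_alignment_scale(tau_field: np.ndarray, tau_sim: np.ndarray) -> float:
    """Closed-form least-squares alpha minimizing the RMSE of Eq. (4.8)."""
    return float(tau_field @ tau_sim / (tau_sim @ tau_sim))
```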
Hyperparameter optimization is typically performed by maximizing the log-marginal likelihood function, which balances data fit and model complexity:

$$\log p(\mathbf{y} \mid \mathbf{X}, \theta) = -\frac{1}{2}\mathbf{y}^{\top}(K + \sigma_n^2 I)^{-1}\mathbf{y} - \frac{1}{2}\log\left|K + \sigma_n^2 I\right| - \frac{n}{2}\log 2\pi \quad (4.7)$$

where $K$ is the kernel matrix computed with the RBF kernel over all pairs of training inputs, $n$ is the number of training data points, and $\theta$ collects all the hyperparameters. In addition, a $\tau$-alignment step uses calibration points where the field-measured decay times $\tau_{\text{field},i}$ are known: the predicted decay time constants $\tau_{\text{sim},i}$ are scaled by a factor $\alpha$ chosen to minimize

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\tau_{\text{field},i} - \alpha\, \tau_{\text{sim},i}\right)^2} \quad (4.8)$$

By optimizing the hyperparameters and incorporating this $\tau$ alignment, the GPR models become finely tuned to the underlying patterns in the FEM simulation and field test data, enabling accurate and efficient predictions during real-time inspections.

4.5 Field Tests and Results

4.5.1 Inspection Setup

In this section, we present the results of field tests conducted to validate the effectiveness of the proposed auto-compensation algorithm in real-world PEC inspections of high-voltage feeder pipes. We analyze the performance of the algorithm across multiple pipe segments, compare compensated and uncompensated measurements, and discuss the practical implications of our findings. Figure 31 illustrates the deployment of the PEC inspection system in an underground tunnel environment, characterized by space constraints and complex arrangements of high-voltage feeder pipes. To thoroughly assess the pipe surfaces, we employed a dual scanning strategy. First, we conducted scans at 45° intervals around the pipe's circumference to accurately capture wall thickness and account for magnetic permeability variations due to the internal cables. Next, we employed an axial scanning method, during which sensors traversed the entire length of the pipe, systematically covering the axial direction to map wall thickness variations longitudinally.

Figure 31 (a) Field deployment of the PEC inspection system on high-voltage feeder pipes in an underground tunnel. (b) Schematic representation of circumferential scan positions and corresponding axial scan lines

Calibration of the PEC instrument was critical for ensuring accuracy. We initiated the process by attaching the encoder and preparing the system for scanning. An initial point scan targeted areas with near 100% nominal wall thickness and minimal insulation, serving as a baseline. We then identified an optimal reference point with a higher signal amplitude and slower decay rate, corresponding to the nominal (100%) wall thickness. Calibrating at both the beginning and end of each inspection cycle ensured probe stability and compensated for any system drift, thereby enhancing measurement reliability. Table 4.1 summarizes the specifications of the tested pipe segments, including six horizontal and two vertical carbon steel pipes from different feeder lines. Notably, segments PS17-19 through PS30-14 from Feeder Line 34051 had ultrasonic testing (UT) validation data available. The removal of protective coatings allowed UT measurements, which confirmed consistent wall thickness and indicated that any anomalies detected in the uncompensated PEC data were false positives caused by electromagnetic interference rather than actual material loss.
All pipes shared similar material properties and were inspected using the PEC-025-G2 probe under consistent configurations, although scan modes varied between dynamic and grid mapping. The uniform dimensions (outer diameters and wall thicknesses) facilitated comparative analysis. Data quality metrics showed low warning percentages (1.52% to 3.93%) for the F34051 segments. The frequent false-positive wall loss indications in the uncompensated PEC data underscored the necessity of our compensation algorithm.

Table 4.1 Pipe segment and inspection specifications

Pipe ID     Type/Feeder         OD / WT / Coating (mm)   Length / Circ. (mm)   No. lines   Validation
PS17-19     Horizontal, F34051  219.1 / 6.35 / 6.35      7315.2 / 728.1        8           UT verified
PS23-25     Horizontal, F34051  219.1 / 6.35 / 6.35      7315.2 / 728.1        8           UT verified
PS29-30     Horizontal, F34051  219.1 / 6.35 / 6.35      7315.2 / 728.1        8           UT verified
PS30-14     Horizontal, F34051  219.1 / 6.35 / 6.35      7315.2 / 728.1        8           UT verified
PS93-92     Horizontal, F63     273.05 / 6.35 / 6.35     7315.2 / 897.7        16          N/A
PS100-99    Horizontal, F63     273.05 / 6.35 / 6.35     7315.2 / 897.7        16          N/A
Vertical-4  Vertical            273.05 / 6.35 / none     6096 / 897.7          4           N/A
Vertical-8  Vertical            273.05 / 6.35 / none     6096 / 897.7          8           N/A

While UT validation was possible for accessible areas, some uncertainty remains for non-validated sections due to inaccessibility. Nevertheless, the compensation algorithm demonstrated improved accuracy across all tested segments, reducing false positives and enhancing the reliability of PEC inspections.

4.5.2 Comparison of Compensated vs. Uncompensated PEC Measurements

Figure 32 presents the probability density distributions of wall thickness measurements for four segments, comparing compensated and uncompensated PEC data. The compensated measurements exhibit higher means and significantly lower standard deviations, indicating enhanced accuracy and precision. For the PS17-19 segment, the mean increased from 96.05% to 98.95%, while the standard deviation was reduced from 7.12% to 1.45%. Similarly, PS23-25 measurements saw an increase in mean from 98.68% to 99.39% with a corresponding reduction in variability. In the PS29-30 segment, the mean rose from 91.43% to 100.33%. Lastly, the PS30-14 segment showed an increase in mean from 89.71% to 100.53%. These results demonstrate that the compensation algorithm effectively corrects underestimations caused by magnetic interference, aligning the measurements closely with the nominal wall thickness of 100%. The reduced standard deviations reflect improved measurement consistency. While some uncertainty persists in non-validated areas, the overall enhancement in data quality underscores the practical value of the compensation algorithm in field applications.

Figure 32 Probability density distributions of wall thickness measurements comparing compensated (blue) and uncompensated (red) PEC inspections for four pipe segments: (a) PS17-19, (b) PS23-25, (c) PS29-30, and (d) PS30-14

Figures 33 and 34 provide polar visualizations of wall thickness distributions for pipe segments PS17-19 and PS30-14, respectively. Each figure divides the pipe circumference into eight segments at 45° intervals, with concentric rings representing wall thickness percentages from 20% to 120% of the nominal value.
Before compensation, both figures show significant underestimations in specific angular sectors (e.g., 270°-315° for PS17-19 and 270°-0° for PS30-14), with measurements indicating up to 20% false material loss. These discrepancies are attributed to magnetic field interference from the internal power cables. After compensation, the measurements uniformly range from 95% to 105% of the nominal thickness in all segments. This uniformity confirms the algorithm's effectiveness in correcting spatial measurement errors and aligns with the UT validations of consistent wall thickness. The correction of the double-dip pattern in PS30-14 suggests that the algorithm successfully addresses complex interference patterns, potentially caused by the configuration of the internal three-phase power cables. These visualizations reinforce the practical applicability of the compensation algorithm in enhancing inspection accuracy.

Figure 33 PS17-19: polar visualization of the wall thickness distribution before and after compensation

Figure 34 PS30-14: comparative polar plots of the wall thickness distribution before and after compensation

Results for Vertical-8, with eight 45° intervals on a polar grid, are shown in Figure 35. Reference circles at 0%, 70%, and 100% nominal thickness are shown, with compressed scaling below 70% to enhance visualization. Before compensation, significant measurement variability is evident, particularly at 270° and 315°. After compensation, median values align near 100% nominal thickness with reduced interquartile ranges, indicating enhanced accuracy and consistency.

Figure 35 Vertical-8: radial box plots displaying the wall thickness distribution before and after compensation

Figure 36 displays radial box plots for segment PS100-99, comparing pre- and post-compensation measurements at 16 circumferential positions (22.5° intervals) on a polar grid with reference circles at 0%, 70%, and 100% nominal thickness. For this horizontal pipe segment inspected at 16 circumferential positions, the pre-compensation data show heterogeneous interquartile ranges and medians deviating from the nominal thickness. Post-compensation, the measurements display uniform medians near 100%, reduced variability, and consistent whisker lengths across all angles. These results affirm that the compensation algorithm effectively corrects systematic biases and reduces measurement variability regardless of pipe orientation or surface conditions. The algorithm's ability to improve accuracy in both coated horizontal pipes and uncoated vertical pipes underscores its practical applicability in diverse field environments.

Figure 36 PS100-99: radial box plots showing the wall thickness distribution before and after compensation

While the proposed method significantly improves measurement accuracy, limitations exist. The algorithm's performance in non-validated sections of the pipes remains uncertain due to the lack of UT or visual inspection data. Additionally, the algorithm assumes uniform material properties, which may not account for anomalies such as localized corrosion or material defects.

4.6 Conclusions

In this chapter, we presented a novel approach to enhancing the accuracy of PEC measurements for the inspection of high-voltage feeder cable pipes. By introducing new physics-based models that, for the first time, effectively compensate for signal distortions caused by internal magnetic fields, we have significantly improved PEC measurement accuracy.
These models account for the spatially varying magnetic permeability induced by internal current-carrying conductors, addressing a critical challenge that previously led to erroneous assessments of material integrity. The proposed integrated compensation methodology combines empirical data, FEM simulations, and GPR surrogate models. This integration enables accurate pipeline integrity assessments to be conducted within minutes, making the approach practical for in situ inspections and real-time applications. The use of surrogate models reduces computational demands without compromising accuracy, facilitating rapid and precise wall thickness estimation. The compensation methods and accompanying software tool have been validated through field demonstrations on in-service pipelines. Deploying the system in challenging underground tunnel environments confirmed its effectiveness and robustness.

CHAPTER 5
PHYSICS GUIDED EXPLAINABLE NETWORKS FOR AE CLASSIFICATION

5.1 Introduction

As a key diagnostic method in structural health monitoring (SHM) and nondestructive evaluation (NDE), Acoustic Emission (AE) enables detailed assessment of material behavior and structural conditions Strantza et al. (2015). AE signals often exhibit long-duration, non-stationary waveforms with complex reverberation patterns Daugela et al. (2021b), making them challenging to interpret with traditional machine learning methods. Deep learning models have revolutionized AE analysis by revealing hidden patterns and extracting meaningful information Wu et al. (2020), with promising results in applications such as crack propagation monitoring in concrete structures Haile et al. (2020b) and bearing fault diagnosis in wind turbines Zhong and Chen (2023). Despite these advances, the interpretability of these models remains a critical challenge Selvaraju et al. (2016), especially when trust and transparency are paramount for infrastructure maintenance and safety Wickstrøm et al. (2020); Ivaturi et al. (2021). To address this issue, explainable AI (XAI) techniques have emerged as effective tools to clarify the decision-making processes of deep learning Selvaraju et al. (2016); Wickstrøm et al. (2020); Ivaturi et al. (2021); Nayebi et al. (2022); Shawi et al. (2019); Alam et al. (2023); Guillemé et al. (2019). Among these, visualization-based methods such as Class Activation Mapping (CAM) and Gradient-weighted CAM (Grad-CAM) have gained traction. CAM and Grad-CAM have proven effective in visualizing the discriminative regions contributing to classifier decisions, helping to align model outputs with underlying physical phenomena. By highlighting which parts of the signal influence the model's classification, these methods make the decision-making process more transparent. They have been successfully applied in diverse domains, including arrhythmia classification from ECG signals Singh and Sharma (2022) and multiple sclerosis classification from MRI data Zhang et al. (2021), demonstrating their versatility and potential for AE analysis Wang et al. (2016); Parvatharaju et al. (2021); Singh et al. (2020). In the context of AE signals, integrating CAM and Grad-CAM with physics-informed segmentation based on the theoretical arrival times of the fundamental Lamb wave modes exploits the dispersive nature of Lamb waves in plate-like structures.
By segmenting signals into physics-based regions (the Pre-S0 region, the Transition Zone, and the Post-A0 region), the model can focus on mode-specific features that correspond to physically significant wave interactions. This approach enables mode-specific analysis aligned with the underlying wave propagation physics, thereby building trust in the classification outcomes Wu et al. (2020); Haile et al. (2020b); Singh and Sharma (2022). Such interpretability methods mitigate the "black-box" nature of deep learning, providing crucial insights and increasing user confidence in automated AE diagnostics Singh et al. (2020). As a result, we can better understand the factors driving model decisions and leverage that knowledge for more informed SHM and NDE assessments.

5.2 Physics-informed AE Segmentation

Physics-informed segmentation provides a solid foundation for extracting meaningful features from complex Acoustic Emission (AE) signals by utilizing the theoretical arrival times of the fundamental Lamb wave modes: the symmetric mode ($S_0$) and the antisymmetric mode ($A_0$). Lamb waves are essential for AE signal analysis in plate structures, displaying multimodal and dispersive properties, with propagation velocities that depend on frequency. These properties enhance structural health monitoring and damage localization. The $S_0$ mode has higher group velocities at lower frequencies, while the $A_0$ mode is highly dispersive, carrying energy across a wide frequency range. Together, they are crucial for understanding damage mechanisms and localizing AE sources Qiu et al. (2020). The segmentation method exploits this dispersive, frequency-dependent behavior, enabling mode-specific analysis aligned with wave propagation physics. The AE signal is segmented into three distinct regions based on theoretical time-of-arrival (TOA) calculations. The Pre-$S_0$ region captures environmental noise and electronic interference before the $S_0$ mode arrives, defined by $t < \tau_{S0}$ in the time domain and $t < \tau_{S0}(f)$ in the time-frequency domain. The $S_0$-$A_0$ Transition Zone includes interactions and conversions between the $S_0$ and $A_0$ modes, occurring when $\tau_{S0} \le t \le \tau_{A0}$ in the time domain and $\tau_{S0}(f) \le t \le \tau_{A0}(f)$ in the time-frequency domain. The Post-$A_0$ region captures later arrivals, such as reflections and higher-order modes, defined by $t > \tau_{A0}$ in the time domain and $t > \tau_{A0}(f)$ in the time-frequency domain.

Figure 37 (a) AE segmentation in the time domain. (b) AE segmentation in the frequency-time domain

This segmentation approach serves different purposes in the time and time-frequency domains. In the time domain, it helps isolate mode-specific contributions in $s(t)$ and improves feature representation for 1D time series classification. For time-frequency analysis, it isolates mode-specific contributions in $s(t, f)$ and enhances mode-specific features for 2D classification. Figure 37 illustrates this segmentation through an example acoustic emission signal. Figure 37(a) shows the time-domain signal with three distinct regions marked in different colors: Region 1 (red) before the $S_0$ arrival, Region 2 (green) for the $S_0$-$A_0$ transition zone, and Region 3 (blue) after the $A_0$ arrival. The corresponding spectrogram in Figure 37(b) displays the time-of-arrival curves for both modes across the 50-500 kHz frequency range, with the $S_0$ mode arrival (solid black line) and $A_0$ mode arrival (dashed black line) defining the region boundaries.
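In code, this segmentation reduces to simple time-domain masks once the arrival times are known. The sketch below is a minimal illustration, assuming tau_s0 and tau_a0 have already been computed from Lamb-wave dispersion curves and the source-sensor distance; all names are illustrative.

```python
# Minimal sketch of the physics-informed segmentation in the time domain,
# assuming theoretical arrival times tau_s0 and tau_a0 (in seconds) are known.
import numpy as np

def segment_ae_signal(signal: np.ndarray, t: np.ndarray,
                      tau_s0: float, tau_a0: float):
    """Split a 1-D AE waveform into Pre-S0, S0-A0 transition, and Post-A0 regions."""
    pre_s0     = signal[t < tau_s0]                      # noise / interference
    transition = signal[(t >= tau_s0) & (t <= tau_a0)]   # S0-A0 mode interactions
    post_a0    = signal[t > tau_a0]                      # reflections, later modes
    return pre_s0, transition, post_a0

# Usage with a 2 MHz sampling rate:
# t = np.arange(len(x)) / 2e6
# pre, trans, post = segment_ae_signal(x, t, tau_s0, tau_a0)
```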
This physics-based framework enables effective feature extraction and subsequent analysis, valuable for machine learning applications in structural health monitoring and damage detection.

5.3 Signal Preprocessing

In this chapter, AE signals were acquired using a fiber-optic coil-based sensing system with a broadband frequency response of 50–500 kHz. The signals were generated using the Hsu-Nielsen pencil lead break (PLB) method, which involved a 2H mechanical pencil with a 0.5 mm diameter lead. We conducted ten PLB tests at each of the ten marked locations on a 1/10-inch-thick aluminum plate, with each test repeated to ensure statistical reliability and account for any variations in signal generation. The aluminum plate was marked with a grid to ensure precise and repeatable testing. The signals were digitized at a sampling rate of 2 MHz to ensure accurate capture of high-frequency components. An example of a raw AE signal is shown in Figure 38.

Figure 38 Signal filtered to 50–500 kHz bandwidth

To analyze the time-frequency characteristics of the AE signals, we employed the Continuous Wavelet Transform (CWT), which provides a detailed representation of the signal's frequency content over time. The CWT is defined as:

W_x(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t) \, \psi^* \left( \frac{t - b}{a} \right) dt,   (5.1)

where W_x(a, b) is the wavelet coefficient, x(t) is the AE signal, \psi is the mother wavelet, a is the scale parameter, and b is the translation parameter. We used the Morlet wavelet as the mother wavelet due to its excellent time-frequency localization properties, defined as:

\psi(t) = \pi^{-1/4} e^{i\omega_0 t} e^{-t^2/2},   (5.2)

where \omega_0 = 6 to satisfy the admissibility condition. We computed the CWT for scales corresponding to frequencies from 50 kHz to 500 kHz, divided into 18 uniformly spaced frequency bands: 50–75 kHz, 75–100 kHz, ..., 475–500 kHz. This multi-scale analysis allowed us to capture both the low-frequency trends and the high-frequency transients characteristic of different AE sources.

Figure 39 Continuous Wavelet Transform (CWT) of an Acoustic Emission (AE) signal, showing the time-frequency energy distribution with time (𝜇s) on the x-axis, frequency (kHz) on the y-axis, and wavelet coefficient magnitude represented by color intensity

The resulting time-frequency representation provides a rich set of features that can reveal subtle differences between various types of AE events. Figure 39 shows the transformation of the AE signal into the time-frequency representation via the CWT, displaying the energy distribution over time and frequency bands. This reveals how the frequency content evolves, which is essential for non-stationary AE signals. Peaks in the CWT plot indicate high-energy periods, corresponding to specific Lamb wave modes (𝑆0 and 𝐴0) arriving at different times. The CWT separates overlapping modes by highlighting their unique time-frequency features, enabling mode-specific analysis. This enables the identification of damage mechanisms or AE events, essential for mode arrival analysis and classification. Additionally, representing the signal in both time and frequency enhances the deep learning model's ability to learn discriminative, physics-informed features. Visualizing energy concentration in the CWT aids interpretability through CAM, linking the classifier's decisions to physical events like mode arrivals and reflections.
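A minimal sketch of this preprocessing step is shown below, using the PyWavelets library with a complex Morlet wavelet. The sampling rate and band edges follow the values stated above; the signal itself and the exact wavelet parameterization are placeholders, not the specific configuration used in this work.

```python
import numpy as np
import pywt

fs = 2e6                                  # sampling rate (Hz)
signal = np.random.randn(4096)            # placeholder for a filtered AE waveform

# Center frequencies of the 18 uniformly spaced bands spanning 50-500 kHz
freqs = np.linspace(62.5e3, 487.5e3, 18)

wavelet = "cmor1.5-1.0"                   # complex Morlet (bandwidth, center freq)
fc = pywt.central_frequency(wavelet)      # normalized center frequency
scales = fc * fs / freqs                  # scales hitting the target frequencies

coeffs, out_freqs = pywt.cwt(signal, scales, wavelet, sampling_period=1 / fs)
magnitude = np.abs(coeffs)                # (18, 4096) time-frequency map
```

The magnitude array corresponds directly to the 18-band representation fed to the 2D classifier and visualized in Figure 39.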
5.4 Explainable CNN Architecture

5.4.1 CAM and Grad-CAM for 1D signal interpretation

The non-stationary nature of AE signals presents significant challenges for interpretation and classification, particularly when using conventional machine learning models that operate as black boxes without physical insight. To address this, we adopt an explainable deep learning framework that integrates Class Activation Mapping (CAM), Gradient-weighted Class Activation Mapping (Grad-CAM), and physics-informed segmentation based on Lamb wave theory. This combination highlights the specific time regions contributing most to a model's decision, thus improving transparency. The proposed architecture employs a one-dimensional Convolutional Neural Network (CNN) for temporal feature extraction. Sequential Conv1D layers process the signal, followed by dimensionality reduction through pooling or Global Average Pooling (GAP). The network culminates in a fully connected layer computing class scores, with a final SoftMax layer generating classification probabilities. At the final classification step, suppose the network outputs a score S_k for class k:

S_k = \sum_{n=1}^{256} w_{k,n} F_n + b_k,   (5.3)

where F_n denotes the globally pooled value of the n-th of the 256 feature channels, and w_{k,n} and b_k are the weights and bias of the fully connected layer. The larger the score S_k, the stronger the model's belief that the input belongs to class k. In general, CAM provides a heatmap over time, showing which parts of the signal contribute most to a specific class prediction. Instead of using the pooled feature values F_n, CAM directly weights the final feature maps A_n(i) by the class-specific weights w_{k,n} from the final layer:

CAM_k(i) = \sum_{n=1}^{256} w_{k,n} A_n(i)   (5.4)

The CAM value at any time point i indicates the signal regions that most strongly influence the model's classification decision for class k. When CAM_k(i) is visualized as an overlay on the input AE waveform, it reveals the temporal segments that the model considers most significant for its classification decision. Grad-CAM refines this approach by using the gradients of S_k with respect to the feature maps A_n(i), i.e., \partial S_k / \partial A_n(i). Rather than relying on the final-layer weights, Grad-CAM computes importance weights by averaging the gradients over time and then multiplying them by the feature maps. Through Grad-CAM, we can see not only what the final classification layer focused on, but also how the model's hidden representations responded to each segment of the signal.
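A minimal NumPy sketch of Equations (5.3)–(5.4) and the Grad-CAM weighting is given below. It assumes the final convolutional feature maps, the fully connected weights, and (for Grad-CAM) the gradients from a backward pass have already been extracted from the trained network; all shapes and names are placeholders.

```python
import numpy as np

def cam_1d(A, w, k):
    # Eq. (5.4): CAM_k(i) = sum_n w[k, n] * A[n, i]
    return w[k] @ A

def grad_cam_1d(A, grads):
    # Channel weights: time-averaged gradients dS_k/dA_n(i)
    alpha = grads.mean(axis=1)              # (256,)
    heatmap = np.maximum(alpha @ A, 0.0)    # ReLU keeps positive evidence
    return heatmap / (heatmap.max() + 1e-12)

A = np.random.rand(256, 512)       # feature maps: 256 channels x 512 time steps
w = np.random.rand(10, 256)        # FC weights for 10 classes (Eq. 5.3)
grads = np.random.rand(256, 512)   # placeholder gradients for one class
cam = cam_1d(A, w, k=0)
gcam = grad_cam_1d(A, grads)
```

Overlaying either heatmap on the input waveform, after upsampling it to the original signal length, yields the visualizations discussed in Section 5.5.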
5.4.2 DCAM for 2D signal interpretation

Wavelet transformation converts the original AE signal into a two-dimensional time-frequency representation, mapping the energy distribution across multiple frequency bands over time. This 2D representation, combined with Dimensional-wise Class Activation Mapping (DCAM), provides enhanced interpretability in the wavelet coefficient space. The key idea of DCAM is to introduce small variations to the input 2D data and compute the corresponding CAMs; by aggregating them, it identifies stable patterns across perturbations while reducing the impact of transient features. The input consists of multiple channels, each corresponding to a specific frequency band obtained from the 18-band wavelet decomposition. The method begins by generating multiple permutations of the input data to capture a diverse range of feature activations and to assess the model's consistency across variations in the data. For each permutation X_{\pi_p} and class k, activation maps CAM_k^{(p)}(i, j) are computed using:

CAM_k^{(p)}(i, j) = \sum_{n=1}^{256} w_{k,n} A_n^{(p)}(i, j)   (5.5)

where A_n^{(p)}(i, j) are the feature maps from the last convolutional layer for permutation p. These CAMs are then averaged over all permutations to obtain a mean activation map:

\overline{CAM}_k(i, j) = \frac{1}{P} \sum_{p=1}^{P} CAM_k^{(p)}(i, j)   (5.6)

representing the most consistently significant features. Simultaneously, the variance of the CAMs is calculated to assess the stability and reliability of the activations:

Var_k(i, j) = \frac{1}{P} \sum_{p=1}^{P} \left( CAM_k^{(p)}(i, j) - \overline{CAM}_k(i, j) \right)^2   (5.7)

The map DCAM_k(i, j) is then generated by combining the average CAM with an inverse-variance weighting:

DCAM_k(i, j) = \overline{CAM}_k(i, j) \times \left( 1 - \frac{Var_k(i, j)}{\max(Var_k)} \right)   (5.8)

The DCAM technique extends CAM by incorporating permutations of the input signal, providing a more robust and discriminative visualization of the features important for classification. The wavelet transform is applied to the AE signals to decompose them into time-frequency representations, allowing for detailed analysis of the signal's frequency content over time. The wavelet coefficients are segmented based on frequency bands corresponding to different Lamb wave modes and dispersive characteristics, facilitating mode-specific analysis. DCAM generates heatmaps from the 2D wavelet data, revealing discriminative frequency bands, critical time intervals, and the robustness of the identified features. These heatmaps enable a deeper understanding of how the model interprets complex AE signals and highlight the importance of specific wavelet coefficients, thereby providing insight into the physical phenomena underlying the AE signals and enhancing the interpretability of the classification process.
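The aggregation in Equations (5.6)–(5.8) reduces to a few array operations once the per-permutation CAMs are available; the sketch below is a minimal illustration with placeholder data.

```python
import numpy as np

def dcam(cams):
    # cams: (P, H, W), one CAM per input permutation (Eq. 5.5)
    mean_cam = cams.mean(axis=0)                      # Eq. (5.6)
    var_cam = ((cams - mean_cam) ** 2).mean(axis=0)   # Eq. (5.7)
    stability = 1.0 - var_cam / (var_cam.max() + 1e-12)
    return mean_cam * stability                       # Eq. (5.8)

cams = np.random.rand(20, 18, 512)   # 20 permutations, 18 bands, 512 time steps
heatmap = dcam(cams)
```

Regions with high mean activation but high variance across permutations are suppressed, which is what makes the resulting heatmap more robust than a single CAM.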
5.5 Improved Interpretability and Performance

5.5.1 Time-Domain Interpretation of AE Signals Using CAM and Grad-CAM

The analysis of Acoustic Emission (AE) signals from two distinct source locations (1 and 10) provides significant insight into the model's ability to differentiate between sources based on signal characteristics. Figures 40 and 41 present a comparison of raw AE signals using CAM and Grad-CAM for locations 1 and 10, respectively. The raw AE signals exhibit notable differences in their temporal profiles, particularly within the initial 400 𝜇s, indicating location-specific wave propagation patterns due to variations in propagation distance and attenuation. Using the theoretical arrival times of the fundamental Lamb wave modes (𝑆0 and 𝐴0) as natural boundaries, denoted by dotted lines, the signal is partitioned into three regions: Pre-𝑆0, the 𝑆0–𝐴0 transition zone, and Post-𝐴0.

Figure 40 (a) Example AE signal from location 1 (b) CAM for classifying AE signal from location 1 (c) CAM for classifying AE signal from location 10

In the CAM and Grad-CAM visualizations, significant activation is observed in different regions for the two locations. For location 1, both CAM and Grad-CAM highlight significant activation in the early signal components (before 400 𝜇s), particularly within the 𝑆0–𝐴0 Transition Zone. This indicates that the model focuses on the initial wave arrivals and mode interactions for classification. In contrast, for test location 10, the activation maps show broader activation across the signal, including the Post-𝐴0 region, suggesting that the model incorporates features from later-arriving wave components for more distant sources. In particular, Grad-CAM provides more localized feature importance than CAM, potentially enabling more precise identification of critical signal components.

Figure 41 (a) Example AE signal from location 10 (b) CAM for classifying AE signal from location 1 (c) CAM for classifying AE signal from location 10

The heatmaps reveal distinct activation patterns between locations 1 and 10, highlighting the model's ability to discern subtle, location-specific signal features. This observation supports the hypothesis that the proposed physics-informed segmentation enhances the model's sensitivity to spatial variations in AE signal characteristics by correlating the model's attention with specific wave modes and their interactions. The comparative analysis between CAM and Grad-CAM indicates that gradient-weighted approaches offer superior resolution in identifying salient signal features. This enhanced resolution could be particularly valuable in complex source localization scenarios where subtle differences in signal features are crucial.

5.5.2 Time-Frequency Analysis Using DCAM

The time-frequency analysis of AE signals, as illustrated in Figures 42 and 43, reveals significant insights into the efficacy of DCAM for enhancing signal interpretation. Figure 42 presents a conventional spectrogram of an AE signal. Notably, energy concentrates around 150–200 𝜇s in the 50–200 kHz range, corresponding to the expected arrival of the 𝑆0 mode. The corresponding DCAM result is shown in Figure 43; both are overlaid with theoretical TOA curves for the 𝑆0 and 𝐴0 modes derived from the dispersion relations.

Figure 42 Time-frequency analysis of an acoustic emission signal

Figure 43 Dimensional-wise Class Activation Mapping (DCAM) results visualized as a heat map

The DCAM representation exhibits markedly sharper energy localization than the conventional spectrogram. This enhanced definition facilitates more precise identification of mode-specific components, which is crucial for accurate source localization and damage characterization. Furthermore, the DCAM results unveil a distinct high-energy region at approximately 600–700 𝜇s across a broad frequency spectrum, which is substantially less discernible in the conventional spectrogram. This underscores DCAM's capability to highlight late-arriving wave components, potentially corresponding to reflections, mode conversions, or higher-order modes, which are critical in complex structural geometries. The superposition of the theoretical dispersion curves, denoted by white dotted curves, provides a physics-informed framework for signal interpretation. The alignment of high-activation regions in the DCAM representation with the theoretical arrival times of the 𝑆0 and 𝐴0 modes corroborates the method's effectiveness in compensating for dispersion effects, thereby improving the temporal localization of frequency components.

5.5.3 Quantitative Assessment of Segment Contributions

To quantitatively assess the contributions of different signal segments to the classification decisions made by our deep learning model, we propose a framework that leverages the activation maps generated by CAM, Grad-CAM, and DCAM. This framework is designed to be both specific and precise, ensuring that the analysis is grounded in the physical characteristics of the AE signals and the model's internal processes. The activation maps A(t) for the time domain and A(t, f) for the time-frequency domain represent the model's attention to different parts of the input signal during classification. High activation values indicate that the model considers those regions particularly important for making its decision.
By integrating these activation maps over predefined signal segments, we can quantify the total contribution of each segment to the classification outcome. For the time domain:

C_i = \frac{\int_{s_i} A(t) \, dt}{\int_{S} A(t) \, dt}   (5.9)

Similarly, for the time-frequency domain analysis, where the signal is represented as s(t, f), the segment contribution is:

C_i = \frac{\iint_{s_i} A(t, f) \, dt \, df}{\iint_{S} A(t, f) \, dt \, df}   (5.10)

This formulation directly measures the proportion of the model's total attention allocated to each segment. To enable comparison across different signals and segments, we further normalize these contributions:

C'_i = \frac{C_i}{\sum_i C_i}   (5.11)

This normalization ensures that the sum of all normalized contributions C'_i equals 1, allowing for a direct comparison of segment importance irrespective of the signal's absolute activation levels. However, segments vary in size, both in time duration \Delta t_i for time-domain signals and in area D_i for time-frequency representations. Larger segments might naturally accumulate more activation simply due to their size, not necessarily because they are more significant for classification. To account for this, we introduce the relative importance factor, which adjusts the normalized contribution by the proportion of the segment's size to the total signal size. For the time domain:

Importance_i = \frac{C'_i}{\Delta t_i / T}   (5.12)

where \Delta t_i is the duration of segment s_i and T is the total signal duration. For the time-frequency domain:

Importance_i = \frac{C'_i}{D_i / D}   (5.13)

where D_i is the area of segment s_i in the time-frequency plane, and D is the total area of the spectrogram. A higher value indicates that the segment contributes more to the classification decision than would be expected based on its size alone.
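For the time-domain case, Equations (5.9)–(5.12) amount to the following bookkeeping. The sketch assumes a 1D activation map and segment boundaries at the theoretical 𝑆0/𝐴0 arrival indices; the boundary indices and the activation map are placeholder values.

```python
import numpy as np

def segment_importance(A, boundaries):
    # Split the activation map at the S0/A0 arrival indices
    segments = np.split(A, boundaries)
    C = np.array([s.sum() for s in segments]) / A.sum()    # Eq. (5.9)
    C_norm = C / C.sum()                                   # Eq. (5.11)
    sizes = np.array([s.size for s in segments]) / A.size  # Delta t_i / T
    return C_norm / sizes                                  # Eq. (5.12)

A = np.abs(np.random.randn(1000))                # placeholder CAM over time
importance = segment_importance(A, [220, 480])   # hypothetical tau_S0, tau_A0 indices
```

The three returned values correspond to the Pre-𝑆0, transition, and Post-𝐴0 entries in the heatmaps analyzed next.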
The high concentration in the 𝑆0–𝐴0 region aligns with the physical understanding of mode interactions and their significance in damage characterization and source localization. The row in each heatmap shows that CAM has a slight preference for the 𝑆0–𝐴0 transition zone (0.4056), while Grad-CAM and DCAM indicate higher overall importance for the post-𝐴0 region (0.3220 and 0.9273, respectively). These findings underscore the value of employing multiple visualization techniques to gain a comprehensive understanding of the model’s decision- making process. By isolating mode-specific contributions and enhancing feature representation, this framework has the potential to improve source localization accuracy and advance damage characterization in SHM applications. 73 Figure 44 (a) Class Activation Mapping (CAM) Average Importance Heatmap (b) Gradient- weighted Class Activation Mapping (GCAM) Average Importance Heatmap (c) Dispersion Com- pensated Acoustic Monitoring (DCAM) Average Importance Heatmap 74 CHAPTER 6 CONCLUSIONS AND FUTURE WORK 6.1 Conclusions and Contribution This dissertation advances acoustic emission–based structural health monitoring by integrating simulation-driven techniques, domain adaptation, and explainable AI. Through comprehensive nu- merical modeling, robust machine learning architectures, and GAN-based data augmentation, the research addresses key challenges such as limited annotated data, imbalanced class distributions, and the gap between simulation and experiment. Multi-fidelity surrogate modeling further expands the applicability of these methods to electromagnetic inspection scenarios, broadening their impact on defect characterization and real-time monitoring. Overall, the proposed frameworks deliver improved AE source localization, damage classification, and interpretability, laying a strong foun- dation for broader deployment of data-driven SHM solutions. By fusing high-fidelity simulations with innovative learning strategies, the results demonstrate that even single-sensor deployments can achieve precise, explainable damage detection in complex engineering environments. These contributions collectively underline the importance of synergizing modeling, data augmentation, and interpretability for safer and more reliable infrastructure monitoring. i. A robust finite element modeling framework was developed to simulate pencil-lead break (PLB) and impact sources on aluminum plates. These high-fidelity simulations enabled effective pre-training of deep learning models, thereby reducing the burden on experimental data collection. ii. We bridged the simulation-to-reality gap using a domain adaptation pipeline that aligns features between domains. The approach uses MMD minimization and feature matching to enhance real-world performance while reducing the need for large labeled datasets. iii. Generative Adversarial Networks (GANs) were leveraged to mitigate data imbalance in dam- age classes and to increase the diversity of AE training samples. Various GAN architectures 75 (e.g., DCGAN, WGAN) were evaluated, revealing significant gains in localization and clas- sification accuracy when synthetic minority class samples were introduced. iv. By combining an Inception-based regression model with GAN-synthesized data, we im- proved AE localization. The method’s single-sensor design offers easy deployment without sacrificing detection quality. v. 
v. A framework that integrates low-fidelity (simulation) and high-fidelity (experimental) data for inspection techniques such as motion-induced and pulsed eddy current testing was proposed. These multi-fidelity surrogate models offer efficient yet accurate defect characterization in harsh electromagnetic environments.

vi. Class Activation Mapping (CAM), Grad-CAM, and Dimensional-wise Class Activation Mapping (DCAM) were introduced to visualize and interpret deep learning decisions. By highlighting crucial wave modes and time-frequency regions, the approach enhanced transparency and trust in the classification pipeline.

6.2 Future Work

6.2.1 Domain-Adaptive Framework

The methodologies and findings presented in this thesis open several promising avenues for future research and practical implementation. This chapter outlines key directions for extending the current work across five interconnected areas: domain-adaptive frameworks, multi-fidelity modeling, electromagnetic interference compensation, physics-guided explainable AI, and system-level integration. While the current domain adaptation studies predominantly focus on single-sensor setups, future work could examine attention-based sensor fusion to handle multi-sensor Acoustic Emission (AE) arrays. By learning to weight the most informative sensor signals dynamically, the model could adapt to diverse propagation paths and noise conditions, potentially yielding more precise and reliable localization performance. Incorporating wave propagation constraints into the loss function (e.g., penalizing predictions that deviate significantly from known Lamb wave arrival times) could better align model outputs with physical reality. Such constraints could be especially useful in cases where limited experimental data lead to overfitting or unrealistic wave mode representations.
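As one possible realization of such a constraint, the sketch below adds a penalty term to a standard training loss whenever the predicted source distance disagrees with the distance implied by the measured 𝑆0 arrival time. The formulation and all names are hypothetical illustrations of the idea, not a method implemented in this thesis.

```python
import torch

def physics_penalty(pred_dist, toa_s0, v_group_s0):
    # Distance implied by the measured S0 arrival: d = tau_S0 * v_g(S0)
    implied_dist = toa_s0 * v_group_s0
    return torch.mean((pred_dist - implied_dist) ** 2)

def total_loss(task_loss, pred_dist, toa_s0, v_group_s0, lam=0.1):
    # lam balances the data-fit term against the wave-propagation constraint
    return task_loss + lam * physics_penalty(pred_dist, toa_s0, v_group_s0)
```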
Recent advances in natural language processing (NLP) have demonstrated that large language models, such as ChatGPT, can learn transferable representations from extensive pre-training on diverse datasets. In future research, one promising direction is to investigate how these pretrained architectures might be repurposed for AE signal analysis. While ChatGPT is primarily trained on textual data, its underlying transformer-based structure could be adapted for sequential signal inputs through careful feature engineering. A key challenge is bridging the gap between GPT's linguistic knowledge and the unique nature of AE signals, which exhibit complex time-frequency patterns rather than human language. Fine-tuning ChatGPT (or similar LLMs) with domain-specific AE datasets could potentially boost performance on classification, source localization, or damage characterization tasks. The approach would involve mapping AE signal features into an input format compatible with transformer architectures, then retraining (or partially fine-tuning) the model on specialized corpora of annotated AE signals. Beyond classification and localization, there may be untapped opportunities to transfer the learned embeddings from a fine-tuned model to other AE tasks, such as anomaly detection, multi-sensor data integration, or real-time damage prognosis. By systematically exploring transfer learning strategies, researchers could adapt the same pretrained backbone to address a broad spectrum of AE problems, reducing both data requirements and development time. Investigating these directions may significantly expand the toolkit for data-driven NDE, allowing large language model paradigms to be harnessed for highly specialized AE signal interpretation and multi-domain SHM applications.

6.2.2 Multi-Fidelity Modeling and Signal Processing

Although multi-fidelity surrogate models reduce computational overhead, predictive uncertainty remains a challenge. Incorporating Bayesian methods (e.g., Gaussian process models with posterior distributions) or Monte Carlo dropout into the surrogate could provide confidence intervals for defect parameter estimates, giving operators clearer risk assessments. To reduce redundant high-fidelity simulations or experimental measurements, active learning or Bayesian optimization methods could direct sampling to the most uncertain regions of the parameter space. This would maximize information gain while minimizing overall testing costs. While the dissertation employs Gaussian Process Regression and Radial Basis Function approaches, deep neural architectures (e.g., deep Gaussian Processes or physics-informed neural networks) could better capture highly nonlinear phenomena such as crack branching or complex eddy current distributions.

6.2.3 Compensation Method

Figure 45 Pancake Eddy Current Sensor Array for Circumferential Pipe Inspection

In future studies, we plan to deploy a circumferential sensing system using pancake eddy current (EC) sensors or magnetometers arranged in a ring formation around the pipe. The schematic design in Figure 45 illustrates the multi-channel pancake eddy current sensor array wrapped around a pipe, including both exciter and receiver coils. The array is designed to detect circumferential defects and changes in wall thickness by inducing and measuring eddy currents along the pipe's surface. This configuration enables the detection and precise localization of internal feeder cables, determining both their offset distance and angular position. The system employs multiple pancake coils positioned at equal angular intervals around the pipe's outer circumference. Operating frequencies will be selected to achieve optimal penetration through the pipe wall, balancing detection sensitivity against noise interference. Special attention is given to maintaining consistent liftoff distances for each coil, accounting for any protective coatings such as wax tape. The measurement process follows a sequential activation pattern in which each coil serves as a transmitter while the others function as receivers. This approach generates a comprehensive set of transmitter-receiver coupling measurements. Each sensor captures voltage responses that encode information about the cable's location through characteristic field distortion patterns. The collected sensor data can be processed to determine the precise position of the internal cable.

BIBLIOGRAPHY

Ai, L., Soltangharaei, V., Bayat, M., Van Tooren, M., and Ziehl, P. (2021). Detection of impact on aircraft composite structure using machine learning techniques. Measurement Science and Technology, 32(8):084013.

Alam, M. U., Zaki, M., Ramachandra, V., et al. (2023). SHAMSUL: Systematic Holistic Analysis to Investigate Medical Significance Utilizing Local Interpretability Methods in Deep Learning for Chest Radiography Pathology Prediction. Nordic Machine Intelligence, 1(2):45–58.

Antipov, A. and Markov, A. (2018). 3D simulation and experiment on high speed rail MFL inspection. NDT & E International, 98:177–185.
Assarar, M., Scida, D., El Mahi, A., Poilâne, C., and Ayad, R. (2015). Monitoring of damage mechanisms in sandwich composite materials using acoustic emission. International Journal of Damage Mechanics, 24(5):787–804.

Bao, Y. (2023). Modeling of eddy current NDT simulations by kriging surrogate model. Research in Nondestructive Evaluation, 34:154–168.

Bengio, Y. (2012). Deep learning of representations for unsupervised and transfer learning. In Guyon, I., Dror, G., Lemaire, V., Taylor, G., and Silver, D., editors, Proceedings of the ICML Workshop on Unsupervised and Transfer Learning, volume 27 of JMLR Workshop and Conference Proceedings, pages 17–36. PMLR.

Bhuiyan, M. Y. and Giurgiutiu, V. (2017). Experimental and computational analysis of acoustic emission waveforms for SHM applications. Structural Health Monitoring, 16(5):608–620.

Bieri, O., Markl, M., and Scheffler, K. (2005). Analysis and compensation of eddy currents in balanced SSFP. Magnetic Resonance in Medicine, 54(1):129–137.

Bouzid, O. M., Tian, G. Y., Cumanan, K., and Neasham, J. (2015). Wireless AE event and environmental monitoring for wind turbine blades at low sampling rates. In Ohtsu, M., editor, Advances in Acoustic Emission Technology, pages 533–546. Springer.

Chadha, M., Yang, Y., Hu, Z., and Todd, M. D. (2023). Evolutionary sensor network design for structural health monitoring of structures with time-evolving damage. In Proceedings of the 14th International Workshop on Structural Health Monitoring (IWSHM), pages 368–380. Stanford University.

Chen, S.-Z. et al. (2021). Development of data-driven prediction model for CFRP-steel bond strength by implementing ensemble learning algorithms. Construction and Building Materials, 303:124470.

Chen, S. Z. and Feng, D. C. (2022). Multifidelity approach for data-driven prediction models of structural behaviors with limited data. Computer-Aided Civil and Infrastructure Engineering, 37(12):1566–1581.

Chen, X. and Lei, Y. (2015). Electrical conductivity measurement of ferromagnetic metallic materials using pulsed eddy current method. NDT & E International, 75:33–38.

Chen, X. and Liu, X. (2021). Pulsed eddy current-based method for electromagnetic parameters of ferromagnetic materials. IEEE Sensors Journal, 21(12):6376–6383.

Ciampa, F. and Meo, M. (2010). Acoustic emission source localization and velocity determination of the fundamental mode A0 using wavelet analysis and a Newton-based optimization technique. Smart Materials and Structures, 19(4):045002.

Cortês, G. d. S., da Silva Junior, D., de Carvalho, T. B., Soares, E. L., dos Santos Oliveira, J. C., and Sattler, M. A. (2023). Analysis of PEC technique and external magnetic fields for detection of corrosion under insulation: 3D finite element model. Concilium, 15(2):145–160.

Cuadra, J., Vanniamparambil, P. A., Servansky, D., Bartoli, I., and Kontsos, A. (2015). Acoustic emission source modeling using a data-driven approach. Journal of Sound and Vibration, 341:222–236.

Daugela, A., Chang, C.-L., and Peterson, D. (2021a). Deep learning based characterization of nanoindentation induced acoustic events. Materials Science and Engineering: A, 800:140273.

Daugela, A., Chang, C.-L., and Peterson, D. (2021b). Deep learning based characterization of nanoindentation induced acoustic events. Materials Science and Engineering: A, 800:140273.

De Almeida, V. A. D., Baptista, F. G., and De Aguiar, P. R. (2015).
Piezoelectric transducers assessed by the pencil lead break for impedance-based structural health monitoring. IEEE Sensors Journal, 15(2):693–702.

Delashmit, W. H. and Manry, M. T. (2005). Recent developments in multilayer perceptron neural networks. In Proceedings of the Seventh Annual Memphis Area Engineering and Science Conference (MAESC), pages 1–7. MAESC.

Dong, S., Yuan, M., Wang, Q., and Liang, Z. (2018). A modified empirical wavelet transform for acoustic emission signal decomposition in structural health monitoring. Sensors, 18(5):1645.

Eaton, M. J., Pullin, R., and Holford, K. M. (2012). Towards improved damage location using acoustic emission. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, 226(9):2141–2153.

Ebrahimkhanlou, A., Dubuc, B., and Salamone, S. (2019). A generalizable deep learning framework for localizing and characterizing acoustic emission sources in riveted metallic panels. Mechanical Systems and Signal Processing, 130:248–272.

Ebrahimkhanlou, A. and Salamone, S. (2017a). Acoustic emission source localization in thin metallic plates: A single-sensor approach based on multimodal edge reflections. Ultrasonics, 78:134–145.

Ebrahimkhanlou, A. and Salamone, S. (2017b). A probabilistic framework for single-sensor acoustic emission source localization in thin metallic plates. Smart Materials and Structures, 26(9):095026.

Ebrahimkhanlou, A. and Salamone, S. (2018a). A deep learning approach for single-sensor acoustic emission source localization in plate-like structures. Structural Health Monitoring, 17(5):1335–1351.

Ebrahimkhanlou, A. and Salamone, S. (2018b). Single-sensor acoustic emission source localization in plate-like structures: A deep learning approach. Health Monitoring of Structural and Biological Systems XII, 10600:106001O.

Ebrahimkhanlou, A. and Salamone, S. (2018c). Single-sensor acoustic emission source localization in plate-like structures using deep learning. Aerospace, 5(2):50.

Farcaş, I.-G. et al. (2023). Context-aware learning of hierarchies of low-fidelity models for multi-fidelity uncertainty quantification. Computer Methods in Applied Mechanics and Engineering, 406:115908.

Fawaz, H. I., Forestier, G., Weber, J., Idoumghar, L., and Muller, P.-A. (2018). Transfer learning for time series classification. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), pages 1367–1376. IEEE.

Fernández-Godino, M. G. et al. (2016). Review of multi-fidelity models. arXiv preprint arXiv:1609.07196.

Fernández-Godino, M. G. et al. (2019). Issues in deciding whether to use multifidelity surrogates. AIAA Journal, 57(5):2039–2054.

Forrester, A. I. J., Sóbester, A., and Keane, A. J. (2007). Multi-fidelity optimization via surrogate modelling. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 463(2088):3251–3269.

Fu, J., Ning, Z., and Chang, Y. (2021). Active compensation method for strong magnetic interference of MEMS electronic compass. IEEE Access, 9:48860–48872.

Garrett, J. C., Mei, H., and Giurgiutiu, V. (2022). An artificial intelligence approach to fatigue crack length estimation from acoustic emission waves in thin metallic plates. Applied Sciences, 12(3):1372.

Guillemé, M., Lemaitre, A., Devaux, M., and Lannoo, B. (2019). Agnostic local explanation for time series classification. In 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), pages 432–439. IEEE.
Haile, M., Zhu, E., Hsu, C. D., and Bradley, N. (2020a). Deep machine learning for detection of acoustic wave reflections. Structural Health Monitoring, 19(5):1340–1350.

Haile, M., Zhu, E., Hsu, C. D., and Bradley, N. (2020b). Deep machine learning for detection of acoustic wave reflections. Structural Health Monitoring, 19(5):1340–1350.

Hamstad, M. A. (2007). Acoustic emission signals generated by monopole (pencil-lead break) versus dipole sources: Finite element modeling and experiments. Journal of Acoustic Emission, 25:92–106.

Han, W., Liu, Y., Zhang, B., and Wang, Y. (2014). Fast estimation of defect profiles from the magnetic flux leakage signal based on a multi-power affine projection algorithm. Sensors (Basel), 14(9):16454–16466.

Hardy, R. L. (1971). Multiquadric equations of topography and other irregular surfaces. Journal of Geophysical Research, 76(8):1905–1915.

Hashim, K. A., Md Nor, N., and Idrus, J. (2021). Determination of acoustic emissions data characteristics under the response of pencil lead fracture procedure. Journal of Failure Analysis and Prevention, 21(6):2064–2071.

He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778. IEEE.

Holford, K. M., Pullin, R., Evans, S. L., Eaton, M. J., Hensman, J., and Worden, K. (2009). Acoustic emission for monitoring aircraft structures. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 223(5):525–532.

Hsu, T.-M. H., Chen, W.-Y., Hou, C.-A., Tsai, Y.-H. H., Yeh, Y.-R., and Wang, Y.-C. F. (2015). Unsupervised domain adaptation with imbalanced cross-domain data. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), pages 4121–4129. IEEE.

Huang, X., Elshafiey, O., Farzia, K., Udpa, L., Han, M., and Deng, Y. (2023). Acoustic emission source localization using deep transfer learning and finite element modeling-based knowledge transfer. Materials Evaluation, 81(7):71–84.

Huang, X., Elshafiey, O., Mukherjee, S., Karim, F., Zhu, Y., Udpa, L., Han, M., and Deng, Y. (2024a). Deep learning-assisted structural health monitoring: acoustic emission analysis and domain adaptation with intelligent fiber optic signal processing. Engineering Research Express, 6(2):025222.

Huang, X., Han, M., and Deng, Y. (2024b). A hybrid GAN-Inception deep learning approach for enhanced coordinate-based acoustic emission source localization. Applied Sciences, 14(19):8811.

Huang, X., Li, Z., Peng, L., Chu, Y., Miles, Z., Chakrapani, S. K., Han, M., Poudel, A., and Deng, Y. (2025). A novel multi-fidelity Gaussian process regression approach for defect characterization in motion-induced eddy current testing. NDT & E International, 150:103274.

Ismail-Fawaz, A., Forestier, G., Weber, J., Idoumghar, L., and Muller, P.-A. (2022). Deep learning for time series classification using new hand-crafted convolution filters. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data), pages 972–981. IEEE.

Ivaturi, P., Reiss, J., and Mahmood, F. (2021). A comprehensive explanation framework for biomedical time series classification. IEEE Journal of Biomedical and Health Informatics, 25:2398–2408.

Jain, N., Manikonda, L., Olmo Hernandez, A., Sengupta, S., and Kambhampati, S. (2018). Imagining an engineer: On GAN-based data augmentation perpetuating biases. arXiv preprint arXiv:1811.03751.
Janošek, M., Kopan, T., Zach, P., Janda, P., Mikirtek, P., Němec, M., and Ripka, P. (2019). Magnetic calibration system with interference compensation. IEEE Transactions on Magnetics, 55(1):1–6.

Jiang, X., Zhao, F., Ge, Z., Yang, J., Xie, X., and Shi, S. (2020). Implicit class-conditioned domain alignment for unsupervised domain adaptation. arXiv preprint arXiv:2006.04996.

Jones, M. R., Rogers, T., and Cross, E. J. (2022). Constraining Gaussian processes for physics-informed acoustic emission mapping. arXiv preprint arXiv:2206.01495.

Joseph, R. P. (2020). Acoustic emission and guided wave modeling and experiments for structural health monitoring and non-destructive evaluation. Unpublished work or thesis.

Jung, B. H., Kim, Y. W., and Lee, J. R. (2019). Laser-based structural training algorithm for acoustic emission localization and damage accumulation visualization in a bolt joint structure. Structural Health Monitoring, 18(6):1851–1861.

Kalivarapu, V. and Winer, E. (2008). A multi-fidelity software framework for interactive modeling of advective and diffusive contaminant transport in groundwater. Environmental Modelling & Software, 23(12):1370–1383.

Kampolis, I. C. and Giannakoglou, K. C. (2008). A multilevel approach to single- and multiobjective aerodynamic optimization. Computer Methods in Applied Mechanics and Engineering, 197(33–40):2963–2975.

Karim, F., Zhu, Y., and Han, M. (2021). Modified phase-generated carrier demodulation of fiber-optic interferometric ultrasound sensors. Optics Express, 29(16):25011–25021.

Kats, V. and Volkov, A. (2019). Features extraction from non-destructive testing data in cyber-physical monitoring system of construction facilities. Journal of Physics: Conference Series, 1312(1):012015.

Kats, V. and Volkov, A. (2020). Features extraction from non-destructive testing data in cyber-physical monitoring system of construction facilities. In Journal of Physics: Conference Series, volume 1425, page 012149. IOP Publishing.

Kennedy, M. C. and O'Hagan, A. (2000). Predicting the output from a complex computer code when fast approximations are available. Biometrika, 87(1):1–13.

Kontogiannis, S. G. et al. (2020). A comparison study of two multifidelity methods for aerodynamic optimization. Aerospace Science and Technology, 97:105592.

Kumar, D. et al. (2018). A wireless shortwave near-field probe for monitoring structural integrity of dielectric composites and polymers. NDT & E International, 96:9–17.

Leary, S. J., Bhaskar, A., and Keane, A. J. (2003). A knowledge-based approach to response surface modelling in multifidelity optimization. Journal of Global Optimization, 26:297–319.

Li, D., Chen, D., Goh, J., and Ng, S.-K. (2018). Anomaly detection with generative adversarial networks for multivariate time series. arXiv preprint arXiv:1809.04758.

Li, M., Zhou, Z., Xu, D., and Zhang, X. (2022). A new multi-fidelity surrogate modelling method for engineering design based on neural network and transfer learning. Engineering Computations, 39(7):2480–2498.

Li, X.-B., Zhao, X., and Zhang, Y. (2009). Numerical simulation and experiments of magnetic flux leakage inspection in pipeline steel. Journal of Mechanical Science and Technology, 23(1):109–113.

Li, Y., Tian, G. Y., and Ward, S. (2006). Numerical simulation on magnetic flux leakage evaluation at high speed. NDT & E International, 39(5):367–373.

Li, Y., Zhao, X., and Liu, J. (2021). Analysis of the anti-magnetic interference characteristics of the stacked magneto-optical current sensor and error compensation method.
In 2021 International Conference of Optical Imaging and Measurement (ICOIM), pages 243–249. IEEE.

Liu, C., Lou, Y., Liu, C., Yang, Q., Yang, Z., Zhang, Q., Sun, H., and Zhao, X. (2022). Synthesized magnetic field focusing for the non-destructive testing of oil and gas well casing pipes using pulsed eddy-current array. IEEE Transactions on Magnetics, 58(9):1–10.

Liu, C., Wu, X., Mao, J., and Liu, X. (2017a). Acoustic emission signal processing for rolling bearing running state assessment using compressive sensing. Mechanical Systems and Signal Processing, 91:395–406.

Liu, C., Wu, X., Mao, J., and Liu, X. (2017b). Acoustic emission signal processing for rolling bearing running state assessment using compressive sensing. Journal of Vibration and Acoustics, 139(5):051007.

Liu, G., Zhu, Y., Sheng, Q., and Han, M. (2020). Polarization-insensitive, omnidirectional fiber-optic ultrasonic sensor with quadrature demodulation. Optics Letters, 45(15):4164–4167.

Liu, T., Lai, X., Song, X., and Guo, Z. (2023). A multi-fidelity surrogate model by optimal model selection. In Proceedings of the International Conference on Automation Control, Algorithm, and Intelligent Bionics (ACAIB 2023), paper 127593F.

Mahajan, H. and Banerjee, S. (2023). Acoustic emission source localisation for structural health monitoring of rail sections based on a deep learning approach. Measurement Science and Technology, 34(4):044010.

Majidnia, S., Rudlin, J., and Nilavalan, R. (2014). Investigations on a pulsed eddy current system for flaw detection using an encircling coil on a steel pipe. Insight – Non-Destructive Testing and Condition Monitoring, 56(10):560–565.

Malekzadeh, M., Atia, G., and Catbas, F. N. (2015). Performance-based structural health monitoring through an innovative hybrid data interpretation framework. Journal of Civil Structural Health Monitoring, 5(3):287–305.

Myers, R. H., Montgomery, D. C., and Anderson-Cook, C. M. (2016). Response Surface Methodology: Process and Product Optimization Using Designed Experiments. Wiley, Hoboken, NJ, 4th edition.

Nafiah, F., Abidin, A. F. Z., Dzulkarnain, M. A., Abdullah, A. H., Yusoff, M. Z., and Jasni, M. Z. (2020). Pulsed eddy current: Feature extraction enabling in-situ calibration and improved estimation for ferromagnetic application. Journal of Nondestructive Evaluation, 39(3):40.

Nayebi, A., Karami, M., Gupta, P., Benesty, J., and Anjum, A. (2022). WindowSHAP: An efficient framework for explaining time-series classifiers based on Shapley values. arXiv preprint arXiv:2211.06507.

Park, G. S. and Park, S. H. (2004). Analysis of the velocity-induced eddy current in MFL type NDT. IEEE Transactions on Magnetics, 40(2):663–666.

Park, J. S. and Sandberg, I. W. (1991). Universal approximation using radial-basis-function networks. Neural Computation, 3(2):246–257.

Parvatharaju, P. S., Gupta, A., Gatterbauer, W., et al. (2021). Learning saliency maps to explain deep time series classifiers. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM), pages 3852–3856. ACM.

Piao, G., Li, Y., Tian, G. Y., Yin, W., and Wang, R. (2020). The effect of motion-induced eddy current on high-speed magnetic flux leakage (MFL) inspection for thick-wall steel pipe. Research in Nondestructive Evaluation, 31(1):48–67.

Ponikvar, D., Zupanič, E., and Jeglič, P. (2023). Magnetic interference compensation using the adaptive LMS algorithm. Electronics, 12(5):2360.

Qiu, X., Hu, X., Sun, W., and Chen, C. (2020).
Acoustic emission propagation characteristics and damage source localization of asphalt mixtures. Construction and Building Materials, 252:119074.

Rao, K. S. S., Rao, B. P. C., and Thirunavukkarasu, S. (2017). Development of pulsed eddy current instrument and probe for detection of sub-surface flaws in thick materials. IETE Technical Review, 34(5):572–578.

Rifai, D., Tian, G. Y., Sophian, A., and Al-Turki, Y. A. (2016). Giant magnetoresistance sensors: A review on structures and non-destructive eddy current testing applications. Sensors, 16(6):823.

Sause, M. G. R. (2011). Investigation of pencil-lead breaks as acoustic emission sources. Technical Report 29-184, University of Augsburg, Institute for Physics, Experimental Physics II, Augsburg, Germany.

Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2016). Grad-CAM: Visual explanations from deep networks via gradient-based localization. International Journal of Computer Vision, 128:336–359.

Sen, D. and Nagarajaiah, S. (2018). Data-driven approach to structural health monitoring using statistical learning algorithms. In Proceedings of the 9th European Workshop on Structural Health Monitoring, pages 295–305. European Workshop on Structural Health Monitoring.

Sereda, O. and Korol, O. (2022). The external magnetic field modeling features of electrical complexes and systems before and after its compensation. Bulletin of the National Technical University "KhPI". Series: Energy: Reliability and Energy Efficiency, 2022(4).

Shao, S., Wang, P., and Yan, R. (2019). Generative adversarial networks for data augmentation in machine fault diagnosis. Computers in Industry, 106:85–93.

Shawi, R. E., Li, J., Yan, Q., Navab, N., and Albarqouni, S. (2019). Interpretability in healthcare: A comparative study of local machine learning interpretability techniques. In 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS), pages 275–280. IEEE.

Shi, W., Zhu, R., and Li, S. (2022). Pairwise adversarial training for unsupervised class-imbalanced domain adaptation. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pages 1276–1285. ACM.

Shu, L., Songling, H., and Wei, Z. (2007). Development of differential probes in pulsed eddy current testing for noise suppression. Sensors and Actuators A: Physical, 135(2):675–679.

Simonyan, K. and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.

Singh, A., Sengupta, S., and Lakshminarayanan, V. (2020). Explainable deep learning models in medical image analysis. Journal of Imaging, 6(6):52.

Singh, P. and Sharma, A. (2022). Interpretation and classification of arrhythmia using deep convolutional network. IEEE Transactions on Instrumentation and Measurement, 71:1–12.

Sophian, A., Tian, G. Y., and Fan, M. (2017). Pulsed eddy current non-destructive testing and evaluation: A review. Chinese Journal of Mechanical Engineering, 30(3):500–514.

Sorin, V., Barash, Y., Konen, E., and Klang, E. (2020). Creating artificial images for radiology applications using generative adversarial networks (GANs) – a systematic review. Academic Radiology, 27(8):1175–1185.

Stein, M. L. (1999). Interpolation of Spatial Data: Some Theory for Kriging. Springer Series in Statistics. Springer, New York, NY.

Strantza, M., Bianchi, G., Mahato, B., Mencattelli, L., Weaver, P., Dulieu-Barton, J. M., and Potter, K. (2015).
Evaluation of SHM system produced by additive manufacturing via acoustic emission and other NDT methods. Sensors, 15:26709–26725.

Sun, J., Li, K., Wang, M., Hou, Z., Li, X., and Lin, Z. (2021). Domain adaptation with geometrical preservation and distribution alignment. Neurocomputing, 454:152–167.

Sun, L., Chen, J., Xu, Y., Gong, M., Yu, K., and Batmanghelich, K. (2022). Hierarchical amortized GAN for 3D high resolution medical image synthesis. IEEE Journal of Biomedical and Health Informatics, 26(9):3966–3975.

Sun, R. (2020). Optimization for deep learning: An overview. Journal of the Operations Research Society of China, 8(2):249–294.

Takagi, T., Fukutomi, H., and Tani, J. (1998). Numerical evaluation of correlation between crack size and eddy current testing signal by a very fast simulator. IEEE Transactions on Magnetics, 34(5):2581–2584.

Ulapane, N., Mi, J., Zhu, Y., and Long, M. (2018). Non-destructive evaluation of ferromagnetic material thickness using pulsed eddy current sensor detector coil voltage decay rate. NDT & E International, 100:108–114.

van der Maaten, L. and Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9:2579–2605.

Verstrynge, E., Schueremans, L., Van Gemert, D., and Wevers, M. (2021). A review on acoustic emission monitoring for damage detection in masonry structures. Construction and Building Materials, 268:121089.

Vincent, P., Larochelle, H., Bengio, Y., and Manzagol, P.-A. (2008). Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning (ICML), pages 1096–1103. ACM.

Wang, H., Li, P., Lang, X., Tao, D., Ma, J., and Li, X. (2023). FTGAN: A novel GAN-based data augmentation method coupled time–frequency domain for imbalanced bearing fault diagnosis. IEEE Transactions on Instrumentation and Measurement, 72:1–14.

Wang, J., Wen, S., Yao, Y., Zhao, X., Gao, K., Gao, L., and Li, P. (2021). Discriminative feature alignment: Improving transferability of unsupervised domain adaptation by Gaussian-guided latent alignment. Pattern Recognition, 116:107936.

Wang, R., Zhang, Y., Li, Y., Tian, G. Y., and Yin, W. (2020). Motion induced eddy current based testing method for the detection of circumferential defects under circumferential magnetization. International Journal of Applied Electromagnetics and Mechanics, 64(1-4):501–508.

Wang, Z., Yan, W., and Oates, T. (2016). Time series classification from scratch with deep neural networks: A strong baseline. arXiv preprint arXiv:1611.06455.

Weiss, K., Khoshgoftaar, T. M., and Wang, D. (2016). A survey of transfer learning. Journal of Big Data, 3(1):9.

Wevers, M. and Lambrighs, K. (2009). Applications of acoustic emission for SHM: A review. In Chang, F.-K. and Sohn, H., editors, Encyclopedia of Structural Health Monitoring, pages 289–302. Wiley.

Wickstrøm, K., Kampffmeyer, M., and Jenssen, R. (2020). Uncertainty-aware deep ensembles for reliable and explainable predictions of clinical time series. IEEE Journal of Biomedical and Health Informatics, 25:2435–2444.

Wilson, J. W. and Tian, G. Y. (2006). 3D magnetic field sensing for magnetic flux leakage defect characterisation. Insight – Non-Destructive Testing and Condition Monitoring, 48(6):357–359.

Wu, J., Xing, D., Song, H., Lin, X., and Chen, W. (2020). Acoustic emission signal classification using feature analysis and deep learning neural network. Fluctuation and Noise Letters, 19:2150030.

Wu, Y. and Huang, X. (2022).
Unsupervised reinforcement adaptation for class-imbalanced text classification. arXiv preprint arXiv:2205.13139.

Xiao, N. and Zhang, L. (2021). Dynamic weighted learning for unsupervised domain adaptation. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15237–15246. IEEE.

Zahir, M. K. and Gao, Z. (2013). Variable-fidelity optimization with design space reduction. Chinese Journal of Aeronautics, 26(4):841–849.

Zaini, M. H., Hussin, F. J., Mohamed, Z. A., Yusof, M. H., and Marghany, M. (2021). Extraction of flux leakage and eddy current signals induced by submillimeter backside slits on carbon steel plate using a low-field AMR differential magnetic probe. IEEE Access, 9:146755–146770.

Zhang, J. et al. (2019). A comparative study between magnetic field distortion and magnetic flux leakage techniques for surface defect shape reconstruction in steel plates. Sensors and Actuators A: Physical, 288:10–20.

Zhang, Y., Lee, J. D., Wainwright, M. J., and Jordan, M. I. (2017). On the learnability of fully-connected neural networks. In Singh, A. and Zhu, J., editors, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), volume 54 of Proceedings of Machine Learning Research, pages 83–91. PMLR.

Zhang, Y., Li, G., Wang, H., Wang, S., and Liu, Y. (2022). Multi-scale signed recurrence plot based time series classification using inception architectural networks. Pattern Recognition, 123:108402.

Zhang, Y., Yang, Y., Liu, G., Wu, Q., and Bai, X. (2021). Grad-CAM helps interpret the deep learning models trained to classify multiple sclerosis types using clinical brain magnetic resonance imaging. Journal of Neuroscience Methods, 353:109095.

Zhao, H., Zhang, W., Li, X., Yang, X., and Liu, Y. (2023). A MFL mechanism-based self-supervised method for defect detection with limited labeled samples. IEEE Transactions on Instrumentation and Measurement, 72:1–10.

Zhiye, D., Jiangjun, R., and Shifeng, Y. (2005). 3D MFL of steel pipe computation based on nodal-edge element coupled method. In Proceedings of the 6th International Conference on Electromagnetic Field Problems and Applications (ICEF), pages 252–256. IEEE.

Zhong, K.-Z. and Chen, J. (2023). Experimental applying acoustic emission to fault diagnosis and prediction of autonomous devices. In Proceedings of the 2023 Sixth International Symposium on Computer, Consumer and Control (IS3C), pages 1–4. IEEE.

Zhu, Y., Sheng, Q., and Han, M. (2020). Effect of laser polarization on fiber Bragg gratings Fabry-Perot interferometer for ultrasound detection. IEEE Photonics Journal, 12(4):1–9.

Appendix A

To visualize the distribution of synthetic (augmented) and original datasets in a lower-dimensional space, we employed t-Distributed Stochastic Neighbor Embedding (t-SNE) van der Maaten and Hinton (2008). t-SNE is a powerful technique for visualizing high-dimensional data by mapping each datapoint to a two- or three-dimensional space. While t-SNE was originally designed for static data, it has been adapted for use with time series data in some cases. Visualizing AE data can be challenging due to its complexity and high dimensionality. However, t-SNE can map time series data onto a low-dimensional space while preserving its underlying structure. To apply t-SNE to time series data, we first need to transform the sequential nature of the data into a set of fixed-length feature vectors that can be used as input to t-SNE.
This can be done using various techniques such as sliding windows or feature extraction methods like Fourier or wavelet transforms. Once we have transformed the time series data into feature vectors, we can compute pairwise similarities between them using a Gaussian kernel:

p_{i,j} = \frac{\exp\left(-\|x_i - x_j\|^2 / 2\sigma^2\right)}{\sum_{k} \sum_{l \neq k} \exp\left(-\|x_k - x_l\|^2 / 2\sigma^2\right)}   (1)

where x_i and x_j are two feature vectors, \sigma is a parameter that controls the width of the Gaussian kernel, and p_{i,j} is the probability that x_i would pick x_j as its neighbor if neighbors were picked in proportion to their probability density under a Gaussian centered at x_i. Next, we compute pairwise similarities between points in the low-dimensional map using a Student-t distribution:

q_{i,j} = \frac{\left(1 + \|y_i - y_j\|^2\right)^{-1}}{\sum_{k} \sum_{l \neq k} \left(1 + \|y_k - y_l\|^2\right)^{-1}}   (2)

where y_i and y_j are two points in the low-dimensional map, and q_{i,j} is the probability that y_i would pick y_j as its neighbor if neighbors were picked uniformly at random from all other points. Finally, t-SNE minimizes the difference between these two distributions using gradient descent on a cost function that measures their divergence:

KL(P \| Q) = \sum_i \sum_j p_{i,j} \log \frac{p_{i,j}}{q_{i,j}}   (3)
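In practice these steps are handled by standard implementations; the sketch below uses scikit-learn's TSNE on placeholder feature vectors for the original and GAN-augmented datasets, producing the kind of two-dimensional embedding plotted in Figures 46 and 47.

```python
import numpy as np
from sklearn.manifold import TSNE

real_feats = np.random.randn(200, 64)    # placeholder original-data features
synth_feats = np.random.randn(200, 64)   # placeholder GAN-augmented features

X = np.vstack([real_feats, synth_feats])
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(X)

real_2d, synth_2d = emb[:200], emb[200:]  # plot by color (blue vs. red)
```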
WGAN shows a unique pattern in which the augmented data form distinct clusters that encompass the original data points, suggesting that it captures the overall distribution well but may over-segment the data space. DCGAN and TSAGAN both show good integration of augmented and original data, with TSAGAN exhibiting a slightly more uniform distribution. In the last image, noise-based augmentation produces a distinct pattern in which the augmented data form concentric circles around the original data points, indicating that a simple additive-noise approach does not capture the underlying data distribution as effectively as GAN-based methods. These visualizations highlight the effectiveness of GAN-based augmentation techniques in generating high-quality synthetic data that closely mimics the original dataset. The increasing overlap and similarity in distribution between original and augmented data points across the different GAN architectures demonstrate their capability to produce diverse yet representative samples, providing a robust foundation for deep learning models and ensuring balanced representation across all labels.

Figure 46 t-SNE visualizations at six training epochs (1 to 2000) for the standard GAN, illustrating how synthetic (red) points progressively converge toward the distribution of the original (blue) data in low-dimensional space

Figure 47 t-SNE visualization of synthetic and original datasets for the different augmentation methods

APPENDIX B

It is important to emphasize that Lamb waves in thin plates (or thin-walled structures) are inherently dispersive, meaning that the phase and group velocities vary with frequency Ciampa and Meo (2010); Joseph (2020). As a result, for each frequency component there can be multiple wave modes (commonly labeled S0, A0, S1, A1, etc.) whose velocities also depend on the product of frequency and plate thickness. For many practical applications, especially in the low-to-medium frequency range, the fundamental modes S0 (symmetric) and A0 (antisymmetric) dominate the wavefield. Moreover, the phase velocity c_p in Equation (5) should be interpreted as c_p(f), i.e., a function of frequency f, further reinforcing the dispersive nature of Lamb waves. This dispersion can lead to significant spreading of wave packets during propagation, which is an important consideration when analyzing signals in structural health monitoring or acoustic emission testing.

Below is an example of deriving the dispersion curves. The first step is to define the material properties and parameters, which are needed for the subsequent calculations of the phase and group velocities.

Table 1 Material Properties and Parameters

Quantity          Name   Expression
Young's modulus   E      206 GPa
Poisson's ratio   ν      0.3
Density           ρ      2700 kg/m³
Thickness         d      0.1 inch

The dispersion relation for Lamb waves in a thin plate is given by the following transcendental equation:

\tan(\beta d) = \frac{4\beta^2 k^2 - 2k^2\beta^2}{(2\beta^2 - k^2)(k^2 - \beta^2)}  (4)

where β is the wavenumber in the material, k is the wavenumber in the fluid, and d is the thickness of the plate. The wavenumbers are defined as

\beta = \frac{\omega}{c_p} \quad \text{and} \quad k = \frac{\omega}{c_f}  (5)

where ω is the angular frequency, c_p is the phase velocity in the material, and c_f is the phase velocity in the fluid. To find the dispersion curves, the dispersion relation is solved numerically: a range of frequencies is considered, and for each frequency the relation is solved with a bisection method to obtain the corresponding phase velocity.
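The following Python sketch illustrates this frequency sweep. It assumes the transcendental relation exactly as printed in Equation (4) and an illustrative fluid phase velocity c_f (its value is not specified in the text); it is a minimal demonstration of root bracketing plus bisection-type refinement, not the implementation used in this work.

```python
import numpy as np
from scipy.optimize import brentq

# Parameters from Table 1; thickness converted from 0.1 inch to meters.
d = 0.1 * 0.0254
c_f = 1480.0  # assumed fluid phase velocity (m/s); not given in the text

def dispersion_residual(c_p, f):
    """Residual of Equation (4): zero when (f, c_p) lies on a branch."""
    omega = 2.0 * np.pi * f
    beta, k = omega / c_p, omega / c_f          # Equation (5)
    lhs = np.tan(beta * d)
    rhs = (4.0 * beta**2 * k**2 - 2.0 * k**2 * beta**2) / (
        (2.0 * beta**2 - k**2) * (k**2 - beta**2))
    return lhs - rhs

freqs = np.linspace(0.05e6, 2.0e6, 400)         # 0-2 MHz sweep
c_grid = np.linspace(500.0, 10000.0, 2000)      # candidate phase velocities
phase_velocity = np.full_like(freqs, np.nan)

for i, f in enumerate(freqs):
    res = np.array([dispersion_residual(c, f) for c in c_grid])
    # Locate the first sign change, then refine the bracketed root by
    # bisection-type iteration (Brent's method). Note: sign changes at
    # tan() poles are spurious and would need filtering in practice.
    idx = np.where(np.sign(res[:-1]) != np.sign(res[1:]))[0]
    if idx.size:
        j = idx[0]
        phase_velocity[i] = brentq(dispersion_residual,
                                   c_grid[j], c_grid[j + 1], args=(f,))

# Group velocity v_g = d(omega)/dk (Equation (6)), approximated by
# finite differences along the computed branch, with k = omega / c_p.
omega = 2.0 * np.pi * freqs
k_branch = omega / phase_velocity
group_velocity = np.gradient(omega, k_branch)
```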
This numerical sweep is carried out separately for each mode of interest (e.g., S0 and A0). Once the phase velocity c_p(f) is obtained, one can directly observe how the velocity changes with frequency, yielding the dispersion curves for the plate. The thickness d = 0.1 inch (approximately 2.54 mm) ensures that the lower-order modes are typically the most relevant in this frequency range. Furthermore, to assess energy-transport properties, the group velocity can be calculated from the derivative of ω with respect to k. In practice, the group velocity is critical for interpreting wave arrival times and energy propagation in nondestructive testing applications such as acoustic emission. Because of the strong frequency dependence, both the phase- and group-velocity curves must be considered when analyzing wave propagation in thin plates. The group velocity is calculated from the phase velocity using the following formula:

v_g = \frac{d\omega}{dk}  (6)

where v_g is the group velocity, ω is the angular frequency, and k is the wavenumber. The derivative is approximated using finite differences, as in the sketch above. The dispersion curves are then plotted as phase/group velocity versus frequency, with the symmetric and antisymmetric modes plotted separately.

Figure 48 Dispersion curves of two fundamental Lamb wave modes (blue and red) plotted as velocity versus frequency

The phase (or group) velocity dispersion curves for the two Lamb wave modes as a function of frequency in the 0–2 MHz range are shown in Figure 48. The blue curve exhibits a noticeable drop in velocity around the mid-frequency range, then rises again toward higher frequencies, indicating strong dispersion.

APPENDIX C

Multi-fidelity methods have been applied across various engineering domains to improve modeling accuracy and computational efficiency. Some approaches focus on enhancing visualization and interactivity in complex, heterogeneous systems, particularly for groundwater or fluid flows, by combining analytical elements with intuitive 3D user interfaces. Others leverage surrogate-based optimization tools to handle industrial design problems, using trust-region or expected-improvement methods to achieve effective trade-offs between solution quality and computational cost. Neural network and kriging-based techniques also appear, in which inexpensive (low-fidelity) models provide "knowledge" that refines the training of more accurate (high-fidelity) surrogates, significantly reducing the number of expensive evaluations. In structural applications, ensemble machine learning has demonstrated high accuracy for bond strength predictions, while multi-level frameworks can manage different model complexities at each stage of aerodynamic or structural design. To tackle high-dimensional problems, some strategies pinpoint smaller, promising regions using low-fidelity data, then refine those regions with high-fidelity samples, saving considerable resources. In addition, machine learning-based multi-fidelity approaches can maintain robust performance even when high-fidelity data are scarce, provided the low-fidelity sources capture the relevant trends. Finally, context-aware Monte Carlo methods on high-performance computing (HPC) systems show how speedups of several orders of magnitude can be achieved by balancing training costs and variance-reduction strategies. A minimal sketch of the low-to-high-fidelity correction idea common to several of these approaches is given below; the surveyed studies are then summarized in Table 2.
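The sketch below (Python, assuming scikit-learn) fits a Gaussian process to abundant low-fidelity samples and a second GP to the discrepancy at a few high-fidelity points. This is a simplified additive variant of the Kennedy–O'Hagan correction with the scaling factor fixed at one, not the specific formulation of any study in Table 2; the one-dimensional functions and sample locations are hypothetical.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Hypothetical 1-D models: a cheap LF trend and an expensive HF "truth".
def lf_model(x):
    return np.sin(8.0 * x)

def hf_model(x):
    return 1.1 * np.sin(8.0 * x) + 0.2 * x

x_lf = np.linspace(0.0, 1.0, 40)[:, None]                  # many LF samples
x_hf = np.array([0.05, 0.30, 0.55, 0.80, 0.95])[:, None]   # few HF samples

# Step 1: GP surrogate of the abundant low-fidelity data.
gp_lf = GaussianProcessRegressor(kernel=ConstantKernel() * RBF())
gp_lf.fit(x_lf, lf_model(x_lf.ravel()))

# Step 2: GP on the LF-to-HF discrepancy (additive correction, rho = 1).
delta = hf_model(x_hf.ravel()) - gp_lf.predict(x_hf)
gp_delta = GaussianProcessRegressor(kernel=ConstantKernel() * RBF())
gp_delta.fit(x_hf, delta)

# Multi-fidelity prediction: LF trend plus the learned correction,
# so the few HF samples only need to resolve the (smoother) discrepancy.
x_test = np.linspace(0.0, 1.0, 200)[:, None]
y_mf = gp_lf.predict(x_test) + gp_delta.predict(x_test)
```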
Table 2 Summary of multi-fidelity approaches in various engineering applications

Kalivarapu and Winer (2008)
Aims/Focus: Develop a multi-fidelity framework for modeling advective and diffusive transport; enhance visualization and interactivity for complex, heterogeneous groundwater flows.
Analysis Technique/Procedure: Utilizes the Superblock Analytical Element Method (AEM); offers an intuitive 3D GUI for setup and visualization; desktop/VR deployment.
Major Findings/Limitations: More accurate representation of heterogeneous flows than FD/FE methods; user-friendly scenario creation; may require significant computational setup.

Kontogiannis et al. (2020)
Aims/Focus: Compare multi-objective, multi-fidelity surrogate-based optimization tools; evaluate industrial aerodynamic design cases.
Analysis Technique/Procedure: Trust-region vs. expected-improvement approach; comparison with single-fidelity and co-Kriging; complex aerodynamic test cases.
Major Findings/Limitations: Trust-region yields quick solutions with good Pareto efficiency; expected improvement explores better; performance depends on HPC resources.

Leary et al. (2003)
Aims/Focus: Reduce computational costs via multi-fidelity approaches; incorporate cheap-model knowledge into training.
Analysis Technique/Procedure: Neural networks and knowledge-based kriging; uses cheap model outputs as prior information; LF data used to improve HF learning.
Major Findings/Limitations: Knowledge-based kriging matches neural alternatives; significant computational savings; success depends on LF model quality.

Chen et al. (2021)
Aims/Focus: Develop data-driven ensemble ML models for CFRP-steel bond strength; compare ensemble methods vs. traditional ML.
Analysis Technique/Procedure: Ensemble learning (GBDT, RF); comparison of multiple ML algorithms; variable-importance analysis.
Major Findings/Limitations: Best accuracy (R² = 0.98); effective ensemble methods; requires a large dataset.

Kampolis and Giannakoglou (2008)
Aims/Focus: Propose a multilevel framework for aerodynamic optimization; integrate different evaluation software.
Analysis Technique/Procedure: Multilevel structure with varying strategies; metamodel-assisted evolution; gradient-based refinement.
Major Findings/Limitations: Efficient Pareto-front search; good solution refinement; complex implementation.

Zahir and Gao (2013)
Aims/Focus: Reduce the design space for high-dimensional optimization; use an LF model for region refinement.
Analysis Technique/Procedure: LF model identifies promising regions; LF-to-HF surrogate modeling; iterative update strategy.
Major Findings/Limitations: 39% computational cost reduction; high-fidelity results in refined regions; limited to specific problems.

Chen and Feng (2022)
Aims/Focus: Improve structural behavior prediction; address limited HF data challenges.
Analysis Technique/Procedure: ML-based multi-fidelity method; LF-HF data integration; RC beam case study.
Major Findings/Limitations: Enhanced prediction accuracy; depends on LF data quality; additional calibration needed.

Farcaş et al. (2023)
Aims/Focus: Present a context-aware multi-fidelity Monte Carlo method; optimize training vs. sampling costs; handle multiple LF models.
Analysis Technique/Procedure: Multi-fidelity Monte Carlo with variance reduction; cost-balanced training; HPC plasma simulation.
Major Findings/Limitations: Speedup from 72 days to 4 hours; HPC viability; efficient variance reduction; HPC-specific application.