AI-ENABLED KNOWLEDGE TRANSFER AND LEARNING FOR NONDESTRUCTIVE EVALUATION TOWARD INTELLIGENT AND ADAPTIVE SYSTEMS

By

Xuhui Huang

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Electrical Engineering—Doctor of Philosophy

2025

ABSTRACT

Critical infrastructure integrity and reliability, from composites to high-voltage feeder pipes and railway tracks, demand precise, robust, and explainable nondestructive evaluation (NDE) techniques. In this thesis, the growing need for accurate damage detection, localization, and characterization under conditions involving high-speed inspections, varying sensor positions, and electromagnetic interference is addressed through a suite of data-driven, physics-informed, and domain-adaptive methodologies. The proposed framework integrates deep learning, advanced signal processing, and multi-fidelity modeling to address key challenges such as limited and unbalanced experimental datasets, mismatched simulation-to-field conditions, and signal distortion from environmental noise. First, a novel hybrid deep learning architecture is introduced that combines Generative Adversarial Networks (GANs) with Inception-based neural networks for coordinate-based Acoustic Emission (AE) source localization. This method achieves significant reductions in localization estimation errors, enabling reliable single-sensor monitoring. Complementing this effort, explainable deep convolutional neural network models are proposed for AE signal classification. Guided by physics-informed signal segmentation and advanced visualization techniques such as Class Activation Mapping (CAM) and Gradient-weighted CAM (Grad-CAM), these models illuminate the underlying mechanisms of Lamb wave mode interactions, thereby instilling trust and interpretability into the machine learning pipeline. Domain adaptation and transfer learning are central to this work. Specifically, the gap between abundant simulated data (source domain) and limited experimental measurements (target domain) is bridged to ensure that feature representations learned from large-scale synthetic datasets can be effectively transferred and fine-tuned. By integrating physics-informed constraints and knowledge transfer, the resulting models exhibit higher accuracy, are less prone to overfitting, and maintain interpretability in varied scenarios. A multi-fidelity Gaussian Process Regression (GPR) strategy is further presented for motion-induced eddy current testing (MIECT) to manage both forward (signal prediction) and inverse (defect estimation) problems under in-service inspection, sensor motion, and environmental noise. These GPR-based surrogates fuse low-fidelity finite element simulations with high-fidelity experimental data, accurately predicting sensor responses at inspection speeds exceeding typical laboratory conditions and enabling robust inverse estimations of defect geometries. In parallel, a novel auto-compensation algorithm for Pulsed Eddy Current (PEC) inspections is developed to address electromagnetic interference in feeder lines carrying high currents, significantly improving thickness estimation accuracy and reducing false-positive indications in underground pipeline applications.

Copyright by XUHUI HUANG 2025

ACKNOWLEDGEMENTS

I am deeply grateful to my advisor, Professor Dr. Yiming Deng, for his exceptional mentorship throughout my doctoral studies. His expert guidance helped me navigate numerous challenges in NDE applications.
Beyond technical supervision, he fostered my academic development by offering opportunities to lead diverse projects and guiding me through problem formulation, model development, and research publication. His passion for research, commitment to teaching excellence, and professional approach have profoundly shaped my academic journey. As committee chair, his direction was extremely important in developing my research scope. I extend my sincere thanks to my committee members, Professors Dr. Ming Han, Dr. Lalita Udpa, Dr. Satish Udpa, and Dr. Tapabrata Maity, for their generous support and guidance. Professor Maity's constructive feedback shaped the GAN-based data augmentation work, complemented by Professor Satish Udpa's insights on pretrained networks, knowledge transfer, and motion-induced eddy current physics. Professor Han provided crucial guidance on test design and AE configurations, while Professor Lalita Udpa's expertise enhanced my understanding of NDE challenges and experimental protocols, refining both equipment requirements and thesis content. Moreover, in the Electrical and Computer Engineering department, I found an extraordinary community that has significantly shaped my academic development. To my friends and colleagues at the Nondestructive Evaluation Laboratory (NDEL), I am profoundly thankful for your friendship, for introducing me to compelling real-world research problems, and for contributing to my scientific growth. Finally, I am profoundly indebted to my family for their support and belief in my pursuits, which has made every goal seem achievable. Their constant encouragement and faith in my abilities have been the bedrock of my journey, inspiring me to push beyond perceived limitations.

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
1.1 Motivation & Objectives
1.2 Scope of Research

CHAPTER 2 DOMAIN ADAPTIVE FRAMEWORK FOR SIMULATION TO EXPERIMENT KNOWLEDGE TRANSFER
2.1 Domain-Adaptive Framework
2.2 Experimental Setup
2.3 Simulation Setup
2.4 Imbalanced and Limited Training Data
2.5 Transfer Learning and Pre-Training on Simulation Data
2.6 Domain Adaptation
2.7 GAN-Based Data Augmentation
2.8 Results and Discussion

CHAPTER 3 MULTI-FIDELITY SURROGATE MODELING FOR EFFICIENT SIMULATION EXPERIMENT INTEGRATION
3.1 Introduction
3.2 Motivation and Objectives
3.3 Forward and Inverse Problem Formulations
3.4 Multi-Fidelity Surrogate Framework
3.5 Computational Complexity and Trade-Offs
3.6 Results and Discussion
3.7 Conclusion
CHAPTER 4 COMPENSATION IN PULSED EDDY CURRENT TESTING VIA SURROGATE MODELING
4.1 Introduction
4.2 Challenges
4.3 Analytical Model
4.4 Compensation Method via Surrogate Model
4.5 Field Tests and Results
4.6 Conclusions

CHAPTER 5 PHYSICS GUIDED EXPLAINABLE NETWORKS FOR AE CLASSIFICATION
5.1 Introduction
5.2 Physics-informed AE Segmentation
5.3 Signal Preprocessing
5.4 Explainable CNN Architecture
5.5 Improved Interpretability and Performance

CHAPTER 6 CONCLUSIONS AND FUTURE WORK
6.1 Conclusions and Contribution
6.2 Future Work

BIBLIOGRAPHY

APPENDIX

CHAPTER 1
INTRODUCTION

1.1 Motivation & Objectives

In the past decades, there has been a growing interest in developing more effective Structural Health Monitoring (SHM) and Nondestructive Evaluation (NDE) techniques for critical infrastructure systems. Aging bridges, composite panels, high-voltage feeder lines, and railway tracks require thorough inspection to ensure early detection of damage and long-term reliability. However, as these infrastructures become increasingly geometrically complex and operate under demanding conditions, such as in-service inspections at high speeds, varying load levels, and harsh environmental noise, traditional inspection approaches frequently fail to identify initial defects or predict crack propagation rates with sufficient accuracy. This not only risks safety and reliability but also increases maintenance costs, highlighting the need for data-driven methodologies capable of adapting to real-world operational scenarios. Among various diagnostic methods, Acoustic Emission (AE) techniques exhibit great potential for characterizing micro-crack formation, delamination, and other damage phenomena in both metallic and composite materials. In practice, however, analyzing AE signals involves confronting multiple complexities, including multi-modal wave propagation (with mode conversions and dispersive behaviors), non-stationary signal patterns, and ambient operational noise. These factors impede accurate identification of AE characteristics and risk obscuring nascent failure indicators. Recent innovations in deep learning frameworks, such as convolutional and recurrent neural architectures, offer transformative potential for SHM and NDE by automatically extracting salient features from high-dimensional, noisy signals.
This alleviates many longstanding challenges associated with the manual engineering of signal descriptors, especially when addressing rare or subtle fault conditions. Nonetheless, the limited availability of labeled experimental data continues to pose a formidable hurdle. In many critical infrastructures, it is often costly to gather a sufficient quantity of in situ measurements, and for rare or infrequent failure modes, such data may be virtually unattainable. Consequently, there has been a marked rise in approaches grounded in transfer learning, domain adaptation, data augmentation (e.g., Generative Adversarial Networks), and multi-fidelity modeling, where simplified finite element analyses are fused with high-fidelity experimental data to form robust, cost-effective detection models. Simultaneously, the "black box" perception of deep learning algorithms has hindered their deployment in safety-critical applications, such as aerospace or nuclear power generation. In these applications, it is critical not only to obtain accurate predictions but also to understand and explain the underlying mechanism. Explainable AI tools, such as Class Activation Mapping and its variants, enable visualization of the internal decision-making process, aligning model interpretations with physical wave phenomena (e.g., Lamb wave modes). Transparency in this process builds confidence and facilitates regulatory approval for AI-powered inspection systems. Ultimately, our research aims to elevate SHM from a reactive practice to a proactive, predictive methodology. By uniting AI-based algorithms with physics insights and domain-adaptive knowledge transfer, the present work seeks to advance damage detection, curtail false alarms, and strengthen interpretability across diverse applications.

1.2 Scope of Research

In this dissertation, we investigate data-driven approaches for enhancing the accuracy and interpretability of AE-based monitoring and other NDE methods by addressing two core challenges: limited experimental data and imbalanced label distributions. Such challenges commonly arise in real-world structural health monitoring scenarios, where rare crack initiation events, combined with demanding test conditions, often constrain the quantity and quality of available labeled datasets. The objective is to develop robust, physics-informed methods that enable more reliable damage detection, localization, and characterization across various materials, including metals, composites, and electromagnetically active structures. Chapter 1 reviews the motivation and objectives for next-generation SHM and previews the organizational structure of this dissertation. In particular, it introduces the fundamental issues related to data scarcity, class imbalance, and domain adaptation needs. Chapter 2 presents the Domain-Adaptive Framework, illustrating how domain adaptation and transfer learning can bridge discrepancies between simulated datasets (source domain) and limited experimental data (target domain) in AE applications. Through strategies such as fine-tuning pre-trained networks, minimizing domain mismatches, and implementing GAN-based data augmentation, the chapter demonstrates how reliance on extensive experimental data can be mitigated. Chapter 3 describes the Multi-Fidelity Surrogate Modeling approach for Motion-Induced Eddy Current Testing, integrating low-fidelity simulations with high-fidelity experimental measurements.
Radial Basis Function scaling, Gaussian Process Regression, and feature discretization are employed to enhance both forward signal predictions and inverse defect estimations, while ensuring computational efficiency suitable for real-time applications. Chapter 4 outlines a novel compensation method for Pulsed Eddy Current (PEC) Testing in environments where strong, spatially varying magnetic fields from high-current power lines adversely affect thickness readings. Leveraging surrogate modeling and finite element simulations, this correction algorithm significantly reduces false positives by compensating for electromagnetic interference, as evidenced by field validations in working pipeline segments. Chapter 5 introduces Physics-Guided Explainable Networks for AE signal classification, focusing on explainable AI (XAI) techniques such as Class Activation Mapping (CAM), Gradient-weighted CAM (Grad-CAM), and Dispersion-Compensated CAM (DCAM). By correlating these visualization methods with Lamb wave segmentation, the chapter demonstrates how time-frequency regions critical to model predictions can be clearly identified, thereby enhancing interpretability and fostering confidence among domain experts. Chapter 6 summarizes the conclusions and contributions while outlining future research directions. This chapter explores potential developments in domain-adaptive frameworks, multi-fidelity modeling and signal processing, and compensation methods. An appendix follows the bibliography. In summary, these chapters combine numerical simulations, laboratory experiments, advanced signal processing, and deep learning to expand the frontiers of AE-based and electromagnetic-based structural health monitoring. Through robust domain adaptation, multi-fidelity integration, compensation strategies, and interpretable neural networks, this dissertation aspires to provide actionable insights, reduce inspection costs, and increase diagnostic certainty in support of mission-critical industries such as aerospace, nuclear power, and civil infrastructure.

CHAPTER 2
DOMAIN ADAPTIVE FRAMEWORK FOR SIMULATION TO EXPERIMENT KNOWLEDGE TRANSFER

Acoustic Emission (AE) provides real-time capabilities for detecting and localizing structural damage Wevers and Lambrighs (2009). This monitoring technique employs surface-mounted sensors to detect stress waves propagating through the structure. Figure 1 illustrates how these captured signals are processed by data acquisition systems to identify both the source and characteristics of AE events. AE monitoring is distinguished by its exclusive sensitivity to active damage processes, enabling real-time detection of evolving defects. The continuous, real-time characteristics enable dynamic structural integrity assessment, providing crucial early warning of developing damage across multiple engineering applications Bouzid et al. (2015); Holford et al. (2009). This facilitates the detection of micro-cracking, fiber breakage, delamination, and other early damage mechanisms long before they become visually apparent Bhuiyan and Giurgiutiu (2017). This diagnostic capability aids in preventing catastrophic failures, enhancing maintenance strategies, and extending infrastructure service life Sen and Nagarajaiah (2018); Chadha et al. (2023); Malekzadeh et al. (2015). Technological advances in sensing hardware and signal acquisition have substantially improved AE detection sensitivity and reliability.
Recent research has focused on robust sensor placement, novel waveguide designs, and miniaturized wireless devices capable of operating under harsh environments Dong et al. (2018).

Figure 1 AE technique

For experimental validation, this thesis utilizes fiber-optic sensing technology, with the standardized pencil-lead break (PLB) test serving as the controlled acoustic emission source Sause (2011); De Almeida et al. (2015); Hashim et al. (2021), as shown in Figure 2. The fiber-optic sensor records the elastic waves generated by the event; such sensors have been extensively applied for measuring a variety of parameters including temperature, strain, pressure, vibration, and acoustics due to their high detection sensitivity and small size Zhu et al. (2020). AE source localization has advanced from traditional time-difference-of-arrival (TDOA) methods to identification frameworks, enabling localization with reduced sensor array complexity. Existing data analysis methods employ techniques such as time-domain, frequency-domain, and time-frequency analysis to extract key features from signals Takagi et al. (1998); Eaton et al. (2012). Advanced instrumentation and digital acquisition systems provide enhanced signal quality and timing precision, enabling detailed damage characterization across metals and composites Ebrahimkhanlou and Salamone (2017b,a, 2018c); Mahajan and Banerjee (2023).

Figure 2 PLB test as AE source and fiber coil sensor as sensor

Deep learning has revolutionized AE signal analysis, offering powerful tools for damage detection Ebrahimkhanlou and Salamone (2018c); Mahajan and Banerjee (2023). Deep learning architectures, particularly convolutional and recurrent neural networks, enable single-sensor source localization by automatic feature extraction from complex AE waveforms Ebrahimkhanlou and Salamone (2018a,b); Ebrahimkhanlou et al. (2019). Researchers have also explored data augmentation, physics-informed modeling, and noise reduction strategies to tackle the inherent challenges of AE signals, which are highly susceptible to noise and environmental factors Assarar et al. (2015); Ai et al. (2021). Such methods refine AE interpretation, characterizing impact-induced acoustic events Daugela et al. (2021a) and facilitating accurate detection of wave reflections Haile et al. (2020a). Efforts to integrate AE with complementary SHM modalities, leverage probabilistic frameworks for source identification, and adopt adaptive models have led to greater reliability and applicability across broad operational conditions Jung et al. (2019); Jones et al. (2022); Garrett et al. (2022); Verstrynge et al. (2021). In summary, AE-based SHM has evolved into a highly refined practice, combining advanced sensing, improved data acquisition, and cutting-edge signal interpretation methods. This has increased the technique's sensitivity and reliability, enabling it to detect subtle signs of material deformation, track damage progression in real time Wevers and Lambrighs (2009), and ultimately enhance the safety and longevity of critical engineering structures.

2.1 Domain-Adaptive Framework

A significant challenge in AE modeling is bridging the gap between simulation-trained models and real-world applications, where signal distributions, environmental noise, and wave propagation characteristics differ substantially Li et al. (2018).
To address this challenge, domain adaptation, transfer learning, and data augmentation techniques have emerged as critical strategies for improving robustness and generalization Sun (2020); Bengio (2012); Ismail-Fawaz et al. (2022). Domain adaptation focuses on aligning feature distributions between source (simulated) and target (experimental) domains, enabling models to perform well despite shifts in data characteristics Fawaz et al. (2018); Zhang et al. (2017); Weiss et al. (2016). Transfer learning leverages knowledge acquired from a related, abundant domain, often simulation-based, to expedite learning in a data-scarce target domain, thereby reducing the number of required experimental samples Ismail-Fawaz et al. (2022); Fawaz et al. (2018). Recent innovations have also highlighted the potential of generative adversarial networks (GANs) to overcome these domain disparities: GANs have emerged as a powerful tool for data augmentation, offering significant potential for enhancing time series signal analysis Ebrahimkhanlou et al. (2019). By producing synthetic AE signals that closely mimic real data characteristics Sun et al. (2022), GAN-based augmentation mitigates the imbalance and scarcity of experimental data, thereby improving model training. Furthermore, GANs can significantly expand training datasets Sun et al. (2022), an especially valuable feature when dealing with minority damage classes and irregular signal patterns Shao et al. (2019); Sorin et al. (2020); Wang et al. (2023); Jain et al. (2018). Such synthetic data generation has proven effective in related fields, including machine fault diagnosis and medical imaging, where GAN-augmented training sets improve performance and address data limitations Shao et al. (2019); Sorin et al. (2020). Nonetheless, challenges such as training instability and mode collapse Jain et al. (2018) require careful architectural refinements and training protocols. To address these domain adaptation challenges for AE source localization, researchers have explored approaches that combine feature extraction techniques Kats and Volkov (2020, 2019); Liu et al. (2017a,b), unsupervised adaptation strategies Wu and Huang (2022); Jiang et al. (2020); Xiao and Zhang (2021); Shi et al. (2022); Hsu et al. (2015), and transfer learning paradigms. For example, unsupervised domain adaptation with imbalanced cross-domain data is tackled by Hsu et al. (2015) through Closest Common Space Learning (CCSL), ensuring robust learning despite limited labeled targets. Similarly, reinforcement learning-based domain adaptation can learn robust domain-invariant representations Wu and Huang (2022), while sampling-based implicit alignment approaches help to mitigate the impact of class imbalance Jiang et al. (2020). By integrating these methodologies (domain-adaptive feature extraction, GAN-based augmentation, and transfer learning), recent studies have reported improved AE localization accuracy and reduced training data requirements Li et al. (2018); Sun (2020); Bengio (2012). Techniques such as fine-tuning on simulated data allow pre-trained models to generalize well to variations in acoustic emission signals Ismail-Fawaz et al. (2022), ultimately guiding the practical deployment of single-sensor monitoring systems and advancing structural health monitoring practices Bengio (2012).
To address this limitation, we propose an integrated domain-adaptive framework incorporating three key strategies, as shown in Figure 3: (1) pre-training on simulated AE data to establish robust feature representations, (2) domain adaptation techniques to align simulated and experimental feature distributions, and (3) generative adversarial networks (GANs) for synthetic data augmentation of limited experimental datasets. This proposed domain-adaptive framework significantly enhances the accuracy and reliability of AE source localization and characterization in practical applications, demonstrating effective knowledge transfer between the simulation and experiment domains.

Figure 3 Overview of Advanced Methods for Bridging Simulation-Based AE Data with Experimental Data

2.2 Experimental Setup

The experimental setup used in this research centers on a novel fiber-optic coil-based AE sensing system. As shown in Figure 4, the arrangement features a grid-marked aluminum plate and a fiber-optic AE sensor developed in Liu et al. (2020). This sensor includes a fiber coil with two identical Fiber Bragg Gratings (FBGs) forming a low-finesse Fabry–Perot Interferometer (FPI). With an 8 mm outer diameter and 6 mm inner diameter, the sensor can be flexibly mounted onto the sample surface, ensuring high ultrasound sensitivity while adapting to environmental perturbations. Ultrasound waves induce refractive index variations within the coiled fiber, causing shifts in the FPI's spectral fringes. By employing a modified phase-generated carrier method Karim et al. (2021), the AE signal can be extracted with good linearity and high sensitivity regardless of the laser wavelength alignment with the sensor fringes.

Figure 4 Schematic of the fiber-optic coil-based Acoustic Emission (AE) sensing and monitoring system

Light from a narrow-linewidth, tunable diode laser passes through a circulator and polarization controller before reaching the FBG-FPI sensor. Reflected light returns through the circulator to a photodetector (PD). The output from the PD is processed through dual parallel paths, each mixed, filtered, and amplified to enhance the signal-to-noise ratio (SNR) over the frequency band of interest (50–500 kHz). These steps enable effective isolation of the AE signals from background noise and interference.

2.2.1 PLB Test and Impact Test

To generate AE signals, the widely accepted Hsu–Nielsen pencil lead break (PLB) test Sause (2011) is employed. As depicted in Figure 5(a), the aluminum plate is divided into a well-defined grid layout with multiple test points. Each designated location undergoes ten repeated PLB tests using a 2H mechanical pencil with a 0.5 mm diameter lead, providing a consistent and reproducible dataset. In addition to the PLB tests, impact-like signals were gathered by dropping steel balls (4.7 mm diameter) from a height of 25 mm at the same AE sensor location, as illustrated in Figure 5(b). The equipment and settings for this experiment mirrored those utilized for the PLB tests. The diversity in signal types and locations allows for evaluation of AE source localization performance. The recorded signals were distinguished and examined for AE source identification and localization using these procedures.
The experimental setup facilitated the collection of precise and accurate data, thereby enabling the evaluation of the proposed method's efficacy.

Figure 5 Experimental test setups: (a) PLB Test (b) Impact Test

2.2.2 AE Data Labeling and Classification

Figure 6 (a) Tests conducted on an aluminum plate that is segregated into nine identified zones (b) manual crack propagation test on the sample with three zones labeled

For this study, PLB tests, the standard Hsu–Nielsen method for AE signal generation Sause (2011), were conducted on a 2.54 mm (1/10 inch) thick aluminum plate measuring 0.30 × 0.30 m. The grid-marked approach and repeated testing across numerous points yield a robust set of AE signals crucial for training and validating the deep learning models. The plate was partitioned into nine distinct locations, as delineated in Figure 6(a). Each of the nine representative points, denoted by a red dot, underwent the PLB test ten times, using a 2H mechanical pencil with a 0.5 mm diameter lead. To associate AE waveforms with crack propagation levels, a 1-inch pre-crack notch simulating crack initiation was introduced, as shown in Figure 6(b). A 2H lead pencil, angled at 45°, was fractured on the plate's surface in ten repetitions at the crack tip to mimic AE generated during crack propagation, aiding in the classification of different crack stages. The propagation step size, denoted as $\Delta L$, is set at 0.1 inches. The tests strive for uniformity by breaking nearly identical lengths of pencil lead at the same angle to the surface. We aim to demonstrate the ability to discern different cracking levels through parametric analysis of AE signals.

2.3 Simulation Setup

2.3.1 Analytical Excitation Function

We characterize AE bursts consistently with the approach in Cuadra et al. (2015). We pre-train deep learning models using AE data obtained from the Finite Element Method (FEM). These simulations include both impact-type and PLB tests to improve source localization in the specimen Hamstad (2007). In the setup, the PLB source was positioned in the out-of-plane direction at a predefined location on the plate, with the sensor located an inch from the right and upper edges of the plate. This approach enables the production of pre-training data for DL models, fostering more accurate and efficient localization of AE sources in practice. In essence, simulated AE signals serve as effective training surrogates, allowing us to bolster the accuracy and efficiency of real-world AE signal-processing algorithms.

Figure 7 (a) Excitation function to simulate PLB test (b) Excitation function to simulate impact test

Two excitation functions were employed in the simulations: one for the PLB test shown in Figure 7(a) and another for the impact test shown in Figure 7(b). For the PLB test, the excitation signal $F_1(t)$ was chosen due to its gradual increase in amplitude, representing mechanical loading. The function is defined as:

$$F_1(t) = \begin{cases} -2t/t_1, & 0 < t < t_1 \\ -\cos\left(\pi(t - t_1)\right) - 1, & t_1 < t < t_2 \\ 0, & t_2 < t \end{cases} \tag{2.1}$$

In this expression, $t_1$ and $t_2$ define specific time intervals during which the signal ramps up and then decreases to zero, effectively simulating the mechanical loading process. For the impact test, the excitation signal $F_2(t)$ is expressed as:

$$F_2(t) = C e^{-\gamma t / t_0} \sin\left(\frac{4\pi}{1 + t_0/t}\right) \tag{2.2}$$

Here, $C$ denotes the initial amplitude, $\gamma$ is a damping factor, and $t_0$ represents a characteristic time, with the values shown in Table 2.1. This damped sinusoidal wave is commonly observed in impact tests, capturing the material response to mechanical loading. $F_1(t)$ and $F_2(t)$, coupled with FEM simulations, yield waveforms that closely approximate real AE signals in both PLB and impact scenarios; a numerical sketch of both excitation functions is given below.
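The following is a minimal numpy sketch of Eqs. (2.1)–(2.2) with the Table 2.1 values. Two details are assumptions on our part: the amplitude $C$ is set to 1, and the cosine argument in Eq. (2.1) is normalized by $(t_2 - t_1)$ so the pulse is continuous at $t_1$ and decays to zero at $t_2$, since the equation leaves its time scaling implicit.

```python
import numpy as np

# Parameter values from Table 2.1 (converted to seconds); C = 1 is an
# assumed amplitude for illustration.
T0, T1, T2, GAMMA, C = 5e-6, 6.5e-6, 7.5e-6, 1.85, 1.0

def f1(t):
    """PLB excitation, Eq. (2.1): linear ramp, cosine roll-off, then zero.
    The (t - T1)/(T2 - T1) normalization is an assumption that makes the
    pulse continuous at T1 and exactly zero at T2."""
    return np.piecewise(
        t,
        [t < T1, (t >= T1) & (t < T2), t >= T2],
        [lambda t: -2.0 * t / T1,
         lambda t: -np.cos(np.pi * (t - T1) / (T2 - T1)) - 1.0,
         0.0])

def f2(t):
    """Impact excitation, Eq. (2.2): exponentially damped sinusoid."""
    return C * np.exp(-GAMMA * t / T0) * np.sin(4.0 * np.pi / (1.0 + T0 / t))

t = np.linspace(1e-9, 20e-6, 2000)   # start slightly above 0: F2 divides by t
plb_pulse, impact_pulse = f1(t), f2(t)
```

Pulses of this form would then be applied as out-of-plane point loads in the FEM model to generate the simulated AE waveforms.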
Table 2.1 Analytical function parameter specification

Parameter        | Value (unit)
Young's modulus  | 206 (GPa)
Poisson's ratio  | 0.3
Density          | 2710 (kg/m³)
t0               | 5 (μs)
t1               | 6.5 (μs)
t2               | 7.5 (μs)
Decay rate γ     | 1.85

2.3.2 Wave Propagation

By applying external loads, tensile stresses develop, especially at the extremities. The resulting stress intensity factor, $K$, gauges the intensity of the stress field near the crack tip. In this research, we model crack growth using a step-by-step approach where the crack length increases by predetermined amounts. The rate and path of propagation are governed by factors such as the stress intensity range, maximum stress intensity, fracture toughness $K_{IC}$, and local stresses at the crack tip. The stress ratio $R = \sigma_{min}/\sigma_{max}$ and the differential stress intensity factor $\Delta K_I = (1 - R)K_I$ characterize the cyclic loading conditions associated with time-dependent energy dispersal from the crack. Table 2.2 details the relevant material properties. When the predicted growth rate aligns with the critical fracture toughness $K_{IC}$, a catastrophic fracture event is anticipated. Because direct measurement of $K$ is challenging due to the inherent stress singularity, alternative energy-based methods such as the J-integral and energy release rate become indispensable. These methods facilitate the stress field assessment near the crack tip, improving accuracy in evaluating fracture mechanics.

Table 2.2 Material Property

Name                       | Expression | Quantity
Young's modulus            | E          | 206 GPa
Poisson's ratio            | ν          | 0.3
Coefficient in Paris' Law  | A          | 1.4×10⁻¹¹
Exponent in Paris' Law     | m          | 3.1

Figure 8 Snapshots of Von Mises stress distribution in MPa obtained from fatigue loading FEM simulation as the crack propagates: (a) crack length of 0.1 inch (b) crack length of 1.5 inch (c) crack length of 3 inch

Snapshots of the stress distribution as the crack propagates to different lengths (0.1 inch, 1.5 inches, and 3 inches) are illustrated in Figure 8. These highlight how stress concentrations develop with increasing crack size. To further investigate the transient dynamics, time-domain simulations capture wave propagation, illustrating how AE waveforms are generated and how their characteristics change as the fracture progresses. Such simulations offer profound insights into crack behavior, underscoring the utility of FEM analyses in nondestructive evaluation and structural health monitoring.

2.4 Imbalanced and Limited Training Data

We rely on simulation data to develop training sets and enhance deep learning models, yet real-world fatigue testing continues to play a critical role. In this research, crack lengths are categorized into three stages, as shown in Figure 9: initiation (0–1 inch), stable propagation (1–2 inches), and unstable propagation (2–3 inches), to classify AE signals more effectively. However, data imbalance, particularly in the 2–3 inch range where rapid failure occurs, presents a common machine learning challenge. While we do not show it here, earlier analyses highlighted the need for balanced datasets or data augmentation techniques to ensure that DL models effectively learn from all stages of crack evolution; a short sketch of this three-stage labeling and last-class reduction follows.
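To make the class construction concrete, the sketch below bins crack lengths into the three stages and thins the last class by a reduction factor, mirroring the imbalanced setup of Figure 9(b). The helper name and the synthetic crack-length array are hypothetical; only the stage boundaries and reduction factors come from the text.

```python
import numpy as np

# Hypothetical crack lengths (inches) attached to AE events; in practice
# these come from the fatigue-loading FEM simulations and experiments.
rng = np.random.default_rng(0)
crack_lengths = rng.uniform(0.0, 3.0, size=600)

# Three-stage labeling used in Section 2.4:
#   0: initiation (0-1 in), 1: stable (1-2 in), 2: unstable (2-3 in)
labels = np.digitize(crack_lengths, bins=[1.0, 2.0])

def reduce_last_class(labels, reduction_factor, rng):
    """Drop a fraction of the last-class samples to emulate the imbalance
    of Figure 9(b) (reduction factors of 0.5 and 0.9 on the last class)."""
    last = labels.max()
    idx_last = np.flatnonzero(labels == last)
    drop = rng.choice(idx_last, size=int(reduction_factor * idx_last.size),
                      replace=False)
    return np.setdiff1d(np.arange(labels.size), drop)   # indices to keep

keep_idx = reduce_last_class(labels, reduction_factor=0.9, rng=rng)
print(np.bincount(labels[keep_idx]))   # roughly [200, 200, 20]
```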
By coupling FEM simulations with controlled experiments, we have developed a coherent framework that not only enhances AE source localization through DL pre-training but also provides deeper insights into fracture mechanics. This approach underscores the potential of simulation-based pre-training. Ultimately, the integration of FEM simulations and experimental validation paves the way for improved structural health monitoring, fostering more reliable, data-driven fracture assessments in complex materials and components.

Figure 9 (a) Crack lengths divided into three categories (b) Imbalanced dataset with reduction factors of 0.5 and 0.9 on the last class

2.5 Transfer Learning and Pre-Training on Simulation Data

In the context of AE signal classification via deep learning, CNN filters act as universal pattern detectors in AE waveforms, similar to how shapelets identify distinctive acoustic signatures. AE waveforms show patterns such as burst emissions, reflected waves, and attenuative phenomena. Convolutional filters, initially optimized for micro-crack detection, can serve as pretrained feature extractors for related AE signal classification tasks. These transferable features include characteristic wave shapes, frequency components, and amplitude variations that are common across different types of material failure mechanisms. This transfer learning approach enables parameter optimization in similar detection tasks by applying the insights obtained from pre-trained models to new data. A variety of deep learning (DL) models, including Convolutional Neural Network (CNN), Fully Convolutional Network (FCN), Encoder, Residual Network (ResNet), Inception, and Multi-Layer Perceptron (MLP), were assessed for their ability to analyze simulated datasets and to extract underlying features using a layer-wise fine-tuning strategy. The employed methodology entailed signal acquisition from the simulated datasets, followed by data preprocessing, feature extraction via fine-tuned DL models, and finally classification based on Acoustic Emission (AE) source location. To analyze the impact and PLB test simulations, six DL models with distinct architectures and capabilities were evaluated. This strategy leads to a broader comprehension of the data, permitting the recognition of patterns and features that would be overlooked when using a single model. The CNN architecture consists of two sequential convolutional blocks, each containing a 1D convolution layer followed by instance normalization, dropout, and max pooling. The extracted hierarchical features are flattened before passing through a SoftMax classifier, with the network trained using categorical cross-entropy loss and the Adam optimizer Simonyan and Zisserman (2014). FCN shares similarities with CNN but substitutes max pooling with global average pooling to minimize spatial information loss, compacting spatial information into a 1D vector before passing to the SoftMax classifier Zhang et al. (2017). ResNet employs residual blocks to address the vanishing gradient issue by adding the input directly to stacked convolutional layers, utilizing batch normalization and L2 regularization, with each residual block containing two Conv1D layers followed by batch normalization and activation, before average pooling and transmission to the SoftMax classifier He et al. (2016).
Encoder mirrors CNN's convolutional blocks but implements ReLU activation and instance normalization, followed by an attention mechanism that weights feature maps to focus on relevant features before flattening and classification Vincent et al. (2008). MLP replaces convolutional layers with two dense layers incorporating dropout for regularization, flattening the input time series before processing and utilizing SoftMax activation for classification Delashmit and Manry (2005). Inception employs an inception module featuring parallel branches of 1x1, 3x3, and 5x5 convolutions and max pooling, concatenating their outputs to form the inception module, implementing batch normalization and dropout post-module before flattening and transmitting to the SoftMax classifier Zhang et al. (2022).

Figure 10 Schematic and structure of knowledge transfer via deep transfer learning

We demonstrate the method's effectiveness by enhancing DNN performance in acoustic emission (AE) source localization in Huang et al. (2023). The approach begins with pretraining a DNN on extensive simulated AE data to establish baseline signal feature recognition. Following this, we fine-tune the DNN on a smaller experimental dataset, a process that facilitates the network's learning of features specific to the experimental data. Our process involves first transferring layers from a pretrained model and subsequently freezing their parameters. As new AE monitoring data is processed, it passes through these frozen layers before progressing through the trainable layers, allowing us to localize the AE source. Owing to the intrinsic connection between simulation and experimental data, the feature extractor can be applied to the latter, incorporating it as a non-adjustable layer in our model. We designate the high-level features extracted from these layers as "bottleneck" features due to their condensed representation and their position at the constriction point preceding the classifier (as illustrated in Figure 10). The applied deep learning architecture comprises one of six classifiers, each consisting of multiple fully connected layers following global pooling. This design enables nonlinear mapping of bottleneck features to AE source localization. Additionally, a fusion layer is utilized to amalgamate extracted features, and an extra layer is employed to link bottleneck features to location predictions. During fine-tuning, the pretrained model's weights serve as the initial values, and the model undergoes further training with the available target domain data. As a consequence, the fine-tuned model can adapt to the target domain's unique characteristics, offering superior performance to a model trained from scratch. A minimal sketch of this freeze-and-fine-tune procedure follows.
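To make the recipe concrete, here is a minimal PyTorch sketch. The backbone follows the CNN description in Section 2.5 (two 1D conv blocks with instance normalization, dropout, and max pooling); the layer widths, checkpoint file name, and nine-class head are illustrative assumptions rather than the exact configuration used in the dissertation.

```python
import torch
import torch.nn as nn

# Backbone mirroring the CNN described above; widths are assumed values.
backbone = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=7), nn.InstanceNorm1d(16), nn.ReLU(),
    nn.Dropout(0.2), nn.MaxPool1d(2),
    nn.Conv1d(16, 32, kernel_size=5), nn.InstanceNorm1d(32), nn.ReLU(),
    nn.Dropout(0.2), nn.MaxPool1d(2),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),   # -> 32-dim "bottleneck" features
)
# backbone.load_state_dict(torch.load("sim_pretrained.pt"))  # FEM-pretrained

for p in backbone.parameters():              # freeze the transferred layers
    p.requires_grad = False

head = nn.Linear(32, 9)                      # trainable head: 9 source zones
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def fine_tune_step(x_exp, y_exp):
    """One fine-tuning step on (scarce) experimental AE waveforms."""
    with torch.no_grad():                    # frozen bottleneck extraction
        feats = backbone(x_exp)
    loss = loss_fn(head(feats), y_exp)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```

Because only the head is trained, each step touches a few hundred parameters rather than the full network, which is what makes fine-tuning viable on small experimental datasets.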
2.6 Domain Adaptation

The goal of domain adaptation is to learn features that are invariant to both the source and unannotated target domains, thus bridging the distribution gap between them. This strategy enables precise localization without dependence on labeled training data, resulting in beneficial data augmentation. The proposed approach efficiently achieves unsupervised domain adaptation of time series by utilizing a substantial amount of unlabeled data for training. In this section, we present a domain adaptation approach, introduced in Huang et al. (2024a), consisting of three main components: a feature extractor $f$, a source classifier $g$, and a target domain classifier $h$. The feature extractor functions as a four-layer neural network that transforms AE waveforms into 64-dimensional features, utilizing ReLU activation, batch normalization, and dropout for optimal regularization. The source classifier processes these extracted features through a network with one hidden layer to predict class probabilities across 9 categories using softmax activation, while the domain classifier employs a similar structure with sigmoid activation to determine whether inputs belong to the target domain. These components are integrated and trained using a combined loss function that balances classification accuracy and domain adaptation objectives, where the feature extractor is shared between the source classifier and domain classifier to enable simultaneous optimization. The training workflow progresses from dataset preparation through fold division, followed by concurrent training of the source classifier and feature extractor, and culminates in the training of a target classifier on combined source and target data to achieve domain invariance. During this workflow, we begin by preparing the dataset and dividing it into training and testing folds. Next, we design and train a source classifier $g$ alongside a feature extractor $f$ using a combined loss function that balances classification and domain adaptation goals. Finally, we initialize and train the target classifier $h$ on combined source and target data, tracking its performance over multiple epochs to assess how well domain invariance is achieved.

Figure 11 High-Level Overview of Domain Adaptation Workflow: From Data Preparation to Evaluation and Implementation

Figure 11 outlines the workflow, which unfolds in three distinct phases: Data Preparation and Cross-Validation, Model Initialization and Training, and Evaluation and Implementation. Initially, data is prepared and processed through 10-fold cross-validation. The model architecture is then designed by selecting $i$ relevant features. Training of the source classifier $g$ begins, optimizing a combined loss function of Maximum Mean Discrepancy (MMD) and classification loss to enable domain adaptation. This training proceeds until the loss is sufficiently minimized. Subsequently, the target classifier $h$ is initialized, trained on combined source and target data, and its performance is methodically recorded via test accuracy and confusion matrices after each epoch, facilitating a comprehensive assessment of the model's domain adaptation efficacy. We employ 10-fold cross-validation, which includes two key preprocessing steps: feature normalization to standardize measurement scales, and feature selection to identify key predictors. Each fold maintains balanced class distributions and representative sampling, ensuring reliable model evaluation and creating a solid foundation for subsequent training phases. The initial training phase focuses on the source domain dataset $\{X_{source}, Y_{source}\}$, where we simultaneously optimize the feature extractor $f$ and classifier $g$. The training process minimizes the classification loss $L_{cls}$ through gradient descent, updating the parameters $\theta_f$ and $\theta_g$ of both networks according to:

$$\theta_f \leftarrow \theta_f - \eta \cdot \nabla_{\theta_f} L_{cls}, \qquad \theta_g \leftarrow \theta_g - \eta \cdot \nabla_{\theta_g} L_{cls} \tag{2.3}$$

To align the feature distributions of the source and target domains, we incorporate the Maximum Mean Discrepancy (MMD) metric. Conceptually, MMD measures the difference between the mean embeddings of source and target samples in a Reproducing Kernel Hilbert Space (RKHS), formulated as:

$$\mathrm{MMD} = \left\| \frac{1}{n} \sum_{i=1}^{n} \phi(x_i) - \frac{1}{m} \sum_{j=1}^{m} \phi(y_j) \right\|_{\mathcal{H}}^{2} \tag{2.4}$$

It is designed to reduce the difference between the distributions, facilitating domain adaptation. Here, $x_i$ and $y_j$ are samples from the source and target distributions, respectively; $\phi$ represents the feature map into the RKHS, transforming the samples into a high-dimensional space where the mean difference is computed; $n$ and $m$ denote the numbers of samples in the source and target datasets, respectively; and $\|\cdot\|_{\mathcal{H}}^{2}$ denotes the squared RKHS norm, measuring the distance between the mean embeddings of the two distributions.
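In practice, the RKHS norm in Eq. (2.4) is expanded with the kernel trick so that only kernel evaluations are needed. The sketch below uses a Gaussian (RBF) kernel, a common choice; the dissertation does not specify its kernel or bandwidth, so both are assumptions here.

```python
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    """Gaussian kernel k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(xs, xt, sigma=1.0):
    """Biased estimate of Eq. (2.4): squared MMD between source features xs
    (n x d) and target features xt (m x d), expanded with the kernel trick:
    E[k(xs, xs')] - 2 E[k(xs, xt)] + E[k(xt, xt')]."""
    return (rbf_kernel(xs, xs, sigma).mean()
            - 2.0 * rbf_kernel(xs, xt, sigma).mean()
            + rbf_kernel(xt, xt, sigma).mean())

# Toy check: identical distributions give MMD^2 near 0, shifted ones do not.
rng = np.random.default_rng(0)
xs = rng.normal(size=(200, 64))                    # 64-dim extractor features
print(mmd2(xs, rng.normal(size=(200, 64))))        # ~0
print(mmd2(xs, rng.normal(1.0, 1.0, (200, 64))))   # clearly > 0
```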
Moreover, the model undergoes adjustments to minimize both the classification loss on the source domain and the domain discrepancy, striving for a balance between classification accuracy and domain invariance through a joint parameter update. The investigation focuses on this trade-off by fixing $\lambda = 100$ and considering the number of epochs $M$ as a variable. The optimization of the feature extractor's parameters considers the effects of both $L_{cls}$ and MMD. The combined loss function guiding this process is:

$$L_{total} = L_{cls} + \lambda \cdot \mathrm{MMD} \tag{2.5}$$

$$\theta_f, \theta_g \leftarrow \text{update based on } \nabla L_{total} \tag{2.6}$$

When training our domain adaptation model, we balance minimizing the classification loss ($L_{cls}$) against the Maximum Mean Discrepancy loss (MMD). Initially, the model prioritizes $L_{cls}$ reduction, which may not immediately impact MMD. As training progresses and domain features begin to align, reducing MMD, there is a risk of the model becoming overly generalized, potentially diminishing its classification sharpness. The model strives to learn domain-invariant features through MMD minimization while preserving its performance on source data. Striking the right balance is critical; focusing too heavily on one aspect can lead to overfitting or underfitting. Figure 12 illustrates the training dynamics of our model, showcasing three metrics: average total loss, average MMD, and average classification loss over 100 epochs, using 10-fold cross-validation. These metrics reveal the delicate interplay between the competing goals of domain adaptation. The shaded regions in the graphs indicate variance across cross-validation folds, highlighting the inherent uncertainties in model training with diverse data subsets. Early in training, the model primarily reduces classification loss on the source domain, which may not affect MMD immediately. As training progresses, MMD minimization improves alignment of the source and target feature distributions, but an excessive focus on MMD can reduce classification specificity.

Figure 12 Training Dynamics of a Domain Adaptation Model: (a) Average Total Loss, (b) Average Maximum Mean Discrepancy (MMD), and (c) Average Classification Loss over 100 Epochs with 10-fold cross-validation

Once the feature extractor $f$ is sufficiently trained to learn domain-invariant features, we initialize a separate target classifier $h$, parameterized by $\theta_h$. The feature extractor $f$ is then used to transform the target data $X_{target}$. The classifier $h$ is trained by optimizing $\theta_h$ to minimize the combined loss function $L_{combined}$, ensuring predictions on transformed target data align with the true labels:

$$\theta_h^{(t+1)} = \theta_h^{(t)} - \eta \cdot \nabla_{\theta_h} L_{combined}\big(\theta_h^{(t)}\big) \tag{2.7}$$

where $\eta$ is the learning rate. The loss compares predictions from $h(f(\cdot))$ against the true labels:

$$L_{combined}(\theta_h) = \mathrm{loss}\big(h(f(X_{source}, X_{target})), (Y_{source}, Y_{target})\big) \tag{2.8}$$

An L2-regularization term $\frac{\varepsilon}{2}\|\theta_h\|^2$ is added to combat overfitting. This ensures target classification improves without overfitting to any peculiarities in the target set. Overall, this domain adaptation framework integrates data processing, source-target alignment, and target classifier refinement to produce a versatile, robust model capable of handling distributional shifts across domains. A condensed sketch of the two-stage training procedure follows.
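The following is a condensed PyTorch sketch of the two-stage procedure in Eqs. (2.5)–(2.8). The shapes mirror the text (64-dimensional features, 9 classes, $\lambda = 100$); the input width, the optimizer settings, and the linear-kernel stand-in for the MMD term are assumptions of this sketch, not the exact implementation.

```python
import torch
import torch.nn as nn

# Condensed networks; the text describes a four-layer extractor, so these
# widths (and the 1024-sample input) are illustrative assumptions.
f = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.BatchNorm1d(256),
                  nn.Dropout(0.3), nn.Linear(256, 64))   # feature extractor f
g = nn.Linear(64, 9)                                     # source classifier g
h = nn.Linear(64, 9)                                     # target classifier h
ce = nn.CrossEntropyLoss()
lam = 100.0                                              # lambda fixed at 100

def mmd2_linear(a, b):
    """Linear-kernel stand-in for Eq. (2.4): squared distance between
    feature means (the kernel estimator sketched earlier also works)."""
    return ((a.mean(0) - b.mean(0)) ** 2).sum()

# Stage 1: jointly update f and g with L_total = L_cls + lambda*MMD
# (Eqs. 2.5-2.6), using labeled source and unlabeled target batches.
opt_fg = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)
def stage1_step(x_src, y_src, x_tgt):
    zs, zt = f(x_src), f(x_tgt)
    loss = ce(g(zs), y_src) + lam * mmd2_linear(zs, zt)
    opt_fg.zero_grad(); loss.backward(); opt_fg.step()
    return loss.item()

# Stage 2: train h on features from the (now domain-invariant) extractor;
# weight_decay supplies the (epsilon/2)||theta_h||^2 term, and batches
# combine source and target data as in Eq. (2.8).
opt_h = torch.optim.Adam(h.parameters(), lr=1e-3, weight_decay=1e-4)
def stage2_step(x, y):
    with torch.no_grad():
        z = f(x)                       # f is kept fixed in this stage
    loss = ce(h(z), y)
    opt_h.zero_grad(); loss.backward(); opt_h.step()
    return loss.item()
```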
Figure 13 Architecture of the Generator and Discriminator networks in the GAN for AE signal augmentation

2.7 GAN-Based Data Augmentation

The GAN architecture for data augmentation, illustrated in Figure 13, involves a generator and a discriminator, both of which are sequential models. The problem can be stated as training a GAN to generate realistic data samples from an input latent space. Mathematically, the GAN consists of a generator $G$ and a discriminator $D$. The generator is a function that maps points in the latent space $z \sim p_z(z)$ to candidate data points $G(z)$, while the discriminator is a function that estimates the probability that a given data point is real (as opposed to generated). The generator and discriminator compete through alternating training steps, each optimizing their respective loss functions. The discriminator learns to maximize its ability to distinguish between real and generated samples, while the generator aims to produce samples that can fool the discriminator. This adversarial training process continues until the generator creates samples so convincing that the discriminator can no longer effectively differentiate between real and synthetic data. The discriminator is updated by maximizing

$$L_D = \mathbb{E}[\log D(x_{real})] + \mathbb{E}[\log(1 - D(G(z)))] \tag{2.9}$$

and the generator is then updated by minimizing

$$L_G = \mathbb{E}[\log(1 - D(G(z)))] \tag{2.10}$$

Moreover, to prevent mode collapse, we implement a collapse-detection threshold $\tau$: if $|L_D - L_G| > \tau$ for a certain number of consecutive iterations, we reinitialize the models. Hyperparameter tuning was conducted using grid search to optimize the learning rates for the discriminator and generator, respectively. The overall framework is a hybrid method: our study combines advanced data augmentation techniques with an adapted Inception architecture to enhance the accuracy and robustness of AE source localization in complex structures. The methodology integrates custom-designed GANs for data augmentation with an Inception network specifically adapted for regression tasks. Figure 14 illustrates the overall process flow. It commences with the collection of AE signals, each labeled with coordinates in the form of $(d_i, \theta_i)$ for 35 distinct positions. These signals undergo a training/test split. The training data is then augmented using four different GAN architectures (GAN, DCGAN, WGAN, and TSAGAN) to address dataset imbalance and scarcity issues. This multi-GAN augmentation approach is crucial as it generates synthetic AE signals that closely emulate the characteristics of real data, effectively expanding the dataset and improving the model's ability to generalize across various AE source locations. Each GAN variant offers unique strengths in data synthesis, allowing for a comprehensive augmentation strategy; a compact sketch of the adversarial updates appears below.
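To tie Eqs. (2.9)–(2.10) and the collapse threshold $\tau$ together, here is a compact PyTorch sketch of one alternating update. The fully connected shapes follow Table 2.3; $\tau$, the patience count, and the 1024-sample waveform length are assumed values, and the generator step uses the standard non-saturating surrogate commonly substituted for Eq. (2.10). Minimizing the binary cross-entropy terms below is equivalent to maximizing Eq. (2.9) for the discriminator.

```python
import torch
import torch.nn as nn

# Generator/discriminator widths follow Table 2.3; waveform length of
# 1024 samples, tau, and patience are assumptions of this sketch.
G = nn.Sequential(nn.Linear(100, 128), nn.LeakyReLU(0.2),
                  nn.Linear(128, 512), nn.LeakyReLU(0.2),
                  nn.Linear(512, 1024), nn.Tanh())
D = nn.Sequential(nn.Linear(1024, 512), nn.LeakyReLU(0.2),
                  nn.Linear(512, 64), nn.LeakyReLU(0.2),
                  nn.Linear(64, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()
tau, patience, bad_iters = 5.0, 20, 0

def gan_step(x_real):
    global bad_iters
    n = x_real.size(0)
    z = torch.randn(n, 100)
    # Discriminator: minimizing these BCE terms maximizes Eq. (2.9)
    d_loss = bce(D(x_real), torch.ones(n, 1)) + \
             bce(D(G(z).detach()), torch.zeros(n, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: non-saturating surrogate for minimizing Eq. (2.10)
    g_loss = bce(D(G(z)), torch.ones(n, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    # Collapse monitoring: reinitialize if the losses diverge for too long
    bad_iters = bad_iters + 1 if abs(d_loss.item() - g_loss.item()) > tau else 0
    if bad_iters >= patience:
        for net in (G, D):
            for m in net.modules():
                if isinstance(m, nn.Linear):
                    m.reset_parameters()
        bad_iters = 0
```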
The augmented training data from each GAN feeds into an Inception network specifically adapted for regression tasks, forming the core of our hybrid approach. This adaptation of the Inception network, originally designed for image classification, enables effective processing of AE signals across multiple scales, capturing both local and global features crucial for accurate localization. This trained network is then utilized to predict AE source locations from the test data. The architecture of the Inception network used for regression is illustrated in Figure 15 Huang et al. (2024b). It initiates with an input layer, followed by an Inception module that processes data through multiple parallel pathways. The Inception module is particularly adept at AE signal processing as it can simultaneously extract features at different scales, which is essential given the complex nature of AE waveforms. The outputs are concatenated and passed through batch normalization. A ReLU activation function is then applied, followed by Global Average Pooling to reduce spatial dimensions. Global Average Pooling is employed instead of traditional fully connected layers to minimize the number of parameters, mitigate overfitting, and maintain spatial information. Finally, a dense layer produces the output, predicting the AE source location as continuous coordinates.

Figure 14 Workflow of the hybrid network for AE source localization

Figure 15 Architecture of the Inception network for regression

Table 2.3 Specification for GAN variants

Feature                    | GAN                                               | DCGAN                                    | TSAGAN                                            | WGAN
Generator architecture     | Fully connected layers (128, 512, 1024)           | 1D transposed convolutions, dense layers | Fully connected layers (128, 512, 1024)           | Fully connected layers (128, 512, 1024)
Discriminator architecture | Fully connected layers (1024, 512, 64), LeakyReLU | 1D convolutions, dense layers, LeakyReLU | Fully connected layers (1024, 512, 64), LeakyReLU | 1D convolutions, dense layers, LeakyReLU
Loss function              | Binary cross-entropy                              | Binary cross-entropy                     | Binary cross-entropy                              | Wasserstein loss
Special features           | Model collapse monitoring                         | Model collapse monitoring                | Model collapse monitoring                         | Multiple critic updates, model collapse monitoring
Input (all variants)       | Noise vector
Output layer (all)         | Dense layer with tanh
Optimizer (all)            | Adam (learning rate = 0.0002, beta = 0.001)
Batch size (all)           | 64
Epochs (all)               | 2000

The four GAN architectures (GAN, DCGAN, TSAGAN, and WGAN) share a common foundation but differ in key aspects of their design and training approach, as shown in Table 2.3. The original GAN uses fully connected layers in both the generator and discriminator, with a structure of (128, 512, 1024) neurons for the generator and (1024, 512, 64) for the discriminator. It employs LeakyReLU activations throughout, with batch normalization (momentum 0.8) in both networks. DCGAN introduces convolutional layers, specifically using 1D transposed convolutions in the generator and 1D convolutions in the discriminator, which are particularly effective for capturing spatial or temporal patterns in the data. It typically uses ReLU activations in the generator and LeakyReLU in the discriminator, with a tanh activation in the final generator layer. TSAGAN, tailored for time series data, reverts to a fully connected architecture similar to the original GAN but is optimized for sequential data.
The key innovation of WGAN lies not in its architecture, which is similar to DCGAN with convolutional layers, but in its use of the Wasserstein loss function and weight clipping in the discriminator (now called a critic). This change allows for more stable training and potentially better-quality results. WGAN also typically includes dropout in the discriminator, a feature that is not present in the other architectures. All four models use the Adam optimizer with learning rate 0.0002 and beta 0.001, a batch size of 64, and are trained for 2000 epochs. They all incorporate model collapse monitoring, but WGAN stands out with its multiple critic updates per generator update. These architectural differences make each variant suitable for different types of data and training scenarios, with DCGAN and WGAN often performing well on complex, high-dimensional data. Wasserstein Distance is a powerful metric for comparing probability distributions because it captures how much "effort" it would take to move one distribution's mass to match another's. Unlike traditional measures such as Kullback–Leibler (KL) divergence and Jensen–Shannon (JS) divergence, Wasserstein Distance provides a well-defined gradient even when the two distributions do not overlap. This property is particularly beneficial in training GANs, where the generator aims to align its synthetic distribution with the real data distribution. By offering a continuous and smooth gradient signal, Wasserstein Distance helps mitigate common GAN training issues such as mode collapse and vanishing gradients. Essentially, it reflects the intuitive notion of how similar two datasets truly are by quantifying the minimum "transport cost" to convert one distribution into the other. As a result, it serves as a more stable and interpretable convergence metric, making it well-suited for GAN training and synthetic data quality assessment.

Figure 16 Comparison of Wasserstein Distance convergence across epochs for four GAN variants (GAN, TSAGAN, WGAN, and DCGAN)

Figure 16 shows the Wasserstein Distance between the original and synthetic datasets across different epochs for the four GAN variants, serving as a metric to quantify the similarity between the two distributions; lower values indicate higher similarity. The WGAN shows the most rapid convergence, achieving the lowest Wasserstein Distance of about 2.5 by epoch 100 and maintaining this level throughout training. The DCGAN and TSAGAN demonstrate similar convergence patterns, starting with high distances but steadily decreasing to around 3 by epoch 2000. The standard GAN, interestingly, shows the most volatile behavior, with an initial decrease followed by a spike at epoch 500, before eventually converging to a distance similar to DCGAN and TSAGAN by epoch 2000. This comparison reveals that while all GAN variants eventually achieve similar levels of data similarity, they differ significantly in their convergence paths. The WGAN's rapid and stable convergence suggests it may be the most efficient in generating high-quality synthetic data, despite its underperformance in the final localization task. The standard GAN's volatility indicates a need for careful monitoring during training, although it ultimately achieves competitive results. Overall, the results confirm that with adequate training, the GAN-based augmentation technique significantly enhances the dataset, providing a balanced and high-quality training set for deep learning models.
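The dissertation does not state how the distance in Figure 16 is computed; one lightweight possibility for monitoring real-versus-synthetic similarity is SciPy's one-dimensional wasserstein_distance applied to pooled signal amplitudes, sketched below purely as an illustrative stand-in.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# One simple way to track real-vs-synthetic similarity during GAN training,
# in the spirit of Figure 16. Flattening each waveform set into a single
# 1-D empirical amplitude distribution is an assumption of this sketch.
def dataset_distance(real_waveforms, fake_waveforms):
    return wasserstein_distance(np.ravel(real_waveforms),
                                np.ravel(fake_waveforms))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(100, 1024))   # stand-in AE waveforms
fake = rng.normal(0.2, 1.1, size=(100, 1024))   # stand-in GAN output
print(dataset_distance(real, fake))   # lower values = distributions closer
```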
2.8 Results and Discussion

In this chapter, we demonstrate the effectiveness of domain adaptation and transfer learning in addressing data-imbalance challenges in acoustic emission monitoring. Through the evaluation of multiple deep learning architectures and GAN-based data augmentation methods, we achieved significant improvements in source localization accuracy, with the hybrid GAN-Inception approach reducing the median localization error compared to baseline methods. These findings establish a robust framework for single-sensor AE monitoring systems, while highlighting the critical balance among feature selection, model architecture, and data augmentation strategies for optimal performance in structural health monitoring applications.

Regarding the effect of data imbalance and domain adaptation, Figure 17 shows F1-score performance under varying feature counts (10 to 100) and different imbalance levels in the final damage class (reduction factors of 0.1 to 0.9). DA models were trained for 30 epochs and compared to a baseline CNN without DA (dashed line). Models with 10-40 features show relative robustness against imbalanced data, maintaining higher F1 scores despite increasing reduction factors. Larger feature sets (50-100) exhibit sharper performance drops, indicating potential overfitting rather than improved adaptability.

Figure 17 F1 score for feature counts of 10 to 100 under data imbalance (reduction factor of 0.1 to 0.9 on the last class), domain adaptation versus CNN

The comparison with the baseline CNN is instructive: the CNN without domain adaptation deteriorates significantly under heavy imbalance, while DA models maintain higher F1 scores. Intermediate feature counts of 40 to 60 exhibit non-linear trends, suggesting an optimal range where the number of features aligns well with specific imbalance levels. These findings underscore the importance of calibrating feature dimensionality in DA models to handle class imbalance effectively.

For classifier performance with transfer learning, Figures 18 and 19 compare six classifiers (CNN, MLP, FCN, ResNet, Inception, Encoder), with and without transfer learning, on the Impact and PLB datasets. On the Impact dataset, CNN and MLP achieve accuracy above 0.8, and transfer learning yields small but consistent gains in precision and recall. FCN remains at ∼0.2 accuracy and sees minimal improvement from transfer learning. ResNet benefits the most, showing increased variability but higher scores with transfer learning. Inception sees minor gains, while Encoder performance is largely unchanged except for a slight precision boost.

Figure 18 For the Impact dataset, the distribution of (a) accuracy, (b) precision, and (c) recall from a 10-fold cross-validation for six classifiers on the Impact test dataset. Models without transfer learning are indicated by red bars, while those with transfer learning are shown in blue bars

On the PLB dataset, CNN and MLP again improve modestly with transfer learning. FCN's performance (<0.1 accuracy) degrades further when transfer learning is applied. ResNet shows significant gains, Inception drops slightly under transfer learning, and Encoder benefits marginally in all metrics. For the task of AE source localization with GAN-based data augmentation, Figure 20 compares two AE localization approaches on a 24″ × 24″ × 0.1″ aluminum plate: a hybrid deep learning model combining GAN-based augmentation with an Inception network, and an Inception network alone without augmentation.
The axes show dimensions in inches. In Figure 20(a), square markers (true locations) closely match star markers (predicted locations), indicating minimal deviations and robust performance across the plate. In Figure 20(b), errors grow toward the plate edges, indicating poorer generalization where training examples are fewer.

Figure 20 Comparison of Acoustic Emission (AE) source localization performance. (a) Results from the hybrid deep learning model with GAN-based data augmentation and Inception network. (b) Results from the Inception network alone without GAN-based augmentation

Figure 19 For the PLB dataset, the distribution of (a) accuracy, (b) precision, and (c) recall from a 10-fold cross-validation for six classifiers on the PLB test dataset. Models without transfer learning are indicated by red bars, while those with transfer learning are shown in blue

Table 2.4 Mean and variance of prediction errors for the raw dataset and augmentation approaches

Prediction error   Mean   Variance
Raw data           6.08   0.42
Noise              5.18   0.23
GAN                2.91   0.29
DCGAN              3.43   0.14
TSAGAN             3.64   0.27
WGAN               5.65   0.19

Table 2.4 compares the prediction error mean and variance for the raw dataset and five augmentation strategies: noise-based, GAN, DCGAN, TSAGAN, and WGAN. The raw dataset registers the highest mean error at 6.08 with a variance of 0.42, indicating relatively large and inconsistent deviations in predictions. Introducing simple noise-based augmentation reduces the mean error to 5.18 and lowers the variance to 0.23, reflecting an incremental but limited improvement in prediction accuracy. By contrast, all GAN-based methods yield more substantial benefits. Notably, the standard GAN attains the lowest mean error of 2.91, underscoring its efficacy in generating synthetic data that closely aligns with the real distribution. DCGAN, while not matching the standard GAN's overall mean error, exhibits the smallest variance of 0.14, suggesting greater consistency in predictions across diverse samples. TSAGAN and WGAN show intermediate performance, with TSAGAN recording a mean error of 3.64 (variance 0.27) and WGAN yielding 5.65 (variance 0.19). Although WGAN converges steadily during training, its final prediction accuracy here remains comparatively lower.

Finally, Figure 21 compares prediction errors for two principal scenarios: a baseline approach without GAN-based augmentation versus an enhanced approach incorporating GAN-generated synthetic data.

Figure 21 Comparison of prediction error distributions for the baseline model without GAN augmentation versus the GAN-augmented model

Without GAN augmentation, the distribution of errors exhibits larger variance, indicating higher uncertainty and less accurate overall predictions. The GAN-augmented model shows a significant decrease in the average prediction error and a more compact distribution, reflecting not only lower bias but also reduced variance. These outcomes strongly suggest that the synthetic samples produced by the GAN effectively enrich the training set, enabling the model to learn more discriminative features and improve localization performance. Consequently, the difference between these two scenarios underscores the utility of GAN-based data augmentation in achieving more reliable and precise predictions in the tested environment.
CHAPTER 3
MULTI-FIDELITY SURROGATE MODELING FOR EFFICIENT SIMULATION EXPERIMENT INTEGRATION

3.1 Introduction

Motion-Induced Eddy Current (MIEC) testing has emerged as a valuable nondestructive evaluation (NDE) modality for detecting surface and near-surface defects in high-speed metallic components. While the induction of eddy currents in conductive materials is traditionally associated with alternating magnetic fields, it can also occur through relative motion between a conductor and a magnetic field source. By definition, MIEC arises when a magnetic source moves relative to a conductive material, thereby generating velocity-dependent eddy currents Wang et al. (2020). These induced currents significantly alter the local magnetic field near defects, complicating signal interpretation, particularly at higher inspection speeds Wang et al. (2020); Li et al. (2006); Piao et al. (2020); Park and Park (2004). This phenomenon not only has practical applications in NDT but also holds fundamental theoretical significance in the broader field of electromagnetism. As Figure 22 conceptually illustrates, a sample traveling at velocity $\mathbf{v}$ in an applied field $\mathbf{B}_A$ experiences an electromotive force (emf) per unit charge given by

$$\boldsymbol{\varepsilon} = \mathbf{v} \times \mathbf{B}_A \quad (3.1)$$

The resulting conduction currents create secondary magnetic fields that oppose changes in the local flux, as per Lenz's law, and the induced current density is expressed as

$$\mathbf{J} = \sigma(\mathbf{v} \times \mathbf{B}) \quad (3.2)$$

Initially, MFL signals are governed primarily by the applied field $\mathbf{B}_A$ and the defect geometry. However, as inspection speed increases, the induced currents become stronger, distorting $\mathbf{B}_{MFL}$. In other words, $\mathbf{B}_{MFL}$ is no longer solely a function of the defect's geometry and $\mathbf{B}_A$ but also depends critically on velocity-induced eddy currents. Accurate defect characterization at higher speeds thus necessitates compensating for MIEC effects Li et al. (2006). For instance, Piao et al. (2020) reported that higher speeds and increased wall thickness reduce steel-pipe magnetization, resulting in more pronounced MFL signal distortions. To address such challenges, researchers have pursued compensation schemes or developed innovative data-integration methodologies aimed at improving defect detection under demanding operational conditions Wilson and Tian (2006); Zhang et al. (2019); Antipov and Markov (2018).

Figure 22 Illustration of the leakage field under the presence of velocity

Despite these challenges, MIECT has shown promise in high-speed inspection scenarios. Comparative studies with established methods like MFL have illustrated opportunities for enhanced defect detection, even at increased velocities Zhiye et al. (2005); Li et al. (2009). Numerical simulations employing techniques such as the Finite Element Method (FEM) have played a pivotal role in interpreting these interactions and guiding the development of improved detection strategies Han et al. (2014). These advances have been demonstrated in diverse applications, from pipelines and ferrite metals to high-speed rail systems, where speed-sensitive inspection solutions are urgently needed to maintain safety and operational efficiency Zhao et al. (2023); Bao (2023); Zaini et al. (2021); Li et al. (2022); Liu et al. (2023). In summary, MIEC testing has evolved into a complementary and increasingly indispensable technique in the NDE toolbox. Through a growing body of research, we now better understand how velocity-induced eddy currents shape defect signals.
Ongoing efforts to integrate advanced modeling, simulation methods, and hybrid datasets promise to further refine MIEC-based inspection strategies, ensuring safer, more reliable assessments of critical infrastructure at ever-increasing inspection speeds.

3.2 Motivation and objectives

Accurate defect characterization in high-speed transport systems is essential for safety and efficiency as inspection speeds increase. While field tests provide real-world accuracy, they are often limited by high costs, physical constraints, and safety considerations, resulting in small datasets. Simulations offer broader parameter exploration at lower cost but rely on idealized assumptions that may not fully capture real-world complexities. A significant concern in purely data-driven approaches is the accuracy of approximations under limited training data, particularly for complex physics-based processes. A promising solution lies in combining lower-fidelity (LF) simulation data with higher-fidelity (HF) experimental data. In this context, LF models might employ simplified physics or coarser finite element meshes, trading accuracy for computational efficiency. While these models are less accurate than the HF model, they are significantly cheaper to compute and can effectively sample parameter spaces where expensive experimental measurements are unavailable. This integrated approach leverages the comprehensive coverage of simulations while using experimental data to enhance prediction accuracy. However, the combination presents its own challenges. While LF data are abundant and easily obtainable, their lower accuracy could potentially compromise the model's generalization ability. The key lies in developing methods that effectively balance the quantity advantage of LF data against the accuracy of limited HF experimental data. To address this, we propose a multi-fidelity surrogate modeling framework that combines LF simulation and HF experimental data. This framework maximizes the advantages of both data sources, utilizing the broad parameter coverage of simulations while maintaining experimental realism. The proposed approach yields consistent predictions in both forward and inverse directions across different operating conditions, making it valuable for engineering applications where extensive testing data are difficult to collect.

3.3 Forward and Inverse Problem Formulations

In this section, we define the two core mathematical problems, namely the forward and inverse problems, central to defect characterization under varying operational conditions in Motion-Induced Eddy Current Testing (MIECT). Given the defect geometry and the inspection velocity, the forward problem seeks to predict the differential peak-to-peak amplitude $\Delta V_{pp}$ as the sensor response. Here, we define a function

$$f : \mathbb{R}^3 \to \mathbb{R}, \quad (w, d, v) \mapsto \Delta V_{pp} \quad (3.3)$$

where $w$, $d$, and $v$ represent defect width, defect depth, and velocity. In practice, $f$ could reflect multiphysics coupling involving motion-induced eddy currents, magnetic flux leakage, material nonlinearities, and sensor-specific transfer functions. While forward predictions are crucial for predicting the response signal so that sensor design can be optimized, another important task is to derive defect parameters from measured data. This motivates the inverse problem, in which we infer the defect geometry $(w, d)$ from $\Delta V_{pp}$ under given velocity conditions. Mathematically, we define an inverse mapping

$$g : \mathbb{R}^2 \to \mathbb{R}^2, \quad (\Delta V_{pp}, v) \mapsto (w, d) \quad (3.4)$$
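As a generic illustration of how these two mappings relate (and not the thesis's GPR-based inverse method, which is developed in Section 3.4), a forward surrogate can be inverted numerically by a least-squares search over defect geometry at a known velocity. The sketch below uses illustrative names and a hypothetical forward surrogate f_hat.

```python
# Hedged sketch: inverting a forward surrogate f_hat(w, d, v) -> dVpp by least
# squares; the chapter's actual inverse model is GPR-MFS-FD, not this search.
import numpy as np
from scipy.optimize import minimize

def invert_defect(f_hat, dvpp_measured, v, w0=1.0, d0=1.0):
    """Estimate defect (width, depth) whose predicted response matches dvpp_measured."""
    def residual(p):
        w, d = p
        return (f_hat(w, d, v) - dvpp_measured) ** 2
    result = minimize(residual, x0=np.array([w0, d0]), method="Nelder-Mead")
    return result.x  # estimated (w, d)
```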
In this study, surrogate models are utilized for both the forward and inverse problems to improve surface defect characterization in high-speed MIECT. The main objective is to develop a multi-fidelity surrogate framework that integrates high-fidelity experimental data from a rotational disc setup with low-fidelity finite element simulations to improve prediction.

3.4 Multi-Fidelity Surrogate Framework

3.4.1 Overview of Multi-Fidelity Concepts

High-fidelity (HF) models and data sources capture a broad range of real-world complexities, including multi-physics interactions, complex boundary conditions, and actual operating conditions Fernández-Godino et al. (2019). While HF models typically provide higher accuracy, they are expensive and often limited in number; obtaining high-fidelity data is challenging, resulting in small datasets. In contrast, low-fidelity (LF) data, typically from simulations, are abundant but less accurate due to simplifications such as dimensionality reduction, linearization, simplified physics models, coarser computational domains, and partially converged results, as shown in Figure 23. These simplifications make LF data easier and cheaper to obtain, though at the cost of accuracy.

Figure 23 Connection between HFM and LFM

Multi-fidelity models (MFMs) aim to bridge the gap between rapid computation and high accuracy Fernández-Godino et al. (2016). By introducing low-fidelity data into the training database, the issue of insufficient training samples can be addressed. However, the low accuracy of LF data may compromise the model's accuracy and generalization ability. The classification of data or models as HF or LF is determined by their ability to capture the underlying physical process rather than by their source (experimental tests, analytical models, or numerical simulations), and can only be determined relative to one another. A surrogate's accuracy depends on function complexity, experimental design, domain size, simulation accuracy, and sample availability. Field tests with properly calibrated sensors and rigorous testing protocols provide highly reliable information, achieving an authenticity that simulation models strive to replicate. While fidelity is not inherently tied to whether data come from experiments or simulations, treating experimental data as high fidelity and simulation data as low fidelity remains practically advantageous. In the context of MIECT data analysis, this framework provides a consistent approach for integrating simulation and experimental sources efficiently and effectively.

3.4.2 Radial Basis Function-based Multi-Fidelity Scaling

The Radial Basis Function-based Multi-Fidelity Scaling (RBF-MFS) approach is a powerful method for combining low-fidelity (LF) and high-fidelity (HF) data to construct an accurate surrogate model. It leverages the computational efficiency of LF simulations to explore a broad parameter space and then refines the approximation using sparse HF data. In this approach, a set of LF simulations is performed over a parameter grid. These simulations are computationally efficient, allowing extensive sampling of the parameter space, and each yields an approximate system response, capturing a wide range of defect scenarios and operating conditions. A baseline approximation $\hat{f}_{LF}$ is first constructed from the LF data using Radial Basis Functions (RBFs).
The LF approximation can be expressed as

$$\hat{f}_{LF}(x) = \sum_{i=1}^{N_{LF}} \lambda_i \, \phi\left(\left| x - x_i^{LF} \right|\right) \quad (3.5)$$

where the $\lambda_i$ are coefficients determined from the LF data, $\phi(\cdot)$ is the chosen RBF kernel, and the $x_i^{LF}$ are the LF training points. Since the LF model is approximate, a discrepancy function $d(x)$ is introduced to correct the differences between the LF predictions and the sparse HF data. This function is also modeled using RBFs. The final high-fidelity prediction $\hat{f}_{HF}(x)$ is obtained by combining the LF model and the discrepancy function:

$$\hat{f}_{HF}(x) = \hat{f}_{LF}(x) + d(x) \quad (3.6)$$

RBF-MFS is ideal for situations where HF data are expensive to obtain, a large parameter space must be explored, and the system response is complex or nonlinear. In summary, RBF-MFS is a powerful tool for multi-fidelity modeling: it combines the efficiency of LF simulations with the accuracy of HF data, using RBFs to create a refined surrogate model. This approach is widely applicable in engineering, physics, and other fields where computational cost and accuracy are both critical.

3.4.3 Gaussian Process Regression with Multi-Fidelity Scaling and Feature Discretization

The proposed GPR-MFS-FD method Huang et al. (2025) leverages Gaussian Process Regression (GPR) with multi-fidelity data integration. Given the superior performance of the RBF-MFS model in the forward problem, we expect GPR with an RBF kernel to be similarly beneficial in the inverse problem. The RBF kernel, a key component of GPR, assesses the similarity between inputs from simulations and field tests while smoothly handling the discretized feature input. The core of the proposed method is the Gaussian Process (GP) model, characterized by a mean function $m(x)$ and a covariance function $k(x, x')$. Integral to the multi-fidelity model is a scaling function $\alpha(x)$, designed to project LF data onto the HF model and thereby enhance the accuracy of high-fidelity predictions:

$$k_{MF}\left((x, f_i), (x', f_j)\right) = \alpha(x)\, k(x, x')\, \alpha(x') \quad (3.7)$$

where $f_i, f_j \in \{f^L, f^H\}$ denote the fidelity levels. Velocity features are modified by computing the mean of each velocity range, transforming continuous velocity inputs into discretized features:

$$\bar{v} = \frac{v_{min} + v_{max}}{2} \quad (3.8)$$

where $v_{min}$ and $v_{max}$ are the minimum and maximum velocities in the range, respectively. This modification captures the influence of velocity variations more effectively. With the model fine-tuned, predictions for an unknown defect size are derived by calculating the posterior predictive distribution for a new input $x_*$:

$$y_* \mid x_*, D \sim \mathcal{N}\left( k_* (K_{MF} + \sigma_n^2 I)^{-1} y_H,\; k(x_*, x_*) - k_* (K_{MF} + \sigma_n^2 I)^{-1} k_*^{T} \right) \quad (3.9)$$

The predicted output $y_*$, given $x_*$ and the data $D$, follows a normal distribution $\mathcal{N}$. The mean is computed by multiplying the covariance vector $k_*$ with the inverse of the modified kernel matrix $K_{MF} + \sigma_n^2 I$, where $K_{MF}$ contains the covariances among all pairs of training inputs adjusted for their fidelity levels; this product is then multiplied by the high-fidelity output data $y_H$. The variance, $k(x_*, x_*) - k_* (K_{MF} + \sigma_n^2 I)^{-1} k_*^{T}$, estimates the prediction uncertainty at $x_*$: the model's prior uncertainty $k(x_*, x_*)$ is reduced by a term that encodes the information in the HF training data. Here $k_*(\cdot)$ denotes the covariance vector between the test inputs and the training inputs.
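To make the discrepancy-correction idea of Eqs. (3.5)-(3.6) concrete, the sketch below assembles an RBF-MFS-style surrogate with scipy's RBFInterpolator; the kernel choice, epsilon value, and variable names are illustrative assumptions rather than the thesis's exact configuration.

```python
# A minimal sketch of the RBF-MFS correction of Eqs. (3.5)-(3.6) using scipy;
# kernel and epsilon are illustrative, not the thesis's tuned values.
import numpy as np
from scipy.interpolate import RBFInterpolator

def fit_rbf_mfs(X_lf, y_lf, X_hf, y_hf, kernel="multiquadric", epsilon=1.0):
    """Return a callable multi-fidelity surrogate f_hat_HF(X)."""
    # Step 1: baseline LF approximation f_hat_LF built from abundant simulations
    f_lf = RBFInterpolator(X_lf, y_lf, kernel=kernel, epsilon=epsilon)
    # Step 2: discrepancy d(x) fitted to HF residuals at the sparse HF points
    d = RBFInterpolator(X_hf, y_hf - f_lf(X_hf), kernel=kernel, epsilon=epsilon)
    # Step 3: corrected prediction f_hat_HF(x) = f_hat_LF(x) + d(x)
    return lambda X: f_lf(X) + d(X)

# Usage: columns of X are (width, depth, velocity); y is dVpp
# surrogate = fit_rbf_mfs(X_sim, y_sim, X_exp, y_exp)
# y_pred = surrogate(X_test)
```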
We denote by $n$ the number of data points in the low-fidelity (LF) dataset and by $m$ the size of the high-fidelity (HF) dataset used in multi-fidelity (MF) approaches.

Table 3.1 Computational complexity comparison

Single-fidelity models (forward problem):
- PRS (Polynomial Response Surface) Myers et al. (2016): training O(n³), prediction O(n)
- RBF-MQ (Radial Basis Function with Multiquadric kernel) Hardy (1971): training O(n³), prediction O(n)
- KRG (Kriging) Stein (1999): training O(n³), prediction O(n)

Multi-fidelity models (forward problem):
- CoRBF (Composite Radial Basis Function) Park and Sandberg (1991): training O(m³ + n³), prediction O(n)
- LR-MFS (Linear Regression-based Multi-Fidelity Surrogate) Kennedy and O'Hagan (2000): training O(n³), prediction O(n)
- CoKRG (Co-Kriging) Forrester et al. (2007): training O(m³ + n³), prediction O(n)
- RBF-MFS (Radial Basis Function-based Multi-Fidelity Surrogate) Kumar et al. (2018): training O(m³ + n³), prediction O(n²)

Multi-fidelity models (inverse problem):
- MF-DGP-EM (Multi-Fidelity Deep Gaussian Process with Embedded Mapping): training > O(n³), prediction O(n²) to O(n³)
- GPR-MFS-FD (Gaussian Process Regression with Multi-Fidelity Surrogate and Feature Discretization): training O(m³ + n³), prediction O(n²)

For single-fidelity models such as PRS, RBF-MQ, and KRG, training involves solving an $n \times n$ system, where $n$ denotes the number of training points; this leads to O(n³) training complexity. Once trained, predictions can be made more efficiently, on the order of O(n) per prediction, owing to simpler operations such as evaluating kernels or polynomials. In multi-fidelity approaches, the total cost is split between the LF and HF datasets, since each fidelity level requires building its own covariance or kernel matrix. Models such as CoKRG, CoRBF, and RBF-MFS handle two datasets simultaneously, leading to O(m³ + n³) complexity. Typically, the HF dataset size $m$ is much smaller (due to cost or physical constraints), while the LF dataset size $n$ is much larger. Each method's complexity dictates practical constraints on dataset size and modeling fidelity. Single-fidelity methods are straightforward but can underutilize HF information or broad LF coverage. Multi-fidelity methods, though more computationally expensive at O(m³ + n³), offer superior predictive performance, which is critical for accurate forward and inverse analyses in Motion-Induced Eddy Current Testing (MIECT).

3.6 Results and Discussion

3.6.1 Forward Problem

The proposed framework demonstrates state-of-the-art performance in both the forward and inverse problem formulations. In forward modeling, RBF-MFS achieves consistently lower Mean Squared Error (MSE) values than single-fidelity models such as PRS, RBF-MQ, and KRG. It also outperforms more sophisticated multi-fidelity models such as Co-Kriging and CoRBF. The error distributions and violin plots illustrate that RBF-MFS maintains remarkably stable and accurate predictions even as test set sizes increase, ensuring minimal variability and robust performance in challenging conditions. Co-Kriging outperforms Kriging, as shown in Figure 24(a): at a 50% test size, Kriging's MSE reaches 0.0007 while Co-Kriging maintains 0.00033. Co-Kriging achieves both lower median error and a tighter error distribution across test sizes, demonstrating effective use of LF data when HF data are limited.
In Figure 24(b), CoRBF and RBF-MFS maintain MSE below 2.5 × 10⁻⁵ across all test sizes. RBF-MFS delivers the lowest median errors with minimal variance. PRS performs competitively at select test sizes despite a higher average MSE. Figure 24(c) confirms the superior performance of RBF-MFS.

Figure 24 MSE as a function of test size for (a) Co-Kriging versus Kriging, (b) comparisons among Composite RBF, the Radial Basis Function-based Multi-Fidelity Surrogate, and Polynomial Response Surfaces, and (c) the Radial Basis Function-based Multi-Fidelity Surrogate versus RBF with Multiquadric kernel

3.6.2 Inverse Problem

For the inverse problem, GPR-MFS-FD clearly surpasses the MF-DGP-EM model, delivering more accurate defect size estimations across a range of velocities.

Figure 25 Box plot of MSE for GPR-MFS-FD and MF-DGP-EM

3.6.3 Observations and Limitations

This performance gap can be attributed to the integration of the feature discretization technique with multi-fidelity scaling, which transforms continuous velocity ranges into a unified representation and enhances the capture of velocity-defect relationships. The multi-fidelity scaling function effectively projects low-fidelity data onto the high-fidelity model, improving prediction accuracy. Moreover, the posterior predictive distribution for unknown defect sizes yields predictions with reduced uncertainty by incorporating high-fidelity training data. In contrast, MF-DGP-EM relies on a hierarchical arrangement of Gaussian processes and employs an embedded mapping function represented by a neural network to align data from different fidelities. While this approach aims to combine deep learning and Gaussian processes, it may struggle to effectively capture the complex interactions between velocity variations and defect characteristics, potentially introducing additional complexity and sources of error. Furthermore, MF-DGP-EM's optimization process involves maximizing the marginal likelihood of the observed data through gradient-based techniques, which may be more susceptible to local minima and require careful hyperparameter tuning. In comparison, we employ a more straightforward optimization approach based on the conjugate gradient method, which may contribute to the superior performance. In summary, the performance gap can be attributed to the feature discretization technique, which enables a more effective representation of velocity variations and their influence on defect detection. The discretization of the velocity feature, combined with a streamlined optimization process, contributes to superior accuracy and robustness across different velocity ranges. This highlights the importance of developing targeted methodological optimizations that address the specific challenges posed by MIEC data, as demonstrated by the success of the proposed approach.

3.7 Conclusion

In this chapter, we demonstrated how merging low-fidelity (LF) simulation data with high-fidelity (HF) experimental data can effectively address both forward and inverse problems in Motion-Induced Eddy Current Testing (MIECT).
By leveraging physics-based finite element simulations for broad parametric coverage and complementing them with targeted laboratory measurements, we increased predictive accuracy while controlling costs. Specifically, the synergy of discrepancy-corrected models and multi-fidelity scaling produced clear improvements in capturing real-world defect signals, especially under varying velocities. Feature discretization further mitigated operational uncertainties (e.g., inconsistent inspection speeds), ensuring robust performance for both forward (signal prediction) and inverse (defect estimation) tasks. This multi-fidelity approach thus forms the central theme of the chapter: effectively combining abundant, less accurate data with sparse, more reliable measurements to enhance modeling fidelity and reliability. The proposed methodology holds broad applicability beyond its immediate use in surface defect detection for high-speed rail inspections. The principles of multi-fidelity data integration, feature discretization, and surrogate modeling can be adapted to a wide range of nondestructive evaluation (NDE) and structural health monitoring (SHM) scenarios, such as corrosion detection in pipelines, crack characterization in aerospace components, and fatigue analysis in civil structures. Any domain where engineering simulations and physical measurements coexist can benefit from the cost-effective accuracy gains offered by multi-fidelity frameworks.

CHAPTER 4
COMPENSATION IN PULSED EDDY CURRENT TESTING VIA SURROGATE MODELING

4.1 Introduction

Pulsed Eddy Current (PEC) testing is a key non-destructive evaluation (NDE) technique that employs an excitation coil to generate a time-varying magnetic field, which induces eddy currents within an electrically conductive material Sophian et al. (2017). By analyzing the response signal, PEC enables accurate characterization of material properties such as wall thickness, electrical conductivity, and magnetic permeability Sun et al. (2021); Liu et al. (2022); Majidnia et al. (2014), providing insights for industries ranging from oil and gas to power generation Bieri et al. (2005); Fu et al. (2021). Its high penetration depth and non-contact nature make it uniquely suited for detecting subsurface corrosion, wall thinning, and cracks without requiring direct contact Wang et al. (2021); Chen and Liu (2021), particularly when dealing with coated or insulated components Rifai et al. (2016); Zhang et al. (2017). However, several challenges limit PEC's accuracy. Liftoff variation can reduce sensitivity and increase measurement uncertainty Wang et al. (2021); Rao et al. (2017). Likewise, coatings and insulation layers alter the electromagnetic environment, complicating the interpretation of PEC signals and potentially masking underlying material degradation Chen and Liu (2021). Beyond these geometric and material complexities, electromagnetic interference from external magnetic fields significantly challenges PEC by distorting signals and reducing defect detection reliability Cortês et al. (2023). In particular, high-voltage feeder lines inside pipes generate strong, time-varying magnetic fields that produce spatially varying permeability, violating the assumption of uniform magnetic properties Chen and Lei (2015) and thus compromising data interpretation methods Li et al. (2021). To overcome these issues, researchers have proposed methods such as magnetic field calibration systems Janošek et al.
(2019), adaptive interference suppression algorithms Ponikvar et al. (2023), and enhanced electromagnetic field modeling Sereda and Korol (2022); Zhang et al. (2017) to reduce the interference. Some approaches reduced error standard deviations from thousands of nT to less than 20 nT Zhang et al. (2017). This progress, coupled with improvements in probe design Wang et al. (2021); Rao et al. (2017); Shu et al. (2007); Rifai et al. (2016) and advanced simulation tools, has significantly enhanced PEC's robustness. The primary objective of this chapter is to present a new compensation strategy that accounts for the spatially varying magnetic permeability induced by internal power lines in high-voltage feeder pipes. By explicitly modeling the position-dependent magnetic fields, this approach seeks to correct signal distortions and restore accurate wall-thickness estimates. To accomplish this, we integrate two complementary numerical models with surrogate modeling. These tools enable rapid, reliable estimates of how variations in magnetic permeability affect the transient PEC signal. In turn, they form the basis of an efficient, physics-based correction scheme that significantly enhances the reliability of PEC measurements in challenging high-voltage environments.

4.2 Challenges

High-voltage feeder lines enclosed within pipes pose significant challenges for PEC inspections due to the strong electromagnetic fields generated by the internal cables and the nonlinear magnetic behavior of ferromagnetic materials. These effects can produce spatially varying permeability in the pipe wall, causing inconsistent PEC signals and leading to inaccurate wall-thickness estimates or false-positive flaw detections. The three figures presented here illustrate the key hurdles in inspecting such pipes and underscore why careful compensation and advanced modeling are critical. In Figure 26, a schematic cross-section highlights the basic configuration of a feeder pipe with an offset internal cable and an external PEC sensor. The cable carries high currents (on the order of 10 kA at 60 Hz), generating a strong magnetic field that interacts with the ferromagnetic pipe. When the cable is not centered, the resulting magnetic field distribution is asymmetrical around the circumference. This asymmetry, compounded by the pipe's ferromagnetic properties, leads to locally different levels of magnetic flux density and, consequently, different relative permeability at different angular positions of the pipe. Because conventional PEC algorithms generally assume constant permeability, the varying field intensities at different angular positions introduce systematic errors into the thickness measurements.

Figure 26 Cross-sectional schematic of a ferromagnetic pipe with an internal offset cable and an external PEC sensor

The representative B-H curve and the corresponding relative magnetic permeability $\mu_r$ for ferromagnetic carbon steel are shown in Figure 27. The B-H curve illustrates the nonlinear relationship between the applied magnetic field intensity $H$ and the resulting flux density $B$. At larger field strengths, the material approaches magnetic saturation, causing $\mu_r$ to drop steeply. In the context of high-voltage feeder lines, certain pipe regions near the cable may experience magnetic fields well into the saturation zone, dramatically reducing $\mu_r$. This spatial variation in $\mu_r$ complicates signal interpretation because the PEC decay rate no longer correlates with wall thickness in a straightforward manner.
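To illustrate this nonlinearity numerically, the short sketch below evaluates the relation later formalized in Eq. (4.3), mu_r = B/(mu_0 H), on a digitized B-H curve. The sample points are invented for demonstration only and do not correspond to the curve in Figure 27.

```python
# Illustrative sketch: relative permeability mu_r(H) = B(H) / (mu_0 * H) from a
# digitized B-H curve. The sample points below are made up for demonstration.
import numpy as np

MU_0 = 4e-7 * np.pi  # vacuum permeability (H/m)

# Hypothetical digitized B-H samples for a carbon steel (H in A/m, B in T)
H = np.array([50.0, 100.0, 500.0, 1e3, 5e3, 1e4, 5e4, 1e5])
B = np.array([0.10, 0.25, 1.00, 1.30, 1.60, 1.70, 1.90, 2.00])

mu_r = B / (MU_0 * H)  # drops steeply as the material approaches saturation
for h, m in zip(H, mu_r):
    print(f"H = {h:>8.0f} A/m  ->  mu_r ~ {m:8.1f}")
```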
The impact is evident in Figure 28, which plots PEC voltage decay curves at two opposite circumferential locations (0° and 180°) for the same nominal wall thickness (100%). Despite no actual thickness difference, the curve measured at 180° decays more rapidly than the one at 0° (black line). This discrepancy arises mainly from the stronger magnetic field and partial saturation at the angular position closer to the cable. In practice, an inspector might mistakenly interpret the faster decay as thinner material or increased corrosion. Without compensating for the local magnetic permeability changes, such signals can lead to false-positive flaw indications or significantly overestimated metal loss. In summary, we show that existing PEC data analysis is insufficient for high-voltage feeder applications. The offset internal cable and large current intensities alter the magnetic conditions inside the pipe wall, resulting in spatially varying permeability and distorted PEC signals. Overcoming these effects requires robust compensation strategies, incorporating detailed multi-physics modeling, spatially resolved permeability estimates, and calibration protocols, to ensure accurate, reliable wall-thickness measurements.

Figure 27 Representative B-H curve and corresponding relative permeability illustrating the nonlinear magnetic response of carbon steel

Figure 28 Comparison of simulated voltage decay curves at two opposite angular locations (0° and 180°) showing signal discrepancies due to local magnetic saturation effects

4.3 Analytical Model

We start with a simplified analytical model correlating PEC decay signatures with material electromagnetic characteristics. The transient response of the PEC signal, described by its dominant time constant $\tau$, is fundamentally related to the wall thickness $d$, relative magnetic permeability $\mu_r$, and electrical conductivity $\sigma$. For a uniformly magnetized, defect-free region, the dominant time constant obeys the proportionality $\tau \propto \mu_r \sigma d^2$. However, in the presence of internal conductors and non-uniform fields, $\mu_r$ becomes spatially dependent, as shown in Figure 29.

Figure 29 Cross-section view of the pipe with internal cable

If we assume a constant permeability $\hat{\mu}_r$, the discrepancy between the estimated time constant $\hat{\tau}$ and the actual $\tau(\theta)$ produces an erroneous estimation. Incorporating liftoff into our mathematical model, similar to the models proposed in Ulapane et al. (2018); Nafiah et al. (2020), the induced voltage $V(t, l)$ in the receiver coil can be expressed as

$$V(t, l) = b_1 e^{-\frac{\pi^2 t}{\mu_r(\theta) \sigma d^2}} e^{-kl} + \sum_{i=2}^{\infty} b_i e^{-c_i t} \quad (4.1)$$

where $b_1$ and the $b_i$ are coefficients dependent on system parameters related to the sensor configuration, the $c_i$ are higher-order time constants, and $k$ is a constant that accounts for the exponential attenuation due to liftoff. The state-of-the-art method estimates thickness from the time derivative of the log-voltage under the assumed constant $\hat{\mu}_r$, therefore causing the prediction error

$$\text{Prediction error}(\theta) = \left| \frac{\pi}{\sqrt{\sigma \, \frac{d}{dt} \ln[V(t, l)]}} \left( \sqrt{\frac{1}{\mu_r(\theta)}} - \sqrt{\frac{1}{\hat{\mu}_r}} \right) \right| \quad (4.2)$$

While this discrepancy is computed under the assumption of constant wall thickness, it becomes inadequate when wall thinning occurs, because the wall thickness then becomes a function of the angular position $\theta$, i.e., $d = d(\theta)$, leading to an interdependency between $\mu_r(\theta)$ and $d(\theta)$.
The relative permeability then depends on both $\theta$ and $d(\theta)$ and is expressed as

$$\mu_r(\theta, d(\theta)) = \frac{f_{BH}\left(H(\theta, d(\theta))\right)}{\mu_0 \, H(\theta, d(\theta))} \quad (4.3)$$

where $f_{BH}$ represents the B-H curve function. Determining $\mu_r(\theta, d(\theta))$ analytically is complex due to its dependence on the pipe geometry and wall thickness variations.

4.4 Compensation Method via Surrogate Model

The proposed compensation algorithm combines multi-physics simulations with data-driven modeling, as shown in Figure 30. The Cable-Pipe Model simulates the magnetic flux distribution, generating a permeability map based on the cable layout. The PEC-Pipe Model then uses these permeability distributions to simulate the transient PEC response, revealing how spatial variations affect signal decay rates and thickness measurements.

Figure 30 Numerical modeling of the PEC probe setup on wax-coated pipes: (a) Cable-Pipe Model, (b) PEC-Pipe Model

The process begins by modeling the effects of the internal power lines on the pipe's magnetic permeability, then uses these permeability distributions within a pulsed eddy current (PEC) simulation to predict the voltage decay signal. Finally, surrogate models and a calibration procedure are applied to correct real-world PEC measurements in the presence of spatially varying magnetic properties. To reduce the computational complexity associated with varying pipe parameters, which is especially crucial for real-time applications in field inspections, we have developed surrogate models that approximate the FEM simulation results and allow rapid predictions of decay time constants without extensive computation. The surrogate models are constructed using Gaussian Process Regression (GPR), a non-parametric Bayesian technique well suited to modeling complex, nonlinear relationships with quantified uncertainties. In GPR, the underlying assumption is that the outputs can be represented as realizations of a Gaussian process governed by a mean function and a covariance function (kernel). Specifically, GPR models the relationship between the inputs $\mathbf{X}$ and outputs $\mathbf{y}$ as

$$\mathbf{y} = f(\mathbf{X}) + \epsilon \quad (4.4)$$

where $f(\mathbf{X})$ is an unknown latent function sampled from a Gaussian process, and $\epsilon \sim \mathcal{N}(0, \sigma_n^2 I)$ represents independent and identically distributed Gaussian noise with variance $\sigma_n^2$. The latent function $f(\mathbf{X})$ is characterized by a Gaussian process:

$$f(\mathbf{X}) \sim \mathcal{GP}\left(m(\mathbf{X}), k(\mathbf{X}, \mathbf{X}')\right) \quad (4.5)$$

where $m(\mathbf{X})$ is the mean function, often set to zero without loss of generality, and $k(\mathbf{X}, \mathbf{X}')$ is the covariance function, or kernel, that encodes the relationship between data points. For the surrogate models, the Radial Basis Function (RBF) kernel, also known as the Gaussian kernel, is chosen for its smoothness and infinite differentiability, which are suitable for modeling the underlying physics of the problem. The RBF kernel is defined as

$$k(x_i, x_j) = \sigma_f^2 \exp\left(-\frac{1}{2}(x_i - x_j)^T \Lambda^{-1} (x_i - x_j)\right) \quad (4.6)$$

where $\sigma_f^2$ is the signal variance, controlling the vertical scale of the function variations, and $\Lambda = \mathrm{diag}(l_1^2, l_2^2, \ldots, l_p^2)$ is a diagonal matrix of squared length-scale parameters. The input vectors $x_i, x_j$ correspond to different observations. The RBF kernel measures the similarity between input points, with larger values indicating more closely related inputs that lead to similar outputs. The surrogate models are trained using datasets generated from FEM simulations. The training process involves optimizing the hyperparameters $\sigma_f^2$, $\Lambda$, and $\sigma_n^2$ of the GPR model.
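As a rough illustration of this training pipeline (not the thesis code), the sketch below fits such a GPR surrogate with an ARD RBF kernel in scikit-learn, which optimizes the hyperparameters internally by maximizing the log-marginal likelihood discussed next, and computes the least-squares scaling factor for the tau-alignment step formalized in Eq. (4.8) below. Function and variable names are illustrative assumptions.

```python
# Hedged sketch: a GPR surrogate for decay time constants with an ARD RBF
# kernel matching Eq. (4.6); scikit-learn maximizes the log-marginal
# likelihood of Eq. (4.7) internally when fit() is called.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

def fit_tau_surrogate(X_fem: np.ndarray, tau_fem: np.ndarray):
    """X_fem: FEM inputs (e.g. angle, wall thickness); tau_fem: decay constants."""
    kernel = (ConstantKernel(1.0)                          # sigma_f^2
              * RBF(length_scale=np.ones(X_fem.shape[1]))  # per-dimension Lambda
              + WhiteKernel(1e-3))                         # sigma_n^2
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    return gpr.fit(X_fem, tau_fem)

def tau_alignment_scale(tau_field: np.ndarray, tau_sim: np.ndarray) -> float:
    """Closed-form least-squares alpha minimizing the RMSE of Eq. (4.8)."""
    return float(tau_field @ tau_sim / (tau_sim @ tau_sim))
```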
Hyperparameter optimization is typically performed by maximizing the log-marginal likelihood function, which balances data fit and model complexity:

$$\log p(\mathbf{y} \mid \mathbf{X}, \theta) = -\frac{1}{2}\mathbf{y}^{\top}(K + \sigma_n^2 I)^{-1}\mathbf{y} - \frac{1}{2}\log\left|K + \sigma_n^2 I\right| - \frac{n}{2}\log 2\pi \quad (4.7)$$

where $K$ is the kernel matrix computed with the RBF kernel over all pairs of training inputs, $n$ is the number of training data points, and $\theta$ collects all the hyperparameters. In addition, a $\tau$-alignment step uses calibration points where the field-measured decay times $\tau_{\text{field},i}$ are known: the predicted decay time constants $\tau_{\text{sim},i}$ are scaled by a factor $\alpha$ chosen to minimize

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\tau_{\text{field},i} - \alpha\, \tau_{\text{sim},i}\right)^2} \quad (4.8)$$

By optimizing the hyperparameters and incorporating this $\tau$ alignment, the GPR models become finely tuned to the underlying patterns in the FEM simulation and field test data, enabling accurate and efficient predictions during real-time inspections.

4.5 Field Tests and Results

4.5.1 Inspection Setup

In this section, we present the results of field tests conducted to validate the effectiveness of the proposed auto-compensation algorithm in real-world PEC inspections of high-voltage feeder pipes. We analyze the performance of the algorithm across multiple pipe segments, compare compensated and uncompensated measurements, and discuss the practical implications of our findings. Figure 31 illustrates the deployment of the PEC inspection system in an underground tunnel environment, characterized by space constraints and complex arrangements of high-voltage feeder pipes. To thoroughly assess the pipe surfaces, we employed a dual scanning strategy. First, we conducted scans at 45° intervals around the pipe's circumference to accurately capture wall thickness and account for magnetic permeability variations due to the internal cables. Next, we employed an axial scanning method, during which sensors traversed the entire length of the pipe, systematically covering the axial direction to map wall thickness variations longitudinally.

Figure 31 (a) Field deployment of the PEC inspection system on high-voltage feeder pipes in an underground tunnel. (b) Schematic representation of circumferential scan positions and corresponding axial scan lines

Calibration of the PEC instrument was critical for ensuring accuracy. We initiated the process by attaching the encoder and preparing the system for scanning. An initial point scan targeted areas with near 100% nominal wall thickness and minimal insulation, serving as a baseline. We then identified an optimal reference point with a higher signal amplitude and slower decay rate, corresponding to the nominal (100%) wall thickness. Calibrating at both the beginning and end of each inspection cycle ensured probe stability and compensated for any system drift, thereby enhancing measurement reliability. Table 4.1 summarizes the specifications of the tested pipe segments, including six horizontal and two vertical carbon steel pipes from different feeder lines. Notably, segments PS17-19 through PS30-14 from Feeder Line 34051 had ultrasonic testing (UT) validation data available. The removal of protective coatings allowed UT measurements, which confirmed consistent wall thickness and indicated that any anomalies detected in the uncompensated PEC data were false positives caused by electromagnetic interference rather than actual material loss.
All pipes shared similar material properties and were inspected using the PEC-025-G2 probe under consistent configurations, although scan modes varied between dynamic and grid mapping. The uniform dimensions (outer diameters and wall thicknesses) facilitated comparative analysis. Data quality metrics showed low warning percentages (1.52% to 3.93%) for the F34051 segments. The frequent false-positive wall loss indications in the uncompensated PEC data underscored the necessity of our compensation algorithm.

Table 4.1 Pipe segment and inspection specifications

Pipe ID     Type/Feeder         OD / WT / Coating (mm)   Length / Circ. (mm)   No. lines   Validation
PS17-19     Horizontal, F34051  219.1 / 6.35 / 6.35      7315.2 / 728.1        8           UT verified
PS23-25     Horizontal, F34051  219.1 / 6.35 / 6.35      7315.2 / 728.1        8           UT verified
PS29-30     Horizontal, F34051  219.1 / 6.35 / 6.35      7315.2 / 728.1        8           UT verified
PS30-14     Horizontal, F34051  219.1 / 6.35 / 6.35      7315.2 / 728.1        8           UT verified
PS93-92     Horizontal, F63     273.05 / 6.35 / 6.35     7315.2 / 897.7        16          N/A
PS100-99    Horizontal, F63     273.05 / 6.35 / 6.35     7315.2 / 897.7        16          N/A
Vertical-4  Vertical            273.05 / 6.35 / none     6096 / 897.7          4           N/A
Vertical-8  Vertical            273.05 / 6.35 / none     6096 / 897.7          8           N/A

While UT validation was possible for accessible areas, some uncertainty remains for non-validated sections due to inaccessibility. Nevertheless, the compensation algorithm demonstrated improved accuracy across all tested segments, reducing false positives and enhancing the reliability of PEC inspections.

4.5.2 Comparison of Compensated vs. Uncompensated PEC Measurements

Figure 32 presents the probability density distributions of wall thickness measurements for four segments, comparing compensated and uncompensated PEC data. The compensated measurements exhibit higher means and significantly lower standard deviations, indicating enhanced accuracy and precision. For the PS17-19 segment, the mean increased from 96.05% to 98.95%, while the standard deviation was reduced from 7.12% to 1.45%. Similarly, PS23-25 measurements saw an increase in mean from 98.68% to 99.39% with a corresponding reduction in variability. In the PS29-30 segment, the mean rose from 91.43% to 100.33%. Lastly, the PS30-14 segment showed an increase in mean from 89.71% to 100.53%. These results demonstrate that the compensation algorithm effectively corrects underestimations caused by magnetic interference, aligning the measurements closely with the nominal wall thickness of 100%. The reduced standard deviations reflect improved measurement consistency. While some uncertainty persists in non-validated areas, the overall enhancement in data quality underscores the practical value of the compensation algorithm in field applications.

Figure 32 Probability density distributions of wall thickness measurements comparing compensated (blue) and uncompensated (red) PEC inspections for four pipe segments: (a) PS17-19, (b) PS23-25, (c) PS29-30, and (d) PS30-14

Figures 33 and 34 provide polar visualizations of wall thickness distributions for pipe segments PS17-19 and PS30-14, respectively. Each figure divides the pipe circumference into eight segments at 45° intervals, with concentric rings representing wall thickness percentages from 20% to 120% of the nominal value.
Before compensation, both figures show significant underestimations in specific angular sectors (e.g., 270°-315° for PS17-19 and 270°-0° for PS30-14), with measurements indicating up to 20% false material loss. These discrepancies are attributed to magnetic field interference from the internal power cables. After compensation, the measurements uniformly range from 95% to 105% of the nominal thickness in all segments. This uniformity confirms the algorithm's effectiveness in correcting spatial measurement errors and aligns with the UT validations of consistent wall thickness. The correction of the double-dip pattern in PS30-14 suggests that the algorithm successfully addresses complex interference patterns, potentially caused by the configuration of the internal three-phase power cables. These visualizations reinforce the practical applicability of the compensation algorithm in enhancing inspection accuracy.

Figure 33 PS17-19: polar visualization of the wall thickness distribution before and after compensation

Figure 34 PS30-14: comparative polar plots of the wall thickness distribution before and after compensation

Results for Vertical-8, with eight 45° intervals on a polar grid, are shown in Figure 35. Reference circles at 0%, 70%, and 100% nominal thickness are shown, with compressed scaling below 70% to enhance visualization. Before compensation, significant measurement variability is evident, particularly at 270° and 315°. After compensation, median values align near 100% nominal thickness with reduced interquartile ranges, indicating enhanced accuracy and consistency.

Figure 35 Vertical-8: radial box plots displaying the wall thickness distribution before and after compensation

Figure 36 displays radial box plots for segment PS100-99, comparing pre- and post-compensation measurements at 16 circumferential positions (22.5° intervals) on a polar grid with reference circles at 0%, 70%, and 100% nominal thickness. For this horizontal pipe segment inspected at 16 circumferential positions, the pre-compensation data show heterogeneous interquartile ranges and medians deviating from the nominal thickness. Post-compensation, the measurements display uniform medians near 100%, reduced variability, and consistent whisker lengths across all angles. These results affirm that the compensation algorithm effectively corrects systematic biases and reduces measurement variability regardless of pipe orientation or surface conditions. The algorithm's ability to improve accuracy in both coated horizontal pipes and uncoated vertical pipes underscores its practical applicability in diverse field environments.

Figure 36 PS100-99: radial box plots showing the wall thickness distribution before and after compensation

While the proposed method significantly improves measurement accuracy, limitations exist. The algorithm's performance in non-validated sections of the pipes remains uncertain due to the lack of UT or visual inspection data. Additionally, the algorithm assumes uniform material properties, which may not account for anomalies such as localized corrosion or material defects.

4.6 Conclusions

In this chapter, we presented a novel approach to enhancing the accuracy of PEC measurements for the inspection of high-voltage feeder cable pipes. By introducing new physics-based models that, for the first time, effectively compensate for signal distortions caused by internal magnetic fields, we have significantly improved PEC measurement accuracy.
These models account for the spatially varying magnetic permeability induced by internal current-carrying conductors, addressing a critical challenge that previously led to erroneous assessments of material integrity. The proposed integrated compensation methodology combines empirical data, FEM simulations, and GPR surrogate models. This integration enables accurate pipeline integrity assessments to be conducted within minutes, making the approach practical for in situ inspections and real-time applications. The use of surrogate models reduces computational demands without compromising accuracy, facilitating rapid and precise wall thickness estimation. The compensation methods and accompanying software tool have been validated through field demonstrations on in-service pipelines. Deploying the system in challenging underground tunnel environments confirmed its effectiveness and robustness.

CHAPTER 5
PHYSICS GUIDED EXPLAINABLE NETWORKS FOR AE CLASSIFICATION

5.1 Introduction

As a key diagnostic method in structural health monitoring (SHM) and nondestructive evaluation (NDE), Acoustic Emission (AE) enables detailed assessment of material behavior and structural conditions Strantza et al. (2015). AE signals often exhibit long-duration, non-stationary waveforms with complex reverberation patterns Daugela et al. (2021b), making them challenging to interpret with traditional machine learning methods. Deep learning models have revolutionized AE analysis by revealing hidden patterns and extracting meaningful information Wu et al. (2020), with promising results in applications such as crack propagation monitoring in concrete structures Haile et al. (2020b) and bearing fault diagnosis in wind turbines Zhong and Chen (2023). Despite these advances, the interpretability of these models remains a critical challenge Selvaraju et al. (2016), especially when trust and transparency are paramount for infrastructure maintenance and safety Wickstrøm et al. (2020); Ivaturi et al. (2021). To address this issue, explainable AI (XAI) techniques have emerged as effective tools to clarify the decision-making processes of deep learning Selvaraju et al. (2016); Wickstrøm et al. (2020); Ivaturi et al. (2021); Nayebi et al. (2022); Shawi et al. (2019); Alam et al. (2023); Guillemé et al. (2019). Among these, visualization-based methods such as Class Activation Mapping (CAM) and Gradient-weighted CAM (Grad-CAM) have gained traction. CAM and Grad-CAM have proven effective in visualizing the discriminative regions contributing to classifier decisions, helping to align model outputs with underlying physical phenomena. By highlighting which parts of the signal influence the model's classification, these methods make the decision-making process more transparent. They have been successfully applied in diverse domains, including arrhythmia classification from ECG signals Singh and Sharma (2022) and multiple sclerosis classification from MRI data Zhang et al. (2021), demonstrating their versatility and potential for AE analysis Wang et al. (2016); Parvatharaju et al. (2021); Singh et al. (2020). In the context of AE signals, integrating CAM and Grad-CAM with physics-informed segmentation based on the theoretical arrival times of the fundamental Lamb wave modes exploits the dispersive nature of Lamb waves in plate-like structures.
By segmenting signals into physics-based regions (the Pre-S0 region, the Transition Zone, and the Post-A0 region), the model can focus on mode-specific features that correspond to physically significant wave interactions. This approach enables mode-specific analysis aligned with the underlying wave propagation physics, thereby building trust in the classification outcomes Wu et al. (2020); Haile et al. (2020b); Singh and Sharma (2022). Such interpretability methods mitigate the "black-box" nature of deep learning, providing crucial insights and increasing user confidence in automated AE diagnostics Singh et al. (2020). As a result, we can better understand the factors driving model decisions and leverage that knowledge for more informed SHM and NDE assessments.

5.2 Physics-informed AE Segmentation

Physics-informed segmentation provides a solid foundation for extracting meaningful features from complex Acoustic Emission (AE) signals by utilizing the theoretical arrival times of the fundamental Lamb wave modes: the symmetric mode ($S_0$) and the antisymmetric mode ($A_0$). Lamb waves are essential for AE signal analysis in plate structures, displaying multimodal and dispersive properties, with propagation velocities that depend on frequency. These properties enhance structural health monitoring and damage localization. The $S_0$ mode has higher group velocities at lower frequencies, while the $A_0$ mode is highly dispersive, carrying energy across a wide frequency range. Together, they are crucial for understanding damage mechanisms and localizing AE sources Qiu et al. (2020). The segmentation method exploits this dispersive, frequency-dependent behavior, enabling mode-specific analysis aligned with wave propagation physics. The AE signal is segmented into three distinct regions based on theoretical time-of-arrival (TOA) calculations. The Pre-$S_0$ region captures environmental noise and electronic interference before the $S_0$ mode arrives, defined by $t < \tau_{S0}$ in the time domain and $t < \tau_{S0}(f)$ in the time-frequency domain. The $S_0$-$A_0$ Transition Zone includes interactions and conversions between the $S_0$ and $A_0$ modes, occurring when $\tau_{S0} \le t \le \tau_{A0}$ in the time domain and $\tau_{S0}(f) \le t \le \tau_{A0}(f)$ in the time-frequency domain. The Post-$A_0$ region captures later arrivals, such as reflections and higher-order modes, defined by $t > \tau_{A0}$ in the time domain and $t > \tau_{A0}(f)$ in the time-frequency domain.

Figure 37 (a) AE segmentation in the time domain. (b) AE segmentation in the frequency-time domain

This segmentation approach serves different purposes in the time and time-frequency domains. In the time domain, it helps isolate mode-specific contributions in $s(t)$ and improves feature representation for 1D time series classification. For time-frequency analysis, it isolates mode-specific contributions in $s(t, f)$ and enhances mode-specific features for 2D classification. Figure 37 illustrates this segmentation through an example acoustic emission signal. Figure 37(a) shows the time-domain signal with three distinct regions marked in different colors: Region 1 (red) before the $S_0$ arrival, Region 2 (green) for the $S_0$-$A_0$ transition zone, and Region 3 (blue) after the $A_0$ arrival. The corresponding spectrogram in Figure 37(b) displays the time-of-arrival curves for both modes across the 50-500 kHz frequency range, with the $S_0$ mode arrival (solid black line) and $A_0$ mode arrival (dashed black line) defining the region boundaries.
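In code, this segmentation reduces to simple time-domain masks once the arrival times are known. The sketch below is a minimal illustration, assuming tau_s0 and tau_a0 have already been computed from Lamb-wave dispersion curves and the source-sensor distance; all names are illustrative.

```python
# Minimal sketch of the physics-informed segmentation in the time domain,
# assuming theoretical arrival times tau_s0 and tau_a0 (in seconds) are known.
import numpy as np

def segment_ae_signal(signal: np.ndarray, t: np.ndarray,
                      tau_s0: float, tau_a0: float):
    """Split a 1-D AE waveform into Pre-S0, S0-A0 transition, and Post-A0 regions."""
    pre_s0     = signal[t < tau_s0]                      # noise / interference
    transition = signal[(t >= tau_s0) & (t <= tau_a0)]   # S0-A0 mode interactions
    post_a0    = signal[t > tau_a0]                      # reflections, later modes
    return pre_s0, transition, post_a0

# Usage with a 2 MHz sampling rate:
# t = np.arange(len(x)) / 2e6
# pre, trans, post = segment_ae_signal(x, t, tau_s0, tau_a0)
```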
This physics-based framework enables effective feature extraction and subsequent analysis, valuable for machine learning applications in structural health monitoring and damage detection.

5.3 Signal Preprocessing

In this chapter, AE signals were acquired using a fiber-optic coil-based sensing system with a broadband frequency response of 50–500 kHz. The signals were generated using the Hsu-Nielsen pencil lead break (PLB) method, which involved a 2H mechanical pencil with a 0.5 mm diameter lead. We conducted ten PLB tests at each of the ten marked locations on a 1/10-inch-thick aluminum plate, with each test repeated to ensure statistical reliability and account for any variations in signal generation. The aluminum plate was marked with a grid to ensure precise and repeatable testing. The signals were digitized at a sampling rate of 2 MHz to ensure accurate capture of high-frequency components. An example of a raw AE signal is shown in Figure 38.

Figure 38 Signal filtered to 50–500 kHz bandwidth

To analyze the time-frequency characteristics of the AE signals, we employed the Continuous Wavelet Transform (CWT), which provides a detailed representation of the signal's frequency content over time. The CWT is defined as:

W_x(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t) \, \psi^* \left( \frac{t - b}{a} \right) dt,   (5.1)

where W_x(a, b) is the wavelet coefficient, x(t) is the AE signal, \psi is the mother wavelet, a is the scale parameter, and b is the translation parameter. We used the Morlet wavelet as the mother wavelet due to its excellent time-frequency localization properties, defined as:

\psi(t) = \pi^{-1/4} e^{i\omega_0 t} e^{-t^2/2},   (5.2)

where \omega_0 = 6 to satisfy the admissibility condition. We computed the CWT for scales corresponding to frequencies from 50 kHz to 500 kHz, divided into 18 uniformly spaced frequency bands: 50–75 kHz, 75–100 kHz, ..., 475–500 kHz. This multi-scale analysis allowed us to capture both the low-frequency trends and the high-frequency transients characteristic of different AE sources.

Figure 39 Continuous Wavelet Transform (CWT) of an Acoustic Emission (AE) signal, showing the time-frequency energy distribution with time (𝜇s) on the x-axis, frequency (kHz) on the y-axis, and wavelet coefficient magnitude represented by color intensity

The resulting time-frequency representation provides a rich set of features that can reveal subtle differences between various types of AE events. Figure 39 shows the transformation of the AE signal into the time-frequency representation via the CWT, displaying the energy distribution over time and frequency bands. This reveals how the frequency content evolves, which is essential for non-stationary AE signals. Peaks in the CWT plot indicate high-energy periods, corresponding to specific Lamb wave modes (𝑆0 and 𝐴0) arriving at different times. The CWT separates overlapping modes by highlighting their unique time-frequency features, enabling mode-specific analysis. This enables the identification of damage mechanisms or AE events, essential for mode arrival analysis and classification. Additionally, representing the signal in both time and frequency enhances the deep learning model's ability to learn discriminative, physics-informed features. Visualizing energy concentration in the CWT aids interpretability through CAM, linking the classifier's decisions to physical events like mode arrivals and reflections.
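A minimal sketch of this preprocessing step is shown below, using the PyWavelets library with a complex Morlet wavelet. The sampling rate and band edges follow the values stated above; the signal itself and the exact wavelet parameterization are placeholders, not the specific configuration used in this work.

```python
import numpy as np
import pywt

fs = 2e6                                  # sampling rate (Hz)
signal = np.random.randn(4096)            # placeholder for a filtered AE waveform

# Center frequencies of the 18 uniformly spaced bands spanning 50-500 kHz
freqs = np.linspace(62.5e3, 487.5e3, 18)

wavelet = "cmor1.5-1.0"                   # complex Morlet (bandwidth, center freq)
fc = pywt.central_frequency(wavelet)      # normalized center frequency
scales = fc * fs / freqs                  # scales hitting the target frequencies

coeffs, out_freqs = pywt.cwt(signal, scales, wavelet, sampling_period=1 / fs)
magnitude = np.abs(coeffs)                # (18, 4096) time-frequency map
```

The magnitude array corresponds directly to the 18-band representation fed to the 2D classifier and visualized in Figure 39.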
5.4 Explainable CNN Architecture

5.4.1 CAM and Grad-CAM for 1D signal interpretation

The non-stationary nature of AE signals presents significant challenges for interpretation and classification, particularly when using conventional machine learning models that operate as black boxes without physical insight. To address this, we adopt an explainable deep learning framework that integrates Class Activation Mapping (CAM), Gradient-weighted Class Activation Mapping (Grad-CAM), and physics-informed segmentation based on Lamb wave theory. This combination highlights the specific time regions contributing most to a model's decision, thus improving transparency. The proposed architecture employs a one-dimensional Convolutional Neural Network (CNN) for temporal feature extraction. Sequential Conv1D layers process the signal, followed by dimensionality reduction through pooling or Global Average Pooling (GAP). The network culminates in a fully connected layer computing class scores, with a final SoftMax layer generating classification probabilities. At the final classification step, suppose the network outputs a score S_k for class k:

S_k = \sum_{n=1}^{256} w_{k,n} F_n + b_k,   (5.3)

where F_n denotes the globally pooled value of the n-th of the 256 feature channels, and w_{k,n} and b_k are the weights and bias of the fully connected layer. The larger the score S_k, the stronger the model's belief that the input belongs to class k. In general, CAM provides a heatmap over time, showing which parts of the signal contribute most to a specific class prediction. Instead of using the pooled feature values F_n, CAM directly weights the final feature maps A_n(i) by the class-specific weights w_{k,n} from the final layer:

CAM_k(i) = \sum_{n=1}^{256} w_{k,n} A_n(i)   (5.4)

The CAM value at any time point i indicates the signal regions that most strongly influence the model's classification decision for class k. When CAM_k(i) is visualized as an overlay on the input AE waveform, it reveals the temporal segments that the model considers most significant for its classification decision. Grad-CAM refines this approach by using the gradients of S_k with respect to the feature maps A_n(i), i.e., \partial S_k / \partial A_n(i). Rather than relying on the final-layer weights, Grad-CAM computes importance weights by averaging the gradients over time and then multiplying them by the feature maps. Through Grad-CAM, we can see not only what the final classification layer focused on, but also how the model's hidden representations responded to each segment of the signal.
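A minimal NumPy sketch of Equations (5.3)–(5.4) and the Grad-CAM weighting is given below. It assumes the final convolutional feature maps, the fully connected weights, and (for Grad-CAM) the gradients from a backward pass have already been extracted from the trained network; all shapes and names are placeholders.

```python
import numpy as np

def cam_1d(A, w, k):
    # Eq. (5.4): CAM_k(i) = sum_n w[k, n] * A[n, i]
    return w[k] @ A

def grad_cam_1d(A, grads):
    # Channel weights: time-averaged gradients dS_k/dA_n(i)
    alpha = grads.mean(axis=1)              # (256,)
    heatmap = np.maximum(alpha @ A, 0.0)    # ReLU keeps positive evidence
    return heatmap / (heatmap.max() + 1e-12)

A = np.random.rand(256, 512)       # feature maps: 256 channels x 512 time steps
w = np.random.rand(10, 256)        # FC weights for 10 classes (Eq. 5.3)
grads = np.random.rand(256, 512)   # placeholder gradients for one class
cam = cam_1d(A, w, k=0)
gcam = grad_cam_1d(A, grads)
```

Overlaying either heatmap on the input waveform, after upsampling it to the original signal length, yields the visualizations discussed in Section 5.5.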
5.4.2 DCAM for 2D signal interpretation

Wavelet transformation converts the original AE signal into a two-dimensional time-frequency representation, mapping the energy distribution across multiple frequency bands over time. This 2D representation, combined with Dimensional-wise Class Activation Mapping (DCAM), provides enhanced interpretability in the wavelet coefficient space. The key idea of DCAM is to introduce small variations to the input 2D data and compute the corresponding CAMs; by aggregating them, it identifies stable patterns across perturbations while reducing the impact of transient features. The input consists of multiple channels, each corresponding to a specific frequency band obtained from the 18-band wavelet decomposition. The method begins by generating multiple permutations of the input data to capture a diverse range of feature activations and to assess the model's consistency across variations in the data. For each permutation X_{\pi_p} and class k, activation maps CAM_k^{(p)}(i, j) are computed using:

CAM_k^{(p)}(i, j) = \sum_{n=1}^{256} w_{k,n} A_n^{(p)}(i, j)   (5.5)

where A_n^{(p)}(i, j) are the feature maps from the last convolutional layer for permutation p. These CAMs are then averaged over all permutations to obtain a mean activation map:

\overline{CAM}_k(i, j) = \frac{1}{P} \sum_{p=1}^{P} CAM_k^{(p)}(i, j)   (5.6)

representing the most consistently significant features. Simultaneously, the variance of the CAMs is calculated to assess the stability and reliability of the activations:

Var_k(i, j) = \frac{1}{P} \sum_{p=1}^{P} \left( CAM_k^{(p)}(i, j) - \overline{CAM}_k(i, j) \right)^2   (5.7)

The map DCAM_k(i, j) is then generated by combining the average CAM with an inverse-variance weighting:

DCAM_k(i, j) = \overline{CAM}_k(i, j) \times \left( 1 - \frac{Var_k(i, j)}{\max(Var_k)} \right)   (5.8)

The DCAM technique extends CAM by incorporating permutations of the input signal, providing a more robust and discriminative visualization of the features important for classification. The wavelet transform is applied to the AE signals to decompose them into time-frequency representations, allowing for detailed analysis of the signal's frequency content over time. The wavelet coefficients are segmented based on frequency bands corresponding to different Lamb wave modes and dispersive characteristics, facilitating mode-specific analysis. DCAM generates heatmaps from the 2D wavelet data, revealing discriminative frequency bands, critical time intervals, and the robustness of the identified features. These heatmaps enable a deeper understanding of how the model interprets complex AE signals and highlight the importance of specific wavelet coefficients, thereby providing insight into the physical phenomena underlying the AE signals and enhancing the interpretability of the classification process.
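The aggregation in Equations (5.6)–(5.8) reduces to a few array operations once the per-permutation CAMs are available; the sketch below is a minimal illustration with placeholder data.

```python
import numpy as np

def dcam(cams):
    # cams: (P, H, W), one CAM per input permutation (Eq. 5.5)
    mean_cam = cams.mean(axis=0)                      # Eq. (5.6)
    var_cam = ((cams - mean_cam) ** 2).mean(axis=0)   # Eq. (5.7)
    stability = 1.0 - var_cam / (var_cam.max() + 1e-12)
    return mean_cam * stability                       # Eq. (5.8)

cams = np.random.rand(20, 18, 512)   # 20 permutations, 18 bands, 512 time steps
heatmap = dcam(cams)
```

Regions with high mean activation but high variance across permutations are suppressed, which is what makes the resulting heatmap more robust than a single CAM.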
5.5 Improved Interpretability and Performance

5.5.1 Time-Domain Interpretation of AE Signals Using CAM and Grad-CAM

The analysis of Acoustic Emission (AE) signals from two distinct source locations (1 and 10) provides significant insight into the model's ability to differentiate between sources based on signal characteristics. Figures 40 and 41 present a comparison of raw AE signals using CAM and Grad-CAM for locations 1 and 10, respectively. The raw AE signals exhibit notable differences in their temporal profiles, particularly within the initial 400 𝜇s, indicating location-specific wave propagation patterns due to variations in propagation distance and attenuation. Using the theoretical arrival times of the fundamental Lamb wave modes (𝑆0 and 𝐴0) as natural boundaries, denoted by dotted lines, the signal is partitioned into three regions: Pre-𝑆0, the 𝑆0–𝐴0 transition zone, and Post-𝐴0.

Figure 40 (a) Example AE signal from location 1 (b) CAM for classifying AE signal from location 1 (c) CAM for classifying AE signal from location 10

In the CAM and Grad-CAM visualizations, significant activation is observed in different regions for the two locations. For location 1, both CAM and Grad-CAM highlight significant activation in the early signal components (before 400 𝜇s), particularly within the 𝑆0–𝐴0 Transition Zone. This indicates that the model focuses on the initial wave arrivals and mode interactions for classification. In contrast, for test location 10, the activation maps show broader activation across the signal, including the Post-𝐴0 region, suggesting that the model incorporates features from later-arriving wave components for more distant sources. In particular, Grad-CAM provides more localized feature importance than CAM, potentially enabling more precise identification of critical signal components.

Figure 41 (a) Example AE signal from location 10 (b) CAM for classifying AE signal from location 1 (c) CAM for classifying AE signal from location 10

The heatmaps reveal distinct activation patterns between locations 1 and 10, highlighting the model's ability to discern subtle, location-specific signal features. This observation supports the hypothesis that the proposed physics-informed segmentation enhances the model's sensitivity to spatial variations in AE signal characteristics by correlating the model's attention with specific wave modes and their interactions. The comparative analysis between CAM and Grad-CAM indicates that gradient-weighted approaches offer superior resolution in identifying salient signal features. This enhanced resolution could be particularly valuable in complex source localization scenarios where subtle differences in signal features are crucial.

5.5.2 Time-Frequency Analysis Using DCAM

The time-frequency analysis of AE signals, as illustrated in Figures 42 and 43, reveals significant insights into the efficacy of DCAM for enhancing signal interpretation. Figure 42 presents a conventional spectrogram of an AE signal. Notably, energy concentrates around 150–200 𝜇s in the 50–200 kHz range, corresponding to the expected arrival of the 𝑆0 mode. The corresponding DCAM result is shown in Figure 43; both are overlaid with theoretical TOA curves for the 𝑆0 and 𝐴0 modes derived from the dispersion relations.

Figure 42 Time-frequency analysis of an acoustic emission signal

Figure 43 Dimensional-wise Class Activation Mapping (DCAM) results visualized as a heat map

The DCAM representation exhibits markedly sharper energy localization than the conventional spectrogram. This enhanced definition facilitates more precise identification of mode-specific components, which is crucial for accurate source localization and damage characterization. Furthermore, the DCAM results unveil a distinct high-energy region at approximately 600–700 𝜇s across a broad frequency spectrum, which is substantially less discernible in the conventional spectrogram. This underscores DCAM's capability to highlight late-arriving wave components, potentially corresponding to reflections, mode conversions, or higher-order modes, which are critical in complex structural geometries. The superposition of the theoretical dispersion curves, denoted by white dotted curves, provides a physics-informed framework for signal interpretation. The alignment of high-activation regions in the DCAM representation with the theoretical arrival times of the 𝑆0 and 𝐴0 modes corroborates the method's effectiveness in compensating for dispersion effects, thereby improving the temporal localization of frequency components.

5.5.3 Quantitative Assessment of Segment Contributions

To quantitatively assess the contributions of different signal segments to the classification decisions made by our deep learning model, we propose a framework that leverages the activation maps generated by CAM, Grad-CAM, and DCAM. This framework is designed to be both specific and precise, ensuring that the analysis is grounded in the physical characteristics of the AE signals and the model's internal processes. The activation maps A(t) for the time domain and A(t, f) for the time-frequency domain represent the model's attention to different parts of the input signal during classification. High activation values indicate that the model considers those regions particularly important for making its decision.
By integrating these activation maps over predefined signal segments, we can quantify the total contribution of each segment to the classification outcome. For the time domain:

C_i = \frac{\int_{s_i} A(t) \, dt}{\int_{S} A(t) \, dt}   (5.9)

Similarly, for the time-frequency domain analysis, where the signal is represented as s(t, f), the segment contribution is:

C_i = \frac{\iint_{s_i} A(t, f) \, dt \, df}{\iint_{S} A(t, f) \, dt \, df}   (5.10)

This formulation directly measures the proportion of the model's total attention allocated to each segment. To enable comparison across different signals and segments, we further normalize these contributions:

C'_i = \frac{C_i}{\sum_i C_i}   (5.11)

This normalization ensures that the sum of all normalized contributions C'_i equals 1, allowing for a direct comparison of segment importance irrespective of the signal's absolute activation levels. However, segments vary in size, both in time duration \Delta t_i for time-domain signals and in area D_i for time-frequency representations. Larger segments might naturally accumulate more activation simply due to their size, not necessarily because they are more significant for classification. To account for this, we introduce the relative importance factor, which adjusts the normalized contribution by the proportion of the segment's size to the total signal size. For the time domain:

Importance_i = \frac{C'_i}{\Delta t_i / T}   (5.12)

where \Delta t_i is the duration of segment s_i and T is the total signal duration. For the time-frequency domain:

Importance_i = \frac{C'_i}{D_i / D}   (5.13)

where D_i is the area of segment s_i in the time-frequency plane, and D is the total area of the spectrogram. A higher value indicates that the segment contributes more to the classification decision than would be expected based on its size alone.
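For the time-domain case, Equations (5.9)–(5.12) amount to the following bookkeeping. The sketch assumes a 1D activation map and segment boundaries at the theoretical 𝑆0/𝐴0 arrival indices; the boundary indices and the activation map are placeholder values.

```python
import numpy as np

def segment_importance(A, boundaries):
    # Split the activation map at the S0/A0 arrival indices
    segments = np.split(A, boundaries)
    C = np.array([s.sum() for s in segments]) / A.sum()    # Eq. (5.9)
    C_norm = C / C.sum()                                   # Eq. (5.11)
    sizes = np.array([s.size for s in segments]) / A.size  # Delta t_i / T
    return C_norm / sizes                                  # Eq. (5.12)

A = np.abs(np.random.randn(1000))                # placeholder CAM over time
importance = segment_importance(A, [220, 480])   # hypothetical tau_S0, tau_A0 indices
```

The three returned values correspond to the Pre-𝑆0, transition, and Post-𝐴0 entries in the heatmaps analyzed next.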
The high concentration in the 𝑆0–𝐴0 region aligns with the physical understanding of mode interactions and their significance in damage characterization and source localization. The row in each heatmap shows that CAM has a slight preference for the 𝑆0–𝐴0 transition zone (0.4056), while Grad-CAM and DCAM indicate higher overall importance for the post-𝐴0 region (0.3220 and 0.9273, respectively). These findings underscore the value of employing multiple visualization techniques to gain a comprehensive understanding of the model’s decision- making process. By isolating mode-specific contributions and enhancing feature representation, this framework has the potential to improve source localization accuracy and advance damage characterization in SHM applications. 73 Figure 44 (a) Class Activation Mapping (CAM) Average Importance Heatmap (b) Gradient- weighted Class Activation Mapping (GCAM) Average Importance Heatmap (c) Dispersion Com- pensated Acoustic Monitoring (DCAM) Average Importance Heatmap 74 CHAPTER 6 CONCLUSIONS AND FUTURE WORK 6.1 Conclusions and Contribution This dissertation advances acoustic emission–based structural health monitoring by integrating simulation-driven techniques, domain adaptation, and explainable AI. Through comprehensive nu- merical modeling, robust machine learning architectures, and GAN-based data augmentation, the research addresses key challenges such as limited annotated data, imbalanced class distributions, and the gap between simulation and experiment. Multi-fidelity surrogate modeling further expands the applicability of these methods to electromagnetic inspection scenarios, broadening their impact on defect characterization and real-time monitoring. Overall, the proposed frameworks deliver improved AE source localization, damage classification, and interpretability, laying a strong foun- dation for broader deployment of data-driven SHM solutions. By fusing high-fidelity simulations with innovative learning strategies, the results demonstrate that even single-sensor deployments can achieve precise, explainable damage detection in complex engineering environments. These contributions collectively underline the importance of synergizing modeling, data augmentation, and interpretability for safer and more reliable infrastructure monitoring. i. A robust finite element modeling framework was developed to simulate pencil-lead break (PLB) and impact sources on aluminum plates. These high-fidelity simulations enabled effective pre-training of deep learning models, thereby reducing the burden on experimental data collection. ii. We bridged the simulation-to-reality gap using a domain adaptation pipeline that aligns features between domains. The approach uses MMD minimization and feature matching to enhance real-world performance while reducing the need for large labeled datasets. iii. Generative Adversarial Networks (GANs) were leveraged to mitigate data imbalance in dam- age classes and to increase the diversity of AE training samples. Various GAN architectures 75 (e.g., DCGAN, WGAN) were evaluated, revealing significant gains in localization and clas- sification accuracy when synthetic minority class samples were introduced. iv. By combining an Inception-based regression model with GAN-synthesized data, we im- proved AE localization. The method’s single-sensor design offers easy deployment without sacrificing detection quality. v. 
v. A framework that integrates low-fidelity (simulation) and high-fidelity (experimental) data for inspection techniques such as motion-induced and pulsed eddy current testing was proposed. These multi-fidelity surrogate models offer efficient yet accurate defect characterization in harsh electromagnetic environments.

vi. Class Activation Mapping (CAM), Grad-CAM, and Dimensional-wise Class Activation Mapping (DCAM) were introduced to visualize and interpret deep learning decisions. By highlighting crucial wave modes and time-frequency regions, the approach enhanced transparency and trust in the classification pipeline.

6.2 Future Work

6.2.1 Domain-Adaptive Framework

The methodologies and findings presented in this thesis open several promising avenues for future research and practical implementation. This chapter outlines key directions for extending the current work across five interconnected areas: domain-adaptive frameworks, multi-fidelity modeling, electromagnetic interference compensation, physics-guided explainable AI, and system-level integration. While the current domain adaptation studies predominantly focus on single-sensor setups, future work could examine attention-based sensor fusion to handle multi-sensor Acoustic Emission (AE) arrays. By learning to weight the most informative sensor signals dynamically, the model could adapt to diverse propagation paths and noise conditions, potentially yielding more precise and reliable localization performance. Incorporating wave propagation constraints into the loss function (e.g., penalizing predictions that deviate significantly from known Lamb wave arrival times) could better align model outputs with physical reality. Such constraints could be especially useful in cases where limited experimental data lead to overfitting or unrealistic wave mode representations.
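As one possible realization of such a constraint, the sketch below adds a penalty term to a standard training loss whenever the predicted source distance disagrees with the distance implied by the measured 𝑆0 arrival time. The formulation and all names are hypothetical illustrations of the idea, not a method implemented in this thesis.

```python
import torch

def physics_penalty(pred_dist, toa_s0, v_group_s0):
    # Distance implied by the measured S0 arrival: d = tau_S0 * v_g(S0)
    implied_dist = toa_s0 * v_group_s0
    return torch.mean((pred_dist - implied_dist) ** 2)

def total_loss(task_loss, pred_dist, toa_s0, v_group_s0, lam=0.1):
    # lam balances the data-fit term against the wave-propagation constraint
    return task_loss + lam * physics_penalty(pred_dist, toa_s0, v_group_s0)
```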
Recent advances in natural language processing (NLP) have demonstrated that large language models, such as ChatGPT, can learn transferable representations from extensive pre-training on diverse datasets. In future research, one promising direction is to investigate how these pretrained architectures might be repurposed for AE signal analysis. While ChatGPT is primarily trained on textual data, its underlying transformer-based structure could be adapted for sequential signal inputs through careful feature engineering. A key challenge is bridging the gap between GPT's linguistic knowledge and the unique nature of AE signals, which exhibit complex time-frequency patterns rather than human language. Fine-tuning ChatGPT (or similar LLMs) with domain-specific AE datasets could potentially boost performance on classification, source localization, or damage characterization tasks. The approach would involve mapping AE signal features into an input format compatible with transformer architectures, then retraining (or partially fine-tuning) the model on specialized corpora of annotated AE signals. Beyond classification and localization, there may be untapped opportunities to transfer the learned embeddings from a fine-tuned model to other AE tasks, such as anomaly detection, multi-sensor data integration, or real-time damage prognosis. By systematically exploring transfer learning strategies, researchers could adapt the same pretrained backbone to address a broad spectrum of AE problems, reducing both data requirements and development time. Investigating these directions may significantly expand the toolkit for data-driven NDE, allowing large language model paradigms to be harnessed for highly specialized AE signal interpretation and multi-domain SHM applications.

6.2.2 Multi-Fidelity Modeling and Signal Processing

Although multi-fidelity surrogate models reduce computational overhead, predictive uncertainty remains a challenge. Incorporating Bayesian methods (e.g., Gaussian process models with posterior distributions) or Monte Carlo dropout into the surrogate could provide confidence intervals for defect parameter estimates, giving operators clearer risk assessments. To reduce redundant high-fidelity simulations or experimental measurements, active learning or Bayesian optimization methods could direct sampling to the most uncertain regions of the parameter space. This would maximize information gain while minimizing overall testing costs. While the dissertation employs Gaussian Process Regression and Radial Basis Function approaches, deep neural architectures (e.g., deep Gaussian Processes or physics-informed neural networks) could better capture highly nonlinear phenomena such as crack branching or complex eddy current distributions.

6.2.3 Compensation Method

Figure 45 Pancake Eddy Current Sensor Array for Circumferential Pipe Inspection

In future studies, we plan to deploy a circumferential sensing system using pancake eddy current (EC) sensors or magnetometers arranged in a ring formation around the pipe. The schematic design in Figure 45 illustrates the multi-channel pancake eddy current sensor array wrapped around a pipe, including both exciter and receiver coils. The array is designed to detect circumferential defects and changes in wall thickness by inducing and measuring eddy currents along the pipe's surface. This configuration enables the detection and precise localization of internal feeder cables, determining both their offset distance and angular position. The system employs multiple pancake coils positioned at equal angular intervals around the pipe's outer circumference. Operating frequencies will be selected to achieve optimal penetration through the pipe wall, balancing detection sensitivity against noise interference. Special attention is given to maintaining consistent liftoff distances for each coil, accounting for any protective coatings such as wax tape. The measurement process follows a sequential activation pattern in which each coil serves as a transmitter while the others function as receivers. This approach generates a comprehensive set of transmitter-receiver coupling measurements. Each sensor captures voltage responses that encode information about the cable's location through characteristic field distortion patterns. The collected sensor data can be processed to determine the precise position of the internal cable.

BIBLIOGRAPHY

Ai, L., Soltangharaei, V., Bayat, M., Van Tooren, M., and Ziehl, P. (2021). Detection of impact on aircraft composite structure using machine learning techniques. Measurement Science and Technology, 32(8):084013.

Alam, M. U., Zaki, M., Ramachandra, V., et al. (2023). SHAMSUL: Systematic Holistic Analysis to Investigate Medical Significance Utilizing Local Interpretability Methods in Deep Learning for Chest Radiography Pathology Prediction. Nordic Machine Intelligence, 1(2):45–58.

Antipov, A. and Markov, A. (2018). 3D simulation and experiment on high speed rail MFL inspection. NDT & E International, 98:177–185.
Assarar, M., Scida, D., El Mahi, A., Poilâne, C., and Ayad, R. (2015). Monitoring of damage mechanisms in sandwich composite materials using acoustic emission. International Journal of Damage Mechanics, 24(5):787–804.

Bao, Y. (2023). Modeling of eddy current NDT simulations by kriging surrogate model. Research in Nondestructive Evaluation, 34:154–168.

Bengio, Y. (2012). Deep learning of representations for unsupervised and transfer learning. In Guyon, I., Dror, G., Lemaire, V., Taylor, G., and Silver, D., editors, Proceedings of the ICML Workshop on Unsupervised and Transfer Learning, volume 27 of JMLR Workshop and Conference Proceedings, pages 17–36. PMLR.

Bhuiyan, M. Y. and Giurgiutiu, V. (2017). Experimental and computational analysis of acoustic emission waveforms for SHM applications. Structural Health Monitoring, 16(5):608–620.

Bieri, O., Markl, M., and Scheffler, K. (2005). Analysis and compensation of eddy currents in balanced SSFP. Magnetic Resonance in Medicine, 54(1):129–137.

Bouzid, O. M., Tian, G. Y., Cumanan, K., and Neasham, J. (2015). Wireless AE event and environmental monitoring for wind turbine blades at low sampling rates. In Ohtsu, M., editor, Advances in Acoustic Emission Technology, pages 533–546. Springer.

Chadha, M., Yang, Y., Hu, Z., and Todd, M. D. (2023). Evolutionary sensor network design for structural health monitoring of structures with time-evolving damage. In Proceedings of the 14th International Workshop on Structural Health Monitoring (IWSHM), pages 368–380. Stanford University.

Chen, S.-Z. et al. (2021). Development of data-driven prediction model for CFRP-steel bond strength by implementing ensemble learning algorithms. Construction and Building Materials, 303:124470.

Chen, S. Z. and Feng, D. C. (2022). Multifidelity approach for data-driven prediction models of structural behaviors with limited data. Computer-Aided Civil and Infrastructure Engineering, 37(12):1566–1581.

Chen, X. and Lei, Y. (2015). Electrical conductivity measurement of ferromagnetic metallic materials using pulsed eddy current method. NDT & E International, 75:33–38.

Chen, X. and Liu, X. (2021). Pulsed eddy current-based method for electromagnetic parameters of ferromagnetic materials. IEEE Sensors Journal, 21(12):6376–6383.

Ciampa, F. and Meo, M. (2010). Acoustic emission source localization and velocity determination of the fundamental mode A0 using wavelet analysis and a Newton-based optimization technique. Smart Materials and Structures, 19(4):045002.

Cortês, G. d. S., da Silva Junior, D., de Carvalho, T. B., Soares, E. L., dos Santos Oliveira, J. C., and Sattler, M. A. (2023). Analysis of PEC technique and external magnetic fields for detection of corrosion under insulation: 3D finite element model. Concilium, 15(2):145–160.

Cuadra, J., Vanniamparambil, P. A., Servansky, D., Bartoli, I., and Kontsos, A. (2015). Acoustic emission source modeling using a data-driven approach. Journal of Sound and Vibration, 341:222–236.

Daugela, A., Chang, C.-L., and Peterson, D. (2021a). Deep learning based characterization of nanoindentation induced acoustic events. Materials Science and Engineering: A, 800:140273.

Daugela, A., Chang, C.-L., and Peterson, D. (2021b). Deep learning based characterization of nanoindentation induced acoustic events. Materials Science and Engineering: A, 800:140273.

De Almeida, V. A. D., Baptista, F. G., and De Aguiar, P. R. (2015).
Piezoelectric transducers assessed by the pencil lead break for impedance-based structural health monitoring. IEEE Sensors Journal, 15(2):693–702.

Delashmit, W. H. and Manry, M. T. (2005). Recent developments in multilayer perceptron neural networks. In Proceedings of the Seventh Annual Memphis Area Engineering and Science Conference (MAESC), pages 1–7. MAESC.

Dong, S., Yuan, M., Wang, Q., and Liang, Z. (2018). A modified empirical wavelet transform for acoustic emission signal decomposition in structural health monitoring. Sensors, 18(5):1645.

Eaton, M. J., Pullin, R., and Holford, K. M. (2012). Towards improved damage location using acoustic emission. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, 226(9):2141–2153.

Ebrahimkhanlou, A., Dubuc, B., and Salamone, S. (2019). A generalizable deep learning framework for localizing and characterizing acoustic emission sources in riveted metallic panels. Mechanical Systems and Signal Processing, 130:248–272.

Ebrahimkhanlou, A. and Salamone, S. (2017a). Acoustic emission source localization in thin metallic plates: A single-sensor approach based on multimodal edge reflections. Ultrasonics, 78:134–145.

Ebrahimkhanlou, A. and Salamone, S. (2017b). A probabilistic framework for single-sensor acoustic emission source localization in thin metallic plates. Smart Materials and Structures, 26(9):095026.

Ebrahimkhanlou, A. and Salamone, S. (2018a). A deep learning approach for single-sensor acoustic emission source localization in plate-like structures. Structural Health Monitoring, 17(5):1335–1351.

Ebrahimkhanlou, A. and Salamone, S. (2018b). Single-sensor acoustic emission source localization in plate-like structures: A deep learning approach. Health Monitoring of Structural and Biological Systems XII, 10600:106001O.

Ebrahimkhanlou, A. and Salamone, S. (2018c). Single-sensor acoustic emission source localization in plate-like structures using deep learning. Aerospace, 5(2):50.

Farcaş, I.-G. et al. (2023). Context-aware learning of hierarchies of low-fidelity models for multi-fidelity uncertainty quantification. Computer Methods in Applied Mechanics and Engineering, 406:115908.

Fawaz, H. I., Forestier, G., Weber, J., Idoumghar, L., and Muller, P.-A. (2018). Transfer learning for time series classification. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), pages 1367–1376. IEEE.

Fernández-Godino, M. G. et al. (2016). Review of multi-fidelity models. arXiv preprint arXiv:1609.07196.

Fernández-Godino, M. G. et al. (2019). Issues in deciding whether to use multifidelity surrogates. AIAA Journal, 57(5):2039–2054.

Forrester, A. I. J., Sóbester, A., and Keane, A. J. (2007). Multi-fidelity optimization via surrogate modelling. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 463(2088):3251–3269.

Fu, J., Ning, Z., and Chang, Y. (2021). Active compensation method for strong magnetic interference of MEMS electronic compass. IEEE Access, 9:48860–48872.

Garrett, J. C., Mei, H., and Giurgiutiu, V. (2022). An artificial intelligence approach to fatigue crack length estimation from acoustic emission waves in thin metallic plates. Applied Sciences, 12(3):1372.

Guillemé, M., Lemaitre, A., Devaux, M., and Lannoo, B. (2019). Agnostic local explanation for time series classification. In 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), pages 432–439. IEEE.
Haile, M., Zhu, E., Hsu, C. D., and Bradley, N. (2020a). Deep machine learning for detection of acoustic wave reflections. Structural Health Monitoring, 19(5):1340–1350.

Haile, M., Zhu, E., Hsu, C. D., and Bradley, N. (2020b). Deep machine learning for detection of acoustic wave reflections. Structural Health Monitoring, 19(5):1340–1350.

Hamstad, M. A. (2007). Acoustic emission signals generated by monopole (pencil-lead break) versus dipole sources: Finite element modeling and experiments. Journal of Acoustic Emission, 25:92–106.

Han, W., Liu, Y., Zhang, B., and Wang, Y. (2014). Fast estimation of defect profiles from the magnetic flux leakage signal based on a multi-power affine projection algorithm. Sensors (Basel), 14(9):16454–16466.

Hardy, R. L. (1971). Multiquadric equations of topography and other irregular surfaces. Journal of Geophysical Research, 76(8):1905–1915.

Hashim, K. A., Md Nor, N., and Idrus, J. (2021). Determination of acoustic emissions data characteristics under the response of pencil lead fracture procedure. Journal of Failure Analysis and Prevention, 21(6):2064–2071.

He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778. IEEE.

Holford, K. M., Pullin, R., Evans, S. L., Eaton, M. J., Hensman, J., and Worden, K. (2009). Acoustic emission for monitoring aircraft structures. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 223(5):525–532.

Hsu, T.-M. H., Chen, W.-Y., Hou, C.-A., Tsai, Y.-H. H., Yeh, Y.-R., and Wang, Y.-C. F. (2015). Unsupervised domain adaptation with imbalanced cross-domain data. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), pages 4121–4129. IEEE.

Huang, X., Elshafiey, O., Farzia, K., Udpa, L., Han, M., and Deng, Y. (2023). Acoustic emission source localization using deep transfer learning and finite element modeling-based knowledge transfer. Materials Evaluation, 81(7):71–84.

Huang, X., Elshafiey, O., Mukherjee, S., Karim, F., Zhu, Y., Udpa, L., Han, M., and Deng, Y. (2024a). Deep learning-assisted structural health monitoring: acoustic emission analysis and domain adaptation with intelligent fiber optic signal processing. Engineering Research Express, 6(2):025222.

Huang, X., Han, M., and Deng, Y. (2024b). A hybrid GAN-Inception deep learning approach for enhanced coordinate-based acoustic emission source localization. Applied Sciences, 14(19):8811.

Huang, X., Li, Z., Peng, L., Chu, Y., Miles, Z., Chakrapani, S. K., Han, M., Poudel, A., and Deng, Y. (2025). A novel multi-fidelity Gaussian process regression approach for defect characterization in motion-induced eddy current testing. NDT & E International, 150:103274.

Ismail-Fawaz, A., Forestier, G., Weber, J., Idoumghar, L., and Muller, P.-A. (2022). Deep learning for time series classification using new hand-crafted convolution filters. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data), pages 972–981. IEEE.

Ivaturi, P., Reiss, J., and Mahmood, F. (2021). A comprehensive explanation framework for biomedical time series classification. IEEE Journal of Biomedical and Health Informatics, 25:2398–2408.

Jain, N., Manikonda, L., Olmo Hernandez, A., Sengupta, S., and Kambhampati, S. (2018). Imagining an engineer: On GAN-based data augmentation perpetuating biases. arXiv preprint arXiv:1811.03751.
Janošek, M., Kopan, T., Zach, P., Janda, P., Mikirtek, P., Němec, M., and Ripka, P. (2019). Magnetic calibration system with interference compensation. IEEE Transactions on Magnetics, 55(1):1–6.

Jiang, X., Zhao, F., Ge, Z., Yang, J., Xie, X., and Shi, S. (2020). Implicit class-conditioned domain alignment for unsupervised domain adaptation. arXiv preprint arXiv:2006.04996.

Jones, M. R., Rogers, T., and Cross, E. J. (2022). Constraining Gaussian processes for physics-informed acoustic emission mapping. arXiv preprint arXiv:2206.01495.

Joseph, R. P. (2020). Acoustic emission and guided wave modeling and experiments for structural health monitoring and non-destructive evaluation. Unpublished work or thesis.

Jung, B. H., Kim, Y. W., and Lee, J. R. (2019). Laser-based structural training algorithm for acoustic emission localization and damage accumulation visualization in a bolt joint structure. Structural Health Monitoring, 18(6):1851–1861.

Kalivarapu, V. and Winer, E. (2008). A multi-fidelity software framework for interactive modeling of advective and diffusive contaminant transport in groundwater. Environmental Modelling & Software, 23(12):1370–1383.

Kampolis, I. C. and Giannakoglou, K. C. (2008). A multilevel approach to single- and multiobjective aerodynamic optimization. Computer Methods in Applied Mechanics and Engineering, 197(33–40):2963–2975.

Karim, F., Zhu, Y., and Han, M. (2021). Modified phase-generated carrier demodulation of fiber-optic interferometric ultrasound sensors. Optics Express, 29(16):25011–25021.

Kats, V. and Volkov, A. (2019). Features extraction from non-destructive testing data in cyber-physical monitoring system of construction facilities. Journal of Physics: Conference Series, 1312(1):012015.

Kats, V. and Volkov, A. (2020). Features extraction from non-destructive testing data in cyber-physical monitoring system of construction facilities. In Journal of Physics: Conference Series, volume 1425, page 012149. IOP Publishing.

Kennedy, M. C. and O'Hagan, A. (2000). Predicting the output from a complex computer code when fast approximations are available. Biometrika, 87(1):1–13.

Kontogiannis, S. G. et al. (2020). A comparison study of two multifidelity methods for aerodynamic optimization. Aerospace Science and Technology, 97:105592.

Kumar, D. et al. (2018). A wireless shortwave near-field probe for monitoring structural integrity of dielectric composites and polymers. NDT & E International, 96:9–17.

Leary, S. J., Bhaskar, A., and Keane, A. J. (2003). A knowledge-based approach to response surface modelling in multifidelity optimization. Journal of Global Optimization, 26:297–319.

Li, D., Chen, D., Goh, J., and Ng, S.-K. (2018). Anomaly detection with generative adversarial networks for multivariate time series. arXiv preprint arXiv:1809.04758.

Li, M., Zhou, Z., Xu, D., and Zhang, X. (2022). A new multi-fidelity surrogate modelling method for engineering design based on neural network and transfer learning. Engineering Computations, 39(7):2480–2498.

Li, X.-B., Zhao, X., and Zhang, Y. (2009). Numerical simulation and experiments of magnetic flux leakage inspection in pipeline steel. Journal of Mechanical Science and Technology, 23(1):109–113.

Li, Y., Tian, G. Y., and Ward, S. (2006). Numerical simulation on magnetic flux leakage evaluation at high speed. NDT & E International, 39(5):367–373.

Li, Y., Zhao, X., and Liu, J. (2021). Analysis of the anti-magnetic interference characteristics of the stacked magneto-optical current sensor and error compensation method.
In 2021 International Conference of Optical Imaging and Measurement (ICOIM), pages 243–249. IEEE.

Liu, C., Lou, Y., Liu, C., Yang, Q., Yang, Z., Zhang, Q., Sun, H., and Zhao, X. (2022). Synthesized magnetic field focusing for the non-destructive testing of oil and gas well casing pipes using pulsed eddy-current array. IEEE Transactions on Magnetics, 58(9):1–10.

Liu, C., Wu, X., Mao, J., and Liu, X. (2017a). Acoustic emission signal processing for rolling bearing running state assessment using compressive sensing. Mechanical Systems and Signal Processing, 91:395–406.

Liu, C., Wu, X., Mao, J., and Liu, X. (2017b). Acoustic emission signal processing for rolling bearing running state assessment using compressive sensing. Journal of Vibration and Acoustics, 139(5):051007.

Liu, G., Zhu, Y., Sheng, Q., and Han, M. (2020). Polarization-insensitive, omnidirectional fiber-optic ultrasonic sensor with quadrature demodulation. Optics Letters, 45(15):4164–4167.

Liu, T., Lai, X., Song, X., and Guo, Z. (2023). A multi-fidelity surrogate model by optimal model selection. In Proceedings of the International Conference on Automation Control, Algorithm, and Intelligent Bionics (ACAIB 2023), paper 127593F.

Mahajan, H. and Banerjee, S. (2023). Acoustic emission source localisation for structural health monitoring of rail sections based on a deep learning approach. Measurement Science and Technology, 34(4):044010.

Majidnia, S., Rudlin, J., and Nilavalan, R. (2014). Investigations on a pulsed eddy current system for flaw detection using an encircling coil on a steel pipe. Insight – Non-Destructive Testing and Condition Monitoring, 56(10):560–565.

Malekzadeh, M., Atia, G., and Catbas, F. N. (2015). Performance-based structural health monitoring through an innovative hybrid data interpretation framework. Journal of Civil Structural Health Monitoring, 5(3):287–305.

Myers, R. H., Montgomery, D. C., and Anderson-Cook, C. M. (2016). Response Surface Methodology: Process and Product Optimization Using Designed Experiments. Wiley, Hoboken, NJ, 4th edition.

Nafiah, F., Abidin, A. F. Z., Dzulkarnain, M. A., Abdullah, A. H., Yusoff, M. Z., and Jasni, M. Z. (2020). Pulsed eddy current: Feature extraction enabling in-situ calibration and improved estimation for ferromagnetic application. Journal of Nondestructive Evaluation, 39(3):40.

Nayebi, A., Karami, M., Gupta, P., Benesty, J., and Anjum, A. (2022). WindowSHAP: An efficient framework for explaining time-series classifiers based on Shapley values. arXiv preprint arXiv:2211.06507.

Park, G. S. and Park, S. H. (2004). Analysis of the velocity-induced eddy current in MFL type NDT. IEEE Transactions on Magnetics, 40(2):663–666.

Park, J. S. and Sandberg, I. W. (1991). Universal approximation using radial-basis-function networks. Neural Computation, 3(2):246–257.

Parvatharaju, P. S., Gupta, A., Gatterbauer, W., et al. (2021). Learning saliency maps to explain deep time series classifiers. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM), pages 3852–3856. ACM.

Piao, G., Li, Y., Tian, G. Y., Yin, W., and Wang, R. (2020). The effect of motion-induced eddy current on high-speed magnetic flux leakage (MFL) inspection for thick-wall steel pipe. Research in Nondestructive Evaluation, 31(1):48–67.

Ponikvar, D., Zupanič, E., and Jeglič, P. (2023). Magnetic interference compensation using the adaptive LMS algorithm. Electronics, 12(5):2360.

Qiu, X., Hu, X., Sun, W., and Chen, C. (2020).
Acoustic emission propagation characteristics and damage source localization of asphalt mixtures. Construction and Building Materials, 252:119074.

Rao, K. S. S., Rao, B. P. C., and Thirunavukkarasu, S. (2017). Development of pulsed eddy current instrument and probe for detection of sub-surface flaws in thick materials. IETE Technical Review, 34(5):572–578.

Rifai, D., Tian, G. Y., Sophian, A., and Al-Turki, Y. A. (2016). Giant magnetoresistance sensors: A review on structures and non-destructive eddy current testing applications. Sensors, 16(6):823.

Sause, M. G. R. (2011). Investigation of pencil-lead breaks as acoustic emission sources. Technical Report 29-184, University of Augsburg, Institute for Physics, Experimental Physics II, Augsburg, Germany.

Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2016). Grad-CAM: Visual explanations from deep networks via gradient-based localization. International Journal of Computer Vision, 128:336–359.

Sen, D. and Nagarajaiah, S. (2018). Data-driven approach to structural health monitoring using statistical learning algorithms. In Proceedings of the 9th European Workshop on Structural Health Monitoring, pages 295–305. European Workshop on Structural Health Monitoring.

Sereda, O. and Korol, O. (2022). The external magnetic field modeling features of electrical complexes and systems before and after its compensation. Bulletin of the National Technical University "KhPI". Series: Energy: Reliability and Energy Efficiency, 2022(4).

Shao, S., Wang, P., and Yan, R. (2019). Generative adversarial networks for data augmentation in machine fault diagnosis. Computers in Industry, 106:85–93.

Shawi, R. E., Li, J., Yan, Q., Navab, N., and Albarqouni, S. (2019). Interpretability in healthcare: A comparative study of local machine learning interpretability techniques. In 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS), pages 275–280. IEEE.

Shi, W., Zhu, R., and Li, S. (2022). Pairwise adversarial training for unsupervised class-imbalanced domain adaptation. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pages 1276–1285. ACM.

Shu, L., Songling, H., and Wei, Z. (2007). Development of differential probes in pulsed eddy current testing for noise suppression. Sensors and Actuators A: Physical, 135(2):675–679.

Simonyan, K. and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.

Singh, A., Sengupta, S., and Lakshminarayanan, V. (2020). Explainable deep learning models in medical image analysis. Journal of Imaging, 6(6):52.

Singh, P. and Sharma, A. (2022). Interpretation and classification of arrhythmia using deep convolutional network. IEEE Transactions on Instrumentation and Measurement, 71:1–12.

Sophian, A., Tian, G. Y., and Fan, M. (2017). Pulsed eddy current non-destructive testing and evaluation: A review. Chinese Journal of Mechanical Engineering, 30(3):500–514.

Sorin, V., Barash, Y., Konen, E., and Klang, E. (2020). Creating artificial images for radiology applications using generative adversarial networks (GANs) – a systematic review. Academic Radiology, 27(8):1175–1185.

Stein, M. L. (1999). Interpolation of Spatial Data: Some Theory for Kriging. Springer Series in Statistics. Springer, New York, NY.

Strantza, M., Bianchi, G., Mahato, B., Mencattelli, L., Weaver, P., Dulieu-Barton, J. M., and Potter, K. (2015).
Evaluation of SHM system produced by additive manufacturing via acoustic emission and other NDT methods. Sensors, 15:26709–26725.

Sun, J., Li, K., Wang, M., Hou, Z., Li, X., and Lin, Z. (2021). Domain adaptation with geometrical preservation and distribution alignment. Neurocomputing, 454:152–167.

Sun, L., Chen, J., Xu, Y., Gong, M., Yu, K., and Batmanghelich, K. (2022). Hierarchical amortized GAN for 3D high resolution medical image synthesis. IEEE Journal of Biomedical and Health Informatics, 26(9):3966–3975.

Sun, R. (2020). Optimization for deep learning: An overview. Journal of the Operations Research Society of China, 8(2):249–294.

Takagi, T., Fukutomi, H., and Tani, J. (1998). Numerical evaluation of correlation between crack size and eddy current testing signal by a very fast simulator. IEEE Transactions on Magnetics, 34(5):2581–2584.

Ulapane, N., Mi, J., Zhu, Y., and Long, M. (2018). Non-destructive evaluation of ferromagnetic material thickness using pulsed eddy current sensor detector coil voltage decay rate. NDT & E International, 100:108–114.

van der Maaten, L. and Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9:2579–2605.

Verstrynge, E., Schueremans, L., Van Gemert, D., and Wevers, M. (2021). A review on acoustic emission monitoring for damage detection in masonry structures. Construction and Building Materials, 268:121089.

Vincent, P., Larochelle, H., Bengio, Y., and Manzagol, P.-A. (2008). Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning (ICML), pages 1096–1103. ACM.

Wang, H., Li, P., Lang, X., Tao, D., Ma, J., and Li, X. (2023). FTGAN: A novel GAN-based data augmentation method coupled time–frequency domain for imbalanced bearing fault diagnosis. IEEE Transactions on Instrumentation and Measurement, 72:1–14.

Wang, J., Wen, S., Yao, Y., Zhao, X., Gao, K., Gao, L., and Li, P. (2021). Discriminative feature alignment: Improving transferability of unsupervised domain adaptation by Gaussian-guided latent alignment. Pattern Recognition, 116:107936.

Wang, R., Zhang, Y., Li, Y., Tian, G. Y., and Yin, W. (2020). Motion induced eddy current based testing method for the detection of circumferential defects under circumferential magnetization. International Journal of Applied Electromagnetics and Mechanics, 64(1-4):501–508.

Wang, Z., Yan, W., and Oates, T. (2016). Time series classification from scratch with deep neural networks: A strong baseline. arXiv preprint arXiv:1611.06455.

Weiss, K., Khoshgoftaar, T. M., and Wang, D. (2016). A survey of transfer learning. Journal of Big Data, 3(1):9.

Wevers, M. and Lambrighs, K. (2009). Applications of acoustic emission for SHM: A review. In Chang, F.-K. and Sohn, H., editors, Encyclopedia of Structural Health Monitoring, pages 289–302. Wiley.

Wickstrøm, K., Kampffmeyer, M., and Jenssen, R. (2020). Uncertainty-aware deep ensembles for reliable and explainable predictions of clinical time series. IEEE Journal of Biomedical and Health Informatics, 25:2435–2444.

Wilson, J. W. and Tian, G. Y. (2006). 3D magnetic field sensing for magnetic flux leakage defect characterisation. Insight – Non-Destructive Testing and Condition Monitoring, 48(6):357–359.

Wu, J., Xing, D., Song, H., Lin, X., and Chen, W. (2020). Acoustic emission signal classification using feature analysis and deep learning neural network. Fluctuation and Noise Letters, 19:2150030.

Wu, Y. and Huang, X. (2022).
Unsupervised reinforcement adaptation for class-imbalanced text classification. arXiv preprint arXiv:2205.13139.

Xiao, N. and Zhang, L. (2021). Dynamic weighted learning for unsupervised domain adaptation. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15237–15246. IEEE.

Zahir, M. K. and Gao, Z. (2013). Variable-fidelity optimization with design space reduction. Chinese Journal of Aeronautics, 26(4):841–849.

Zaini, M. H., Hussin, F. J., Mohamed, Z. A., Yusof, M. H., and Marghany, M. (2021). Extraction of flux leakage and eddy current signals induced by submillimeter backside slits on carbon steel plate using a low-field AMR differential magnetic probe. IEEE Access, 9:146755–146770.

Zhang, J. et al. (2019). A comparative study between magnetic field distortion and magnetic flux leakage techniques for surface defect shape reconstruction in steel plates. Sensors and Actuators A: Physical, 288:10–20.

Zhang, Y., Lee, J. D., Wainwright, M. J., and Jordan, M. I. (2017). On the learnability of fully-connected neural networks. In Singh, A. and Zhu, J., editors, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), volume 54 of Proceedings of Machine Learning Research, pages 83–91. PMLR.

Zhang, Y., Li, G., Wang, H., Wang, S., and Liu, Y. (2022). Multi-scale signed recurrence plot based time series classification using inception architectural networks. Pattern Recognition, 123:108402.

Zhang, Y., Yang, Y., Liu, G., Wu, Q., and Bai, X. (2021). Grad-CAM helps interpret the deep learning models trained to classify multiple sclerosis types using clinical brain magnetic resonance imaging. Journal of Neuroscience Methods, 353:109095.

Zhao, H., Zhang, W., Li, X., Yang, X., and Liu, Y. (2023). A MFL mechanism-based self-supervised method for defect detection with limited labeled samples. IEEE Transactions on Instrumentation and Measurement, 72:1–10.

Zhiye, D., Jiangjun, R., and Shifeng, Y. (2005). 3D MFL of steel pipe computation based on nodal-edge element coupled method. In Proceedings of the 6th International Conference on Electromagnetic Field Problems and Applications (ICEF), pages 252–256. IEEE.

Zhong, K.-Z. and Chen, J. (2023). Experimental applying acoustic emission to fault diagnosis and prediction of autonomous devices. In Proceedings of the 2023 Sixth International Symposium on Computer, Consumer and Control (IS3C), pages 1–4. IEEE.

Zhu, Y., Sheng, Q., and Han, M. (2020). Effect of laser polarization on fiber Bragg gratings Fabry-Perot interferometer for ultrasound detection. IEEE Photonics Journal, 12(4):1–9.

Appendix A

To visualize the distribution of synthetic (augmented) and original datasets in a lower-dimensional space, we employed t-Distributed Stochastic Neighbor Embedding (t-SNE) van der Maaten and Hinton (2008). t-SNE is a powerful technique for visualizing high-dimensional data by mapping each datapoint to a two- or three-dimensional space. While t-SNE was originally designed for static data, it has been adapted for use with time series data in some cases. Visualizing AE data can be challenging due to its complexity and high dimensionality. However, t-SNE can map time series data onto a low-dimensional space while preserving its underlying structure. To apply t-SNE to time series data, we first need to transform the sequential nature of the data into a set of fixed-length feature vectors that can be used as input to t-SNE.
This can be done using various techniques such as sliding windows or feature extraction methods like Fourier or wavelet transforms. Once we have transformed the time series data into feature vectors, we can compute pairwise similarities between them using a Gaussian kernel:

p_{i,j} = \frac{\exp\left(-\|x_i - x_j\|^2 / 2\sigma^2\right)}{\sum_{k} \sum_{l \neq k} \exp\left(-\|x_k - x_l\|^2 / 2\sigma^2\right)}   (1)

where x_i and x_j are two feature vectors, \sigma is a parameter that controls the width of the Gaussian kernel, and p_{i,j} is the probability that x_i would pick x_j as its neighbor if neighbors were picked in proportion to their probability density under a Gaussian centered at x_i. Next, we compute pairwise similarities between points in the low-dimensional map using a Student-t distribution:

q_{i,j} = \frac{\left(1 + \|y_i - y_j\|^2\right)^{-1}}{\sum_{k} \sum_{l \neq k} \left(1 + \|y_k - y_l\|^2\right)^{-1}}   (2)

where y_i and y_j are two points in the low-dimensional map, and q_{i,j} is the probability that y_i would pick y_j as its neighbor if neighbors were picked uniformly at random from all other points. Finally, t-SNE minimizes the difference between these two distributions using gradient descent on a cost function that measures their divergence:

KL(P \| Q) = \sum_i \sum_j p_{i,j} \log \frac{p_{i,j}}{q_{i,j}}   (3)
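In practice these steps are handled by standard implementations; the sketch below uses scikit-learn's TSNE on placeholder feature vectors for the original and GAN-augmented datasets, producing the kind of two-dimensional embedding plotted in Figures 46 and 47.

```python
import numpy as np
from sklearn.manifold import TSNE

real_feats = np.random.randn(200, 64)    # placeholder original-data features
synth_feats = np.random.randn(200, 64)   # placeholder GAN-augmented features

X = np.vstack([real_feats, synth_feats])
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(X)

real_2d, synth_2d = emb[:200], emb[200:]  # plot by color (blue vs. red)
```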
WGAN shows a unique pattern in which the augmented data form distinct clusters that encompass the original data points, suggesting that it captures the overall distribution well but may over-segment the data space. DCGAN and TSAGAN both show good integration of augmented and original data, with TSAGAN exhibiting a slightly more uniform distribution. In the last image, noise-based augmentation produces a distinct pattern in which the augmented data form concentric circles around the original data points, indicating that a simple additive-noise approach does not capture the underlying data distribution as effectively as GAN-based methods. These visualizations highlight the effectiveness of GAN-based augmentation techniques in generating high-quality synthetic data that closely mimics the original dataset. The increasing overlap and similarity in distribution between original and augmented data points across the different GAN architectures demonstrate their capability to produce diverse yet representative samples, providing a robust foundation for deep learning models and ensuring balanced representation across all labels.

Figure 46 t-SNE visualizations at six training epochs (1 to 2000) for the standard GAN, illustrating how synthetic (red) points progressively converge toward the distribution of the original (blue) data in low-dimensional space

Figure 47 t-SNE visualization of synthetic and original datasets for the different augmentation methods

APPENDIX B

It is important to emphasize that Lamb waves in thin plates (or thin-walled structures) are inherently dispersive, meaning that the phase and group velocities vary with frequency Ciampa and Meo (2010); Joseph (2020). As a result, for each frequency component there can be multiple wave modes (commonly labeled S0, A0, S1, A1, etc.) whose velocities also depend on the product of frequency and plate thickness. For many practical applications, especially in the low-to-medium frequency range, the fundamental modes S0 (symmetric) and A0 (antisymmetric) dominate the wavefield. Moreover, the phase velocity c_p in Equation (5) should be interpreted as c_p(f), i.e., a function of frequency f, further reinforcing the dispersive nature of Lamb waves. This dispersion can lead to significant spreading of wave packets during propagation, which is an important consideration when analyzing signals in structural health monitoring or acoustic emission testing.

Below is an example of deriving the dispersion curves. The first step is to define the material properties and parameters, which are needed for the subsequent calculations of the phase and group velocities.

Table 1 Material Properties and Parameters

Quantity          Name   Expression
Young's modulus   E      206 GPa
Poisson's ratio   ν      0.3
Density           ρ      2700 kg/m³
Thickness         d      0.1 inch

The dispersion relation for Lamb waves in a thin plate is given by the following transcendental equation:

\tan(\beta d) = \frac{4\beta^2 k^2 - 2k^2\beta^2}{(2\beta^2 - k^2)(k^2 - \beta^2)}  (4)

where β is the wavenumber in the material, k is the wavenumber in the fluid, and d is the thickness of the plate. The wavenumbers are defined as

\beta = \frac{\omega}{c_p} \quad \text{and} \quad k = \frac{\omega}{c_f}  (5)

where ω is the angular frequency, c_p is the phase velocity in the material, and c_f is the phase velocity in the fluid. To find the dispersion curves, the dispersion relation is solved numerically: a range of frequencies is considered, and for each frequency the relation is solved with a bisection method to obtain the corresponding phase velocity.
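The following Python sketch illustrates this frequency sweep. It assumes the transcendental relation exactly as printed in Equation (4) and an illustrative fluid phase velocity c_f (its value is not specified in the text); it is a minimal demonstration of root bracketing plus bisection-type refinement, not the implementation used in this work.

```python
import numpy as np
from scipy.optimize import brentq

# Parameters from Table 1; thickness converted from 0.1 inch to meters.
d = 0.1 * 0.0254
c_f = 1480.0  # assumed fluid phase velocity (m/s); not given in the text

def dispersion_residual(c_p, f):
    """Residual of Equation (4): zero when (f, c_p) lies on a branch."""
    omega = 2.0 * np.pi * f
    beta, k = omega / c_p, omega / c_f          # Equation (5)
    lhs = np.tan(beta * d)
    rhs = (4.0 * beta**2 * k**2 - 2.0 * k**2 * beta**2) / (
        (2.0 * beta**2 - k**2) * (k**2 - beta**2))
    return lhs - rhs

freqs = np.linspace(0.05e6, 2.0e6, 400)         # 0-2 MHz sweep
c_grid = np.linspace(500.0, 10000.0, 2000)      # candidate phase velocities
phase_velocity = np.full_like(freqs, np.nan)

for i, f in enumerate(freqs):
    res = np.array([dispersion_residual(c, f) for c in c_grid])
    # Locate the first sign change, then refine the bracketed root by
    # bisection-type iteration (Brent's method). Note: sign changes at
    # tan() poles are spurious and would need filtering in practice.
    idx = np.where(np.sign(res[:-1]) != np.sign(res[1:]))[0]
    if idx.size:
        j = idx[0]
        phase_velocity[i] = brentq(dispersion_residual,
                                   c_grid[j], c_grid[j + 1], args=(f,))

# Group velocity v_g = d(omega)/dk (Equation (6)), approximated by
# finite differences along the computed branch, with k = omega / c_p.
omega = 2.0 * np.pi * freqs
k_branch = omega / phase_velocity
group_velocity = np.gradient(omega, k_branch)
```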
This numerical sweep is carried out separately for each mode of interest (e.g., S0 and A0). Once the phase velocity c_p(f) is obtained, one can directly observe how the velocity changes with frequency, yielding the dispersion curves for the plate. The thickness d = 0.1 inch (approximately 2.54 mm) ensures that the lower-order modes are typically the most relevant in this frequency range. Furthermore, to assess energy-transport properties, the group velocity can be calculated from the derivative of ω with respect to k. In practice, the group velocity is critical for interpreting wave arrival times and energy propagation in nondestructive testing applications such as acoustic emission. Because of the strong frequency dependence, both the phase- and group-velocity curves must be considered when analyzing wave propagation in thin plates. The group velocity is calculated from the phase velocity using the following formula:

v_g = \frac{d\omega}{dk}  (6)

where v_g is the group velocity, ω is the angular frequency, and k is the wavenumber. The derivative is approximated using finite differences, as in the sketch above. The dispersion curves are then plotted as phase/group velocity versus frequency, with the symmetric and antisymmetric modes plotted separately.

Figure 48 Dispersion curves of two fundamental Lamb wave modes (blue and red) plotted as velocity versus frequency

The phase (or group) velocity dispersion curves for the two Lamb wave modes as a function of frequency in the 0–2 MHz range are shown in Figure 48. The blue curve exhibits a noticeable drop in velocity around the mid-frequency range, then rises again toward higher frequencies, indicating strong dispersion.

APPENDIX C

Multi-fidelity methods have been applied across various engineering domains to improve modeling accuracy and computational efficiency. Some approaches focus on enhancing visualization and interactivity in complex, heterogeneous systems, particularly for groundwater or fluid flows, by combining analytical elements with intuitive 3D user interfaces. Others leverage surrogate-based optimization tools to handle industrial design problems, using trust-region or expected-improvement methods to achieve effective trade-offs between solution quality and computational cost. Neural network and kriging-based techniques also appear, in which inexpensive (low-fidelity) models provide "knowledge" that refines the training of more accurate (high-fidelity) surrogates, significantly reducing the number of expensive evaluations. In structural applications, ensemble machine learning has demonstrated high accuracy for bond strength predictions, while multi-level frameworks can manage different model complexities at each stage of aerodynamic or structural design. To tackle high-dimensional problems, some strategies pinpoint smaller, promising regions using low-fidelity data, then refine those regions with high-fidelity samples, saving considerable resources. In addition, machine learning-based multi-fidelity approaches can maintain robust performance even when high-fidelity data are scarce, provided the low-fidelity sources capture the relevant trends. Finally, context-aware Monte Carlo methods on high-performance computing (HPC) systems show how speedups of several orders of magnitude can be achieved by balancing training costs and variance-reduction strategies. A minimal sketch of the low-to-high-fidelity correction idea common to several of these approaches is given below; the surveyed studies are then summarized in Table 2.
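The sketch below (Python, assuming scikit-learn) fits a Gaussian process to abundant low-fidelity samples and a second GP to the discrepancy at a few high-fidelity points. This is a simplified additive variant of the Kennedy–O'Hagan correction with the scaling factor fixed at one, not the specific formulation of any study in Table 2; the one-dimensional functions and sample locations are hypothetical.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Hypothetical 1-D models: a cheap LF trend and an expensive HF "truth".
def lf_model(x):
    return np.sin(8.0 * x)

def hf_model(x):
    return 1.1 * np.sin(8.0 * x) + 0.2 * x

x_lf = np.linspace(0.0, 1.0, 40)[:, None]                  # many LF samples
x_hf = np.array([0.05, 0.30, 0.55, 0.80, 0.95])[:, None]   # few HF samples

# Step 1: GP surrogate of the abundant low-fidelity data.
gp_lf = GaussianProcessRegressor(kernel=ConstantKernel() * RBF())
gp_lf.fit(x_lf, lf_model(x_lf.ravel()))

# Step 2: GP on the LF-to-HF discrepancy (additive correction, rho = 1).
delta = hf_model(x_hf.ravel()) - gp_lf.predict(x_hf)
gp_delta = GaussianProcessRegressor(kernel=ConstantKernel() * RBF())
gp_delta.fit(x_hf, delta)

# Multi-fidelity prediction: LF trend plus the learned correction,
# so the few HF samples only need to resolve the (smoother) discrepancy.
x_test = np.linspace(0.0, 1.0, 200)[:, None]
y_mf = gp_lf.predict(x_test) + gp_delta.predict(x_test)
```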
Table 2 Summary of multi-fidelity approaches in various engineering applications

Kalivarapu and Winer (2008)
Aims/Focus: Develop a multi-fidelity framework for modeling advective and diffusive transport; enhance visualization and interactivity for complex, heterogeneous groundwater flows.
Analysis Technique/Procedure: Utilizes the Superblock Analytical Element Method (AEM); offers an intuitive 3D GUI for setup and visualization; desktop/VR deployment.
Major Findings/Limitations: More accurate representation of heterogeneous flows than FD/FE methods; user-friendly scenario creation; may require significant computational setup.

Kontogiannis et al. (2020)
Aims/Focus: Compare multi-objective, multi-fidelity surrogate-based optimization tools; evaluate industrial aerodynamic design cases.
Analysis Technique/Procedure: Trust-region vs. expected-improvement approach; comparison with single-fidelity and co-Kriging; complex aerodynamic test cases.
Major Findings/Limitations: Trust-region yields quick solutions with good Pareto efficiency; expected improvement explores better; performance depends on HPC resources.

Leary et al. (2003)
Aims/Focus: Reduce computational costs via multi-fidelity approaches; incorporate cheap-model knowledge into training.
Analysis Technique/Procedure: Neural networks and knowledge-based kriging; uses cheap model outputs as prior information; LF data used to improve HF learning.
Major Findings/Limitations: Knowledge-based kriging matches neural alternatives; significant computational savings; success depends on LF model quality.

Chen et al. (2021)
Aims/Focus: Develop data-driven ensemble ML models for CFRP-steel bond strength; compare ensemble methods vs. traditional ML.
Analysis Technique/Procedure: Ensemble learning (GBDT, RF); comparison of multiple ML algorithms; variable-importance analysis.
Major Findings/Limitations: Best accuracy (R² = 0.98); effective ensemble methods; requires a large dataset.

Kampolis and Giannakoglou (2008)
Aims/Focus: Propose a multilevel framework for aerodynamic optimization; integrate different evaluation software.
Analysis Technique/Procedure: Multilevel structure with varying strategies; metamodel-assisted evolution; gradient-based refinement.
Major Findings/Limitations: Efficient Pareto-front search; good solution refinement; complex implementation.

Zahir and Gao (2013)
Aims/Focus: Reduce the design space for high-dimensional optimization; use an LF model for region refinement.
Analysis Technique/Procedure: LF model identifies promising regions; LF-to-HF surrogate modeling; iterative update strategy.
Major Findings/Limitations: 39% computational cost reduction; high-fidelity results in refined regions; limited to specific problems.

Chen and Feng (2022)
Aims/Focus: Improve structural behavior prediction; address limited HF data challenges.
Analysis Technique/Procedure: ML-based multi-fidelity method; LF-HF data integration; RC beam case study.
Major Findings/Limitations: Enhanced prediction accuracy; depends on LF data quality; additional calibration needed.

Farcaş et al. (2023)
Aims/Focus: Present a context-aware multi-fidelity Monte Carlo method; optimize training vs. sampling costs; handle multiple LF models.
Analysis Technique/Procedure: Multi-fidelity Monte Carlo with variance reduction; cost-balanced training; HPC plasma simulation.
Major Findings/Limitations: Speedup from 72 days to 4 hours; HPC viability; efficient variance reduction; HPC-specific application.