LEARNING PARADIGMS FOR THE IDENTIFICATION OF ELASTIC PROPERTIES OF COMPOSITES USING ULTRASONIC GUIDED WAVES

By

Karthik Gopalakrishnan

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Electrical Engineering Master of Science

2020

ABSTRACT

Identification of elastic properties of composites is relevant for both nondestructive materials characterization and in-situ condition monitoring to assess and predict any possible material degradation. Learning paradigms have been well explored for the detection and characterization of defects in safety-critical structures, but are relatively unexplored for structural materials characterization. In this thesis we propose a learning paradigm that includes the potential use of Machine Learning (ML) and Deep Learning (DL) algorithms to solve the inverse problem of material property identification using ultrasonic guided waves. The propagation of guided waves in a composite laminate is modelled using two different modelling techniques as part of the forward problem. Here, we use the two fundamental guided wave modes, i.e. the anti-symmetric (A0) and symmetric (S0) modes, as features for the proposed learning models. As part of the inverse problem, different learning models are used to map the feature space to a target space that consists of the material properties of composites. The performance of the algorithms is evaluated using different metrics, and it is seen that the networks are able to learn the mapping and generalize well to unseen examples even in the presence of noise at various levels.
Overall, we are able to develop a complete framework consisting of many interlinked data processing algorithms that can effectively estimate and predict the material properties of any given composite.

ACKNOWLEDGEMENTS

I would like to first acknowledge my advisor, Dr Yiming Deng, for his unwavering support and guidance throughout this journey. The amount of freedom he gave me in pursuing research in general was fantastic, and I have been able to grow my scientific temper greatly working under him. His work ethic and approach to science inspire me and will certainly guide me in the future. I would also like to express my deepest thanks to the members of my committee, Dr Lalita Udpa and Dr Mi Zhang, for always being available and guiding me throughout this journey. In particular, Dr Lalita Udpa has been of immense help to me in the last couple of years. Apart from the invaluable technical inputs that I have received from her, she has always gone out of her way to look out for me and make sure I am doing well. I really appreciate and am grateful for the opportunity of working and interacting with her on a daily basis.

I would also like to thank all the members of the Nondestructive Evaluation Laboratory (NDEL). Having a large research group at NDEL has been very beneficial and has led to many productive and fun conversations. In particular, I would like to thank Srijan, Subrata, Bharath, Vivian, Guillermo and Vivek for the banter, and for the various conversations we have had ranging from football to food and what not. Without them, going to the lab would have become routine and mundane. I would also like to thank all the faculty members associated with the NDEL who have in some way or other enriched my experience here.

When I initially moved to Michigan State University, I had the opportunity to work with Dr Sunil Kishore Chakrapani for a year.
I would like to express my deepest gratitude towards him for placing his trust in me when I was just a fresh undergraduate from India. I learnt a great deal about conducting rigorous experiments under him, and also learnt a lot about writing good scientific documents. I am inspired by his approach to science and research in general, and I will always carry his teachings and his advice with me. I also want to take this opportunity to thank Dr Shankar Balasubramaniam for always being available and giving me valuable advice on choosing my career path.

I have had the pleasure of being a teaching instructor for two different courses during my time here. I would like to thank Dr Gregory Wierzba and Dr Sunil Kishore Chakrapani for being wonderful supervisors, always letting me perform my duties without a lot of interference, but at the same time always being available in case of any trouble. I also took many interesting courses during my time here, and I would like to thank all the teachers who taught the courses for their efforts and endeavors.

Also, big thanks to all the staff at the Dept. of Electrical and Computer Engineering. In particular, I am eternally grateful to Meagan; I have troubled her multiple times with multiple different requests and she has always solved them for me in a very timely manner and with a smile on her face. I also want to thank Laurene, for always processing my time sheets on time even if I submitted them a day late. Big thanks to Brian, for always being available for any technical help or requests that I have had over the last two years. People like them are the unsung heroes of every department.

Moving on to the academic collaborators, I would like to first thank Magna International Inc. I was fortunate to work on a very interesting project funded by them, where I learnt the basics of adhesion and adhesion bonding in materials.
I also worked on a project funded by the US Department of Transportation, where I worked on evaluating the integrity of oil and gas pipelines. As part of this work, I also had the opportunity to present my work in an R&D seminar held by them. I would like to thank them for providing me with such an amazing opportunity.

Finally, I would like to express my biggest thanks to Mr Mahindra Rautela, who has greatly helped me with the work that has eventually become my dissertation. A lot of the work that I have presented in my thesis is in collaboration with him, and I have enjoyed working with him all through. His drive, passion and zeal towards research and technology are something that I am inspired by.

Before moving here, I interacted with a lot of different people who have helped me shape my career. In particular, I would like to thank Dr G K Ananthsuresh and Dr Suresh Sundaram for taking me under their wing at IISc and NTU respectively, when I was just a curious undergraduate trying to understand what research was all about. I also want to thank Dr Senthil Jayavelu, who helped me with a lot of my work for my undergraduate thesis, which really helped me develop a firm interest in the area of Deep Learning. I also want to thank all my teachers at the Amrita University, Bangalore Campus, from where I obtained my undergraduate degree.

Moving outside the academic sphere, I have had the pleasure of meeting some wonderful people who have all been a big part of my graduate life. Cricket is a big part of who I am, and I have had the opportunity of playing a good level of cricket at Michigan. I would like to thank the entire cricket family here at Michigan for giving me an opportunity to have something else to look forward to in life. I also want to thank the entire Desi Boys group: Sai Chaintanya, Ankit, Hitesh, Ronak, Bharath, Yash, Abhiroop, Siddartth and Abu Bakr for their company through the last two years.
A lot of my weekends went by chilling with them over food and drinks, watching random movies or playing board games. Moving to MSU, I was able to meet one of the best friends of my life, Sejal. I am thankful for all the amazing times we have spent together. From watching movies and cooking meals to playing and fighting over board games, I want to thank you for your companionship and for always being there no matter what. I would also like to thank Manali, who along with Sejal accompanied me on multiple trips all across Michigan. I also want to thank all my friends back home in India, who have all indirectly been a part of this wonderful journey.

Finally, but most importantly, I am eternally grateful to my parents, Gopalakrishnan and Anuradha. Your support has been unwavering, and I am who I am because of you and your nonstop, consistent support. I am very fortunate to have a dad who is himself a revered professor. I want to thank him for always taking out time to review my presentations, reports and papers, and for always giving me feedback to improve them. I want to thank my mom for always willingly taking in all my rants and being a big emotional support. I would like to thank my sister Keerthana, for always being the sibling whose leg I could always pull. Nothing would have been possible if not for you guys. Thanks for everything!
LIST OF TABLES

Table 5.1 Material Properties of a transversely isotropic lamina
Table 5.2 Wave velocities: SFEM vs FEM
Table 6.1 Secant Sensitivity Values
Table 7.1 Material Properties of the composites used for the ML based study
Table 8.1 1DCNN Architecture
Table 8.2 Prediction Results from the 1DCNN
Table 8.3 LSTM Architecture
Table 8.4 DNN (MLP) Architecture
Table 8.5 Prediction Results using DNN

LIST OF FIGURES

Figure 1.1 Elements of Structural Health Monitoring
Figure 1.2 (a) Forward Process (b) Inverse Process
Figure 1.3 Ultrasonic Guided Wave response for a healthy pipeline vs a faulty pipeline with a single 1mm deep corrosion pit
Figure 1.4 Interdependence of the forward and inverse process
Figure 1.5 Active SHM Setup of a CFRP Composite Plate
Figure 1.6 Overall Workflow of the Composite Property Estimation Framework
Figure 2.1 (a) RV for longitudinal material property determination (b) RV for transverse material property determination
Figure 2.2 Co-ordinate system for a laminate
Figure 2.3 Principal axes of the laminate and global x-y axes
Figure 2.4 An 8 layered unidirectional composite laminate
Figure 3.1 Bulk Wave Testing vs Guided Wave Testing
Figure 3.2 2D infinite plate of thickness 2d
Figure 3.3 Phase Velocity Dispersion Plot for 1mm thick Steel Plate
Figure 3.4 Group Velocity Dispersion Plot for 1mm thick Steel Plate
Figure 3.5 Elementary laminate Composite Waveguide with co-ordinate system
Figure 4.1 Machine Learning and Deep Learning
Figure 4.2 Pipeline of a Neuron
Figure 4.3 Three Layered MLP Network
Figure 4.4 Overfitting and Underfitting [31]
Figure 4.5 Example of Data Augmentation [32]
Figure 4.6 Working of a Convolutional Layer
Figure 4.7 Example of weight sharing in Convolutional Layers
Figure 4.8 Working of a Max Pooling Layer (stride=2)
Figure 4.9 (a) Simple Feed Forward Network (b) A RNN with previous hidden states as input
Figure 4.10 Architecture of a simple LSTM Network [48]
Figure 4.11 Vanilla based LSTM Cell [48]
Figure 5.1 (a) Overall domain of the problem (b) Discretized domain set of the problem with boundary conditions
Figure 5.2 Finite Element Method Procedure [34]
Figure 5.3 Cross sectional view of the sample geometry
Figure 5.4 (a) Time domain representation of the input tone burst (b) Frequency domain representation of the input tone burst
Figure 5.5 (a) Finite Element Mesh for the composite laminate
Figure 5.6 (a) Received response at 500 mm from the actuator (b) Filtered received response to isolate the fundamental wave modes
Figure 5.7 SFEM Procedure of solving PDE
Figure 5.8 Spectral Isotropic Rod Element [19]
Figure 5.9 (a) and (b): A0 and S0 response using the SFEM method (c) Overall raw waveform obtained from the COMSOL FEM simulation (d) Filtered waveform isolating the individual A0 and S0 reflections (filtering done using knowledge of group velocities of these two modes)
Figure 6.1 A0 waveforms for two different sets of material properties (Blue: rho = 1000 kg/m^3, E1 = 50 GPa, E2 = 5 GPa, v12 = 0.25, v23 = 0.25, G12 = 2 GPa; Orange: rho = 1750 kg/m^3, E1 = 111.53 GPa, E2 = 5 GPa, v12 = 0.25, v23 = 0.25, G12 = 2 GPa)
Figure 6.2 Sensitivity of density (rho) on A0 and S0 wave velocities
Figure 6.3 (b) S0 wave velocities
Figure 6.4 v12 and v23 on A0 and S0 wave velocities (b) Sensitivity of Shear Modulus G12 on A0 and S0 wave velocities
Figure 6.5 Uniqueness check for a dataset that had 512 samples originally but trimmed down to 88 after the check
Figure 7.1 Group Velocity Dispersion Curve for a random composite with three modes (A0, SH0 and S0) highlighted
Figure 8.1 Research Flow for the two Deep Learning Models (1DCNN and LSTM)
Figure 8.2 Loss curve: MSE vs number of epochs for 1DCNN
Figure 8.3 MAE curve: MAE vs number of epochs for 1DCNN
Figure 8.4 R^2 coefficient curve: R^2 coefficient vs number of epochs for 1DCNN
Figure 8.5 Prediction Results: True vs Predicted Plot for 1DCNN
Figure 8.6 (a) Original Signal (b) Original signal corrupted with a noise of SNR5
Figure 8.7 Prediction results on data with noise of SNR20 for 1DCNN
Figure 8.8 Prediction results on data with noise of SNR10 for 1DCNN
Figure 8.9 Prediction results on data with noise of SNR5 for 1DCNN
Figure 8.10 NOWP for the predictions on noisy and noiseless data for 1DCNN
Figure 8.11 Loss curve: MSE vs number of epochs for LSTM
Figure 8.12 MAE curve: MAE vs number of epochs for LSTM
Figure 8.13 R^2 coefficient curve: R^2 coefficient vs number of epochs for LSTM
Figure 8.14 Prediction Results: True vs Predicted Plot for LSTM
Figure 8.15 Prediction results on data with noise of SNR20 for LSTM
Figure 8.16 Prediction results on data with noise of SNR10 for LSTM
Figure 8.17 Prediction results on data with noise of SNR5 for LSTM
Figure 8.18 NOWP for the predictions on noisy and noiseless data for LSTM network
Figure 8.19 NOWP for the two deep architectures in the presence and absence of noise in the signal
Figure 8.20 Training time per epoch for the two deep architectures (1DCNN and LSTM)
Figure 8.21 Research Flow for the ML based approach
Figure 8.22 Loss curve: MSE vs Epochs for the DNN
Figure 8.23 MAE curve: MAE vs Epochs for the DNN
Figure 8.24 R^2 coefficient curve: R^2 coefficient vs Epochs for the DNN
Figure 8.25 Prediction results for the DNN
Figure 8.26 NOWP using DNN
Figure 8.27 Training time per epoch using DNN

CHAPTER 1 INTRODUCTION

The idea of structural design has undergone a radical change.
Due to the continuous technological advances achieved by mankind, lighter and sleeker structures have replaced the conventional bulky and heavy structures. But newer solutions always bring with them a newer set of problems that require solving. The new age structures have introduced severe constraints on the design methodologies currently in practice [1], which calls for newer techniques to monitor and assess the integrity of structures. Structural Health Monitoring (SHM) is one such technology, which can provide vital information on the state of the structure at any given time. Combined with the powerful computational and signal processing tools available, SHM can not only precisely determine the health of any structure, and classify and characterize defects that affect its structural integrity, but also predict the future performance of the structure over time.

1.1. Overview on Structural Health Monitoring

Structural Health Monitoring is the process of evaluating the health state of a structure and predicting its remaining life. This process often involves the constant observation of a system over time using periodically sampled response measurements from a sensor-actuator system, the extraction of healthy and damage-sensitive features from these measurements, and the statistical analysis of these features to establish the state of the system's health and then predict the remaining life of the structure. All structures, such as bridges, aircraft and pipelines, have a finite lifetime and begin to deteriorate when put into service. Due to a combination of environmental, material and other effects, processes such as corrosion, fatigue, erosion, overloads and general wear and tear degrade them until they are no longer useful. In the worst cases, they can lead to fatal damage that can endanger people, their livelihoods and the environment in general.
It is therefore essential to examine structures of importance periodically and determine whether or not remedial action is needed. SHM serves as an early warning system and helps in resolving problems with the health of the structure before they can progress to cause potential damage.

Structural design has undergone many changes over the years, with stringent restrictions placed on design parameters to produce the most efficient structures having superior structural integrity. Such structures are generally geometrically optimized to guarantee their ability to sustain high design loads. However, they are more susceptible to small damages such as horizontal, vertical or inclined cracks and corrosion in metallic structures, and delamination and fibre breakage in the case of composites. These defects severely affect the structural health, and therefore need consistent monitoring. SHM can be viewed as a generalized toolbox that has the objective of providing the necessary techniques for the constant or periodic monitoring of structures. These techniques are specifically designed for the various materials used in critical structures such as buildings (concrete), bridges (metal) and aircraft (composites/metals). SHM therefore has innumerable applications in varied disciplines including aerospace, mechanical and civil engineering [2][3]. SHM potentially offers increased safety, since faults can be detected before they grow to a dangerous level, and avoids the vagaries of human behaviour. The benefits of SHM can be mainly categorized as:
• Using structures to their optimal best, with minimized downtime and avoidance of fatal failures.
• Giving designers feedback to improve their products.
• Minimizing human involvement, and therefore possible human errors. This directly improves safety and reliability.

1.1.1. Basic Elements of SHM Systems

A typical system contains both hardware and software elements.
The hardware elements are basically the sensor-actuator setup and its associated instrumentation, while the software components can vary but generally comprise damage modelling and damage characterization algorithms. Sensors may be active or passive. Passive sensors like strain gauges only sense (receive), while active sensors both transmit and receive. Commonly used sensors include Polyvinylidene Fluoride (PVDF) sensors, piezoceramic sensors made from Lead Zirconate Titanate (commonly known as PZT sensors) and fibre optic sensors. The responses obtained vary from sensor to sensor, but typically they are all time histories of a certain variable. It is important to note that SHM is a time dependent process, and any sensor should be able to monitor a parameter or variable over time. The most common response received from a PZT or PVDF sensor is the voltage history. These responses are post-processed to extract healthy and damage features that effectively characterize the health state of the structure.

SHM comprises two main components, Diagnosis and Prognosis. Diagnosis aims to provide, at every instant of time, a procedure to determine the health of the structure, while Prognosis involves computing the severity of the defects detected during diagnosis in terms of fracture mechanics parameters and deriving the structure's residual life. Diagnosis normally gives information about the onset of damage such as cracks, its location and its geometric parameters. Considering only the diagnosis component, SHM can be described as a new and improved way of performing Nondestructive Evaluation (NDE). By integrating sensors, smart materials, computational modelling, and data transmission and processing ability, it extends the traditional NDE approach to reconsider the structure's design and its lifetime management. An alternate view of SHM is that of a discipline combining the following four subjects [3] (see Figure 1.1).
In general, an SHM system encompasses the following components:
• Test structure/simulation model
• Sensors
• Data acquisition systems
• Signal processing algorithms
• Damage modelling and classification algorithms
• Data transfer, handling, management and storage mechanism

Figure 1.1 Elements of Structural Health Monitoring

if not for powerful and robust post processing algorithms that convert raw data to meaningful information. Common modelling techniques employ the Finite Element Method (FEM), which is very adept at modelling complex geometries. FEM requires the use of very fine meshes to detect and characterize very small defects, which results in very high computation time and cost. An example of using FEM for SHM can be found in [4]. Therefore, for characterizing smaller defects more efficiently, a suitable mathematical model should be based on the physics of wave propagation, and one of the most well exploited models is the Spectral Finite Element Method (SFEM) [5].

Modelling has mainly two components, namely flaw modelling and damage detection algorithms. In metallic structures, some of the most common types of defects are caused by pitting corrosion. Modelling pitting corrosion is relatively easy, and has been extensively studied in [6][7][8]. Another common type of defect in metallic structures is the horizontal or vertical crack, which is sometimes through-thickness. Damage modelling in composites, by contrast, is a challenging task due to the many different types of failure modes. Some of the commonly occurring modes are delamination, fibre breakage, matrix cracks and debonds. In addition, composites are prone to moisture absorption due to their highly porous nature, which results from unavoidable errors during manufacturing. Hence, one needs to be able to come up with simplified mathematical models to describe the various types of flaws that commonly occur in critical structures. The second aspect of modelling is in devising robust damage characterization algorithms.
These algorithms should be able to easily detect defects, distinguish healthy from faulty samples, and clearly extract the damage features from an SHM response. The features should directly or indirectly give information about the state of health of the structure. There is always a lot of noise present when conducting field experiments, and these algorithms should work well even in the presence of noise.

1.1.2. Levels of Structural Health Monitoring

SHM can be thought of as a system identification problem. Through diagnosis, one can get information about any defect or anomaly present and its characteristics, and prognosis uses the information obtained from diagnosis to determine the residual life of the structure. We can broadly divide SHM into five levels [2]:
Level 1: Detecting the damage, i.e. being able to distinguish healthy and faulty responses.
Level 2: Defect location and geometry.
Level 3: Severity of the damage.
Level 4: Damage control, i.e. the possibility of controlling or delaying the growth of damage.
Level 5: Determining the residual life.

The first four levels effectively constitute the diagnosis component of SHM, while Level 5 is effectively prognosis. Level 1 SHM is relatively easy to achieve, and it can be achieved by using passive sensors to monitor parameters such as strain energy, fundamental natural frequency, phase information and stiffness reduction over time. The most common method, though, is using natural frequencies. As damage reduces stiffness, it induces changes in the natural frequencies. Comparing the measured fundamental frequency with the baseline should confirm the presence of damage, and the shift acts as a reliable feature to isolate faulty samples. Level 2 SHM is relatively harder than Level 1, as it is necessary to determine the location and orientation of the flaw from the known input and the measured SHM response. A simple way is capturing the response at some known location.
This response will contain the reflected energy packet coming from the flaw; knowing the speed of the wave in the medium and the time of arrival of the reflected pulse, we can effectively locate the flaw. This is easier said than done, as the reflection from the flaw will mostly be very small in amplitude, and can easily be buried in noise or in other reflected wave packets. This makes the design of the detection algorithms challenging. An ideal algorithm would be one which works accurately without a baseline response. Once the damage characteristics are evaluated in Level 2, it is important to determine the severity of the damage. For example, if the damage is a crack, then we need to estimate the Stress Intensity Factor (SIF) or Strain Energy Release Rate (SERR). If these parameters reach a pre-defined threshold value, the cracks will grow. If the defect is found to be severe, Level 4 SHM deals with the immediate measures to be taken to arrest the growth of cracks. Level 5 SHM is closely related to Level 4 SHM, wherein the estimated fracture parameters are used to perform fatigue life analysis to determine the residual life. The analysis is mainly statistical in nature and generally encompasses novel data processing techniques.

It is to be noted that knowledge of the material properties of the structure under test is critical to achieving any level of SHM. For example, the most basic and probably easiest level, i.e. Level 1, involves detecting possible defects. This is normally done by monitoring the material properties of the structure periodically over time. This requires a user to know the following levels of SHM are all based on the results of the detection process in Level 1. Therefore, the material characterization of the structure is the fundamental and most important task that needs to be carried out in any SHM framework or technique.

1.2. Forward and Inverse Problems in NDE/SHM

Forward problems typically use known models of the system of interest along with a known input to establish the characteristics of the output response. Inverse problems, meanwhile, in most cases use an observed output along with a known input to estimate the properties of the model. Figure 1.2 below shows the block diagrams of a basic forward and inverse solver.

Figure 1.2 (a) Forward Process (b) Inverse Process

In NDE, and more generally in SHM, physics based mathematical models are utilized to describe systems of interest. Some of the common methods used are the FEM and SFEM. Waves excited by different actuating mechanisms act as the inputs and, as described before, different types of responses such as voltage histories, displacement histories and strain rates are the outputs of a typical NDE system. ly solvable in this field, because it is very hard to know the exact physical properties of the system or model; hence one has to reverse engineer using the output response or measurement to gain information about the system in general. In SHM particularly, known inputs and observed outputs need to be used to establish the physical and material state of the critical structure in question. This essentially is the inverse process, and is a fundamental part of any SHM/NDE framework.

A problem should be well posed in order to be solved. In general, one can describe three conditions for an inverse problem to be well posed [9]: a solution exists, it is unique, and it is stable. The systems used in SHM typically have distinctive physical and material properties that can be mathematically well defined, and this is necessary to ensure that the problem in hand has a realistic solution. As is the case in solving any engineering challenge, the solution needs to be unique.
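The distinction between the forward and inverse problems, and the role of the uniqueness condition, can be illustrated with a deliberately simple sketch. It assumes the textbook thin-rod relation c = sqrt(E/rho) for longitudinal waves, which is far simpler than the guided wave models used in this thesis. The function names and the aluminium-like values are hypothetical and chosen only for illustration.

```python
import math


def forward_rod_speed(youngs_modulus_pa, density_kg_m3):
    """Forward problem: known model parameters (E, rho) -> predicted wave speed.

    Assumes the thin-rod relation c = sqrt(E / rho); this is an illustrative
    stand-in, not the FEM/SFEM forward model of the thesis.
    """
    return math.sqrt(youngs_modulus_pa / density_kg_m3)


def inverse_youngs_modulus(measured_speed_m_s, density_kg_m3):
    """Inverse problem: observed speed plus known density -> estimate of E."""
    return density_kg_m3 * measured_speed_m_s ** 2


# Hypothetical aluminium-like values, used only for illustration.
true_modulus = 70e9   # Pa
true_density = 2700.0  # kg/m^3

# Forward step: predict the response from the known model.
speed = forward_rod_speed(true_modulus, true_density)

# Inverse step: recover the modulus from the "measured" speed.
recovered_modulus = inverse_youngs_modulus(speed, true_density)

# Non-uniqueness: scaling E and rho by the same factor gives the same speed,
# so the speed alone cannot identify both properties.
speed_scaled = forward_rod_speed(2.0 * true_modulus, 2.0 * true_density)

print(f"forward speed      : {speed:.1f} m/s")
print(f"recovered modulus  : {recovered_modulus:.3e} Pa")
print(f"speed for scaled E, rho: {speed_scaled:.1f} m/s")
```

In this toy case the inverse step recovers E only because the density is assumed known. Since the speed depends on the ratio E/rho alone, a speed measurement by itself cannot separate the two properties, which is the kind of non-uniqueness that the well-posedness conditions guard against.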
For example, in Figure 1.3, when Ultrasonic Guided Waves (UGW) are used to study pitting corrosion in long pipelines, it is clearly visible that the responses for a healthy pipe and a faulty pipe with a very small corrosion pit are not very different. The differences are minimal, and a defect response can easily be misclassified as a healthy one. This means that in order to obtain a unique and accurate representation of the system, one needs to collect more data from which more informative features can be extracted. Also, the solution must be stable for small deviations in the measured data, or in the presence of noise. In other words, the solution must be robust and should incorporate factors that affect systems in practical situations. But by incorporating more information or measurements, one risks the stability of the system, as more data brings in more stochastic variations.

Figure 1.3 Ultrasonic Guided Wave responses for a healthy pipeline vs a faulty pipeline with a single 1mm deep corrosion pit

Damage detection and characterization is a complex procedure, and it is impossible to solve these kinds of inverse problems independent of the forward process. Typically, materials characterization forms the crux of almost all damage characterization techniques. Knowledge of the material properties of the structure before being put into service and during its service period is critical for condition monitoring. Therefore, such problems generally involve both forward predictors and inverse detectors. SHM techniques normally combine the forward solver and the inverse solver to establish robust models, where both the forward predictor and inverse detector models maximize their sensitivity to different features of the signals, and at the same time minimize their sensitivity to confounding factors caused by variations in the mechanical, physical and material properties of the system.
It can be effectively described as a symbiotic relationship, where information from the forward process is used to improve the models established in the inverse process, and these improved models are then used in the forward process to obtain better measurements. Figure 1.4 shows the interrelationship between the forward and inverse processes, and how it can be used to establish efficient models.

Figure 1.4 Interdependence of the forward and inverse process

1.3. SHM for Composites

Composites are materials made by combining two or more natural or artificial elements with different physical and chemical properties, resulting in a material that is mechanically superior to its constituent materials. Generally, fibre reinforced polymer composites consist of a polymer matrix reinforced with a man-made or natural fibre. Compared with traditional metallic materials, the main advantages of composites are: a) low density and high specific strength and stiffness; b) good vibration damping ability, long fatigue life and high wear, creep, corrosion and temperature resistance; c) strong tailoring ability in both microstructure and properties, which makes them design efficient for different application needs; d) since detail accessories can be combined into a single cured assembly, the number of required fasteners and the amount of assembly labour can be significantly reduced [12][13]. The most common types of damage in composites are fibre breakage, matrix cracking, fibre-matrix debonding and delamination between plies, most of which occur through the thickness and are barely visible. They can severely degrade the performance of a structure and can cause fatal damage at worst. Due to the above advantages, composites are used in many critical structures across different industries such as aviation and automobile, and therefore require constant periodic monitoring to guarantee optimum performance.
Figure 1.5 below shows a typical setup to perform active SHM of composites.

Figure 1.5 Active SHM Setup of a CFRP Composite Plate

1.4. Machine Learning in SHM

Machine Learning (ML) is considered to be the natural evolution of statistical learning. It is one of the dominant subsets of Artificial Intelligence (AI). To summarize in a single sentence, ML is the science of developing statistical models that learn with time and can perform a task of interest without human intervention. Most of the current SHM frameworks aim to perform fast in-situ testing. NDE has progressively required a higher level of automation to handle the huge amount of data generated. This in turn means quick processing of the signals obtained after an experiment. Conventional data analysis techniques require expertise and involve manual labour, but with the advent of better computational facilities like GPUs, powerful machine learning based classification and characterization algorithms can be developed. The complexity involved in defect and materials characterization algorithms has been described before. This is mainly due to the measured output response space. Neural Networks are also known as universal function approximators [10][11][26][27]. Quite simply, they help in mapping the input feature space to the output response space. In general, SHM is an inverse problem that is data intensive and does not have a unique solution. Hence, ML plays a very important role in mapping output signals to input features to discover important details of the structure. Overall, the benefits of Machine Learning in SHM can be described as:
Automation of the SHM framework, i.e. it performs various levels of SHM without any human intervention.
Improved robustness of the algorithms. These algorithms are powerful and work very well even in the presence of noise.
Reliability.
Once trained well, the networks can sustain and perform the described process multiple times with the same accuracy.
Mapping of input to output space, thereby gaining vital information about the system parameters.
Efficient and fast post processing of the observed outputs.

1.5. Material Property Identification of Composites

In any Structural Health Monitoring framework, knowledge of the material properties of the structure under inspection is fundamental. This knowledge is required before the structure is put into service. Because of the inherent heterogeneity of composites, coupled with the large uncertainties expected in the manufacturing process, it becomes extremely difficult to predict the material properties of the final part from those of the individual constituents. Once in service, it is also essential to compute the material properties of the structure periodically to capture possible material degradation. Moreover, by monitoring the material properties over time, potential defects can be identified from the changes in those properties. Material degradation can cause a structure to fail if the strength falls below a certain threshold. Having efficient material property identification schemes can also help fine tune the manufacturing process to reduce the amount of error that normally occurs. Therefore, identification of material properties is required not only for materials characterization but also for in-situ monitoring. The ideal requirements for any material property identification scheme are that the method is non-destructive in nature and can be deployed for real time online in-situ monitoring. The algorithm or framework should also be able to generalize well to any composite. Any framework involving guided waves requires a frequency selection procedure.
Guided waves are multi-modal waves, and multiple modes often coexist at relatively higher frequencies. Therefore, dispersion curves are normally used to study the dispersion of guided waves in any material. Based on these curves, an optimal frequency is selected for the problem at hand. The dispersion curves, which are normally phase or group velocity plotted as a function of frequency, depend on the material properties of the composite. Hence, uncertainties or differences in the material properties of the composite can lead to inaccurate dispersion curves, and can therefore result in selecting a frequency that is not best suited for the problem at hand. For this reason, it is important to have an automated material property identification scheme that works in real time and can potentially be used for in-situ monitoring.

1.6. Literature Review

Researchers have used both vibration-based and ultrasonic guided wave (UGW)-based techniques to estimate the material properties of a composite [14]. The vibration-based technique is global in nature and is sensitive to boundary conditions, which makes it unsuitable for in-situ condition monitoring [15]. Guided waves in the ultrasonic range are highly sensitive to composite laminate properties, which makes them useful for Non-Destructive Evaluation and Structural Health Monitoring applications [16][17][18]. Considerable research has been done on using Deep Learning for damage detection and damage characterization. Rautela et al. proposed different deep learning frameworks for damage detection in 1D composite waveguides [48]. They modelled cracks of various sizes using the Spectral Finite Element formulation, and used the responses obtained from this model to train different architectures. But the field of materials characterization using Deep Learning is relatively less explored.
Rather, the inverse problem of property identification is investigated using different inversion schemes that in some way or other use global optimization algorithms. Krishnan Balasubramaniam [7] has explored the feasibility of estimating the stiffness constants in a single layer unidirectional composite laminate, and the ply-up sequence of layered laminates. The problem is posed as a multi-parameter optimisation, and a genetic algorithm is used to find the optimal solution. Bernard Hosten et al. [8] have used the phase velocities of Lamb wave modes along with a Newton-Raphson scheme to estimate the elastic properties of composites by minimizing the Thomson/Haskell matrix. In this work, they have used air coupled transducers to generate Lamb waves that are sensitive to material properties. J Vishnuvardhan et al. [9] have utilized a genetic algorithm to measure the elastic properties of an orthotropic plate using ultrasonic velocity data. They have used slowness curves to verify the quality of the reconstruction. Ranting Cui et al. [2] have investigated a property inversion scheme based on matching phase velocity dispersion curves of relevant guided modes (A0, S0, SH0) using the Simulated Annealing optimization algorithm along with a Metropolis criterion. In this process, a Semi Analytical Finite Element method is formulated to solve the forward problem. This inversion scheme is used to identify the elastic properties of unidirectional, quasi-isotropic and anisotropic composite laminates. All the above mentioned inversion schemes are limited in terms of large scale automation, generalization ability, computational time, in-situ predictions and robustness towards noise. There is also a research gap when it comes to using Deep Learning algorithms for material property identification.
In this thesis, we aim to address this research gap by developing learning models within an overall framework that can identify and estimate material properties of composites.

1.7. Thesis Objectives

The scope of this thesis is to establish a comprehensive framework with different learning models and interlinking data analysis algorithms that can be utilized to characterize composite laminates. The propagation behaviour of the UGW in the structure is utilized to reverse engineer and estimate the material properties. Powerful Deep Learning algorithms like Convolutional Neural Networks (CNN) and Long Short Term Memory Networks (LSTM), along with multi-layer Dense Neural Networks (DNN), are utilized for this purpose. The performance of these models is evaluated on data both with and without noise of various levels to verify their robustness. We use slightly different methodologies to collect data and train the deep networks (1DCNN and LSTM) compared to those used for the DNNs. The approach involving the deep networks will be referred to as the DL based approach, while the approach involving DNNs will be referred to as the ML based approach from here on in this thesis. Deep Learning (DL) is indeed a subset of Machine Learning (ML), and the demarcation is made only to easily differentiate the two approaches. Once the framework is well established for complex structures like composites, it can potentially be extended to other materials used in critical structures, like metals, concrete etc. To summarize, the main objectives of this thesis can be defined based on the two approaches used, i.e. the DL based approach and the ML based approach. The main objectives in the DL based approach are as follows:
Develop (Forward Process): Develop numerical models (FEM and SFEM) to simulate the two fundamental modes of a UGW (A0 and S0) in a unidirectional multi-layered composite laminate.
Compare: Compare and comment on the two different numerical methods used.
Creating the dataset: Use the numerical models in the forward process to obtain the A0 and S0 responses for different sets of material properties representing different composites.
Dataset Check: Sensitivity analysis and uniqueness check.
Training (Inverse Process): Use the generated dataset, with A0 and S0 time histories as inputs and the material properties as the output labels, to perform supervised regression using different deep learning architectures (1DCNN and LSTM).
Prediction: Once trained, for any unseen A0 and S0 waveforms, the networks should be able to identify and estimate the material properties.
Robustness: Use a model trained on noiseless data and predict on datasets without and with noise of various levels.
The main objectives of the ML based approach are as follows:
Dataset (Forward Process): Use the Dispersion Calculator® software to generate group velocity dispersion curves for different sets of material properties representing different composites.
Training (Inverse Process): Use the generated dataset, with group velocity curves for A0, S0 and the Shear Horizontal mode SH0 as inputs and the material properties as the output labels, to perform supervised regression using a simple multi-layer Dense Neural Network.

Figure 1.6 Overall Workflow of the Composite Property Estimation Framework

Figure 1.6 shows a flowchart describing the overall workflow of the composite material property identification framework. The organization of the thesis is as follows. Chapter 2 gives a basic introduction to composites and the Classical Laminate Theory (CLT). Chapter 3 then discusses guided waves in general before looking at specific cases of guided wave propagation in composite laminates. Chapter 4 covers the basics of Machine Learning, and describes in detail the Convolutional Neural Network, Long Short Term Memory Networks and the Dense Neural Networks. Chapter 5 describes in detail the two modelling techniques used in this thesis, i.e.
the Finite Element Method and the Spectral Finite Element Method. Chapter 6 outlines the data analysis algorithms used in this thesis, while Chapter 7 mentions the study parameters for the two different methods used. The results are presented and discussed in Chapter 8, and the thesis is concluded in Chapter 9.

CHAPTER 2 CLASSICAL LAMINATE THEORY

Composites can be broadly classified into three different types: Fibrous Composites, Particulate Composites and Laminated Composites. For structural applications, only laminated composites are extensively used, and therefore the whole material property identification framework is built considering composite laminates. A composite laminate consists of many layers, which are commonly referred to as laminae or plies. They are stacked together in a particular order to form structures. The number of plies is decided based on the strength and stiffness requirements. Typically, each lamina consists of fibres oriented in the direction where maximum strength is required. The fibres are bonded by a matrix material which is mechanically inferior to the fibres. The laminate derives all its strength from the fibres. Some of the commonly used fibres are Carbon Fibre, Kevlar and Glass Fibre. The most commonly used matrix material is Epoxy Resin. Laminated composites are assumed to have orthotropic properties at the lamina level, but at the laminate level they exhibit anisotropy. The anisotropic behaviour results in stiffness coupling, such as bending-axial-shear coupling in beams and plates and bending-axial-torsion coupling. As a result, property identification of these types of composites is complex compared to isotropic materials. The material properties of these composites can typically be determined at the lamina level (Micro Mechanics) or at the laminate level (Macro Mechanics). Section 2.1 below outlines the micro mechanical analysis, while Section 2.2 describes the macro mechanical analysis.

2.1.
Micro Mechanical Analysis

A 2D transversely isotropic lamina requires the determination of 6 properties, namely the elastic moduli in two coordinate directions ($E_1$, $E_2$), the Poisson's ratios ($\nu_{12}$, $\nu_{23}$), the shear modulus ($G_{12}$) and the density ($\rho$). These properties are determined using the properties of the fibre and the binding matrix. The lamina strength and the derived laminate strength are strongly influenced by the type of fibres and their orientation. Another important parameter that has a direct influence on the laminate strength is the volume fraction, which is the volume of fibre present with respect to the overall volume of the lamina. According to Jones [20], micro-mechanics is the study of composite material behaviour wherein the interaction of the constituent materials is examined in detail as a part of the definition of the behaviour of the heterogeneous composite material. The overall elastic moduli of a composite material are expressed in relation to those of the fibres and the matrix. The properties of a lamina can be mathematically expressed as shown in Equation (2.1).

(2.1)

Where $E$, [...] and $V$ denote the elastic modulus, [...] and the volume fraction respectively, while the subscripts $f$ and $m$ denote the fibre and the matrix respectively. The volume fractions of the fibre and the matrix are given below in Equations (2.2) and (2.3).

$V_f = \dfrac{vol_f}{vol_{com}}$ (2.2)

$V_m = 1 - V_f$ (2.3)

Where $vol_f$ and $vol_{com}$ are the volume of fibre and the volume of the whole composite laminate respectively. In order to determine the material properties of a lamina, some fundamental assumptions are required. The main one is that the fibre is mechanically superior to the matrix, and is therefore the main load bearing member. Also, the strains in the fibre and the matrix are assumed to be the same. In this work, a transversely isotropic laminate is considered for material property identification. In doing so, the analysis can be limited to a small representative volume. Such a volume is called a Representative Volume (RV).
A simple RV would be a fibre surrounded by a matrix, as shown in Figure 2.1.

Figure 2.1 (a) RV for longitudinal material property determination (b) RV for transverse material property determination

By applying a stress $\sigma_1$ along direction 1, the elastic modulus in direction 1 is determined in [19] to be:

$E_1 = E_f V_f + E_m V_m$ (2.4)

Where $E_f$ and $E_m$ are the elastic moduli of the fibre and matrix respectively. Equation (2.4) is commonly known as the Rule of Mixtures. Similarly, a stress $\sigma_2$ is applied along direction 2, and the elastic modulus in direction 2 is computed to be:

(2.5)

(2.6)

The shear modulus is expressed in Equation (2.7), while the lamina density is expressed in Equation (2.8).

(2.7)

(2.8)

Once the material properties have been determined at the lamina level, one can then proceed to perform a macro mechanical analysis to characterize the overall constitutive model of the laminate.

2.2. Macro Mechanical Analysis

Macro mechanical analysis involves the determination of the overall constitutive model, i.e. the overall stress-strain relations of the composite laminate consisting of individual laminae stacked together. This is done using Classical Laminate Plate Theory (CLPT). Only the necessary equations are provided here. The theoretical details of these equations can be found in [19]. The main assumptions made in CLPT are the following:
1. [...] and the superposition principle are valid.
2. At the lamina level, the composite is homogeneous and orthotropic and has exactly two planes of symmetry, one along the direction of the fibre and the other perpendicular to it.
3. The state of stress at the lamina level is predominantly plane stress.

Figure 2.2 Co-ordinate system for a laminate

Looking at Figure 2.2, the principal axes are denoted as 1-2-3, where direction 1 is along the fibres and direction 2 is transverse to it. The lamina is assumed to be in a 3-D state of stress with six stress components.
For a transversely isotropic laminate, strain and stress are related by a compliance matrix as shown in Equation (2.9). All bold face variables denoted here are matrices.

$\boldsymbol{\varepsilon} = \mathbf{S}\,\boldsymbol{\sigma}$ (2.9)

Where $\boldsymbol{\varepsilon}$ is the strain developed in the composite due to the applied stress $\boldsymbol{\sigma}$. The compliance matrix $\mathbf{S}$ is given by [21]:

(2.10)

The stiffness matrix $\mathbf{Q}$ is expressed in Equation (2.11):

(2.11)

On solving, the entries of matrix $\mathbf{Q}$ are derived in terms of the individual lamina properties determined in the micro mechanical analysis explained in Section 2.1 [19] to be:

(2.12)

(2.13)

(2.14)

(2.15)

Through symmetry, the stiffness matrix $\mathbf{Q}$ can be reduced to a $3 \times 3$ matrix, which is expressed below in Equation (2.16).

(2.16)

The above relations are for a unidirectional composite with fibres along direction 1 at 0°. Typically, fibres of arbitrary orientations are used in structural applications. In most cases, the orientation of the global axes does not coincide with the principal axes of the composite. Hence a combination of translation ($\mathbf{T}$) and rotation ($\mathbf{R}$) operations is required to obtain the stiffness matrix of such composites. The reduced stiffness matrix is expressed in Equation (2.17), while Figure 2.3 shows the difference in the orientations of the principal axes and the global co-ordinate system.

(2.17)

Figure 2.3 Principal axes of the laminate and global x-y axes

The matrix $\bar{\mathbf{Q}}$ is fully populated. Hence, although the lamina is orthotropic in its own principal directions, in the transformed coordinates it exhibits completely anisotropic behaviour. The elements of $\bar{\mathbf{Q}}$ are given by

(2.18)

(2.19)

(2.20)

(2.21)

(2.22)

(2.23)

The reduced stiffness matrix can be fully expressed as shown in Equation (2.24).

(2.24)

The axial stiffness for the whole constitutive model is given in Equation (2.25), while the flexural (bending) stiffness is mathematically expressed in Equation (2.26). Equation (2.27) expresses the axial bending coupling i.e.
it relates the in-plane forces with mid-plane curvatures.

(2.25)

(2.26)

(2.27)

Where $z$ is the vertical position of the particular ply from the mid-plane of the laminate, as shown in Figure 2.4, which shows a typical 8 layered composite laminate. The constitutive equation of a composite laminate is given in Equation (2.28). It relates the forces $\mathbf{N}$ and the moments $\mathbf{M}$ with the strains $\boldsymbol{\varepsilon}$ and curvatures $\mathbf{k}$.

(2.28)

Figure 2.4 An 8 layered unidirectional composite laminate

CHAPTER 3 FUNDAMENTALS OF GUIDED WAVES

3.1. Introduction to Guided Waves

Guided waves are elastic waves whose energy is confined between the boundaries separating two media. Guided waves travel along the length of the sample. They are created by the interactions of bulk waves with the boundaries, which result in several mode changes. These interactions also create standing wave modes normal to the boundaries which propagate parallel to the boundaries. In essence, guided wave modes are a constructive interference pattern created by the interaction of many bulk waves with the boundaries. They can be broadly classified into three types: Rayleigh waves (Surface Acoustic Waves), Shear Horizontal (SH) waves and Lamb waves. Rayleigh waves are surface waves that can propagate in a half space, i.e. they can propagate when there is only a single boundary present, while SH waves and Lamb waves need two boundaries to propagate. Hence, many studies involve analysing the propagation of guided waves in plate like structures or pipelines. Guided waves can travel along surfaces, along interfaces or even throughout the volume of structures. This can be achieved by appropriately applying an incident pulse, and making sure the proper boundary conditions exist on the structures in question. Though the direction of propagation is always parallel to the boundaries, the particle motion can be longitudinal, transverse or elliptical depending on the mode that is generated.
Because of the boundary effects, which make the waves dispersive, the analysis of guided waves is much more complex than that of bulk waves. However, this complexity can be utilized to great effect in SHM. Compared to bulk waves, guided waves can travel long distances, and can therefore be used to propagate along long structures (pipelines), so that large areas can be monitored from a single location. Inspection using bulk waves is cumbersome and time consuming, and sometimes cannot handle very thin structures due to impedance issues. Guided waves can also be used to detect very small defects, depending on the frequency and the mode selection [22]. Figure 3.1 below roughly approximates the inspection areas of bulk wave testing and guided wave testing.

Figure 3.1 Bulk Wave Testing vs Guided Wave Testing

This section gives a general introduction to guided waves, and describes the analysis of the propagation of guided waves in isotropic plate like (metal) structures. It then discusses dispersion curves, before concluding with the analysis of guided waves in a 1D laminated composite. Since this is a well researched area that is also well documented in the literature, only the important equations relevant to this thesis are given. The in depth derivations can be found in the references cited.

3.2. Lamb Waves

Lamb waves are elastic stress waves that propagate in a plate bounded by two traction-free surfaces. They are also commonly known as plate guided waves. The governing equation of Lamb waves can be approximated by that of elastic wave propagation in doubly bounded media. Navier's equation [23] can be used to study Lamb waves in general in a three dimensional solid. Equation (3.1) describes it in a condensed form.

$(\lambda + \mu)\,\nabla(\nabla \cdot \mathbf{u}) + \mu\,\nabla^2 \mathbf{u} = \rho\,\ddot{\mathbf{u}}$ (3.1)

Where $\rho$ is the density of the material, $\lambda$ and $\mu$ are known as the Lamé constants, $\nabla$ is the del operator and $\nabla^2$ is the Laplacian. $\mathbf{u}$ is the particle displacement vector and is defined as $\mathbf{u} = \mathbf{i}\,u_x + \mathbf{j}\,u_y + \mathbf{k}\,u_z$.
The fundamental differential equations governing wave propagation are based upon three fundamental relationships from Linear Elasticity Theory. These include the strain-displacement relation shown in Equation (3.2), Newton's generalized equation of motion in Equation (3.3) and the constitutive stress-strain relations described previously in Chapter 2 and in Equation (3.4).

$\varepsilon_{kl} = \tfrac{1}{2}\left(u_{k,l} + u_{l,k}\right)$ (3.2)

$\sigma_{ij,j} = \rho\,\ddot{u}_i$ (3.3)

$\sigma_{ij} = c_{ijkl}\,\varepsilon_{kl}$ (3.4)

$\varepsilon_{kl}$ is the second order strain tensor, $c_{ijkl}$ is the fourth order stiffness tensor and $\sigma_{ij}$ is the second order stress tensor. Repeated indices indicate summation and commas indicate partial derivatives, where $i, j, k, l = 1, 2, 3$. Basically, the equation combines three equilibrium equations, six strain-displacement equations and six constitutive stress-strain relations. For an isotropic medium, the equation can be solved using the Helmholtz decomposition, by splitting it into two partial differential equations based on a scalar potential $\phi$ and a vector potential $\mathbf{H}$. The governing equation can then be solved to give separate relations for the longitudinal waves (P waves) and the transverse waves (S waves). In essence, a single propagating guided wave can be decoupled into its two fundamental components, the P wave and the S wave. Equations (3.5) and (3.6) describe the Helmholtz decomposition, where Equation (3.6) is known as the gauge invariance condition.

$\mathbf{u} = \nabla\phi + \nabla \times \mathbf{H}$ (3.5)

$\nabla \cdot \mathbf{H} = 0$ (3.6)

Substituting Equations (3.5) and (3.6) in Equation (3.1), we obtain two separate governing partial differential equations, which are given by:

$(\lambda + 2\mu)\,\nabla^2\phi - \rho\,\ddot{\phi} = 0$ (3.7)

$\mu\,\nabla^2\mathbf{H} - \rho\,\ddot{\mathbf{H}} = 0$ (3.8)

Rearranging Equations (3.7) and (3.8), and then expressing them in the rectangular co-ordinate system, we obtain the classic equations of wave propagation in a 3D homogeneous solid in terms of scalar and vector potentials.

$\dfrac{\partial^2\phi}{\partial x^2} + \dfrac{\partial^2\phi}{\partial y^2} + \dfrac{\partial^2\phi}{\partial z^2} = \dfrac{1}{c_L^2}\dfrac{\partial^2\phi}{\partial t^2}$ (3.9)

$\dfrac{\partial^2\mathbf{H}}{\partial x^2} + \dfrac{\partial^2\mathbf{H}}{\partial y^2} + \dfrac{\partial^2\mathbf{H}}{\partial z^2} = \dfrac{1}{c_S^2}\dfrac{\partial^2\mathbf{H}}{\partial t^2}$ (3.10)

Where $c_L$ is the longitudinal wave speed (or P wave speed) and is equal to $\sqrt{(\lambda + 2\mu)/\rho}$, and $c_S$ is the shear or transverse wave speed and is equal to $\sqrt{\mu/\rho}$. A complete guide to this derivation can be found in [17].
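As a quick numerical check of the wave speed expressions above, the following is a minimal sketch that computes $c_L$ and $c_S$ from the Lamé constants. The function names and the steel-like input values (E = 210 GPa, ν = 0.3, ρ = 7850 kg/m³) are illustrative assumptions and are not taken from this thesis; the conversion from Young's modulus and Poisson's ratio to the Lamé constants is the standard isotropic relation.

```python
import math

def lame_constants(E, nu):
    """Lame constants (lambda, mu) of an isotropic solid from E and nu."""
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu

def bulk_wave_speeds(lam, mu, rho):
    """Longitudinal (P) and shear (S) bulk wave speeds in m/s."""
    c_L = math.sqrt((lam + 2.0 * mu) / rho)
    c_S = math.sqrt(mu / rho)
    return c_L, c_S

# Assumed, illustrative steel-like properties (not from the thesis)
E, nu, rho = 210e9, 0.3, 7850.0   # Pa, dimensionless, kg/m^3
lam, mu = lame_constants(E, nu)
c_L, c_S = bulk_wave_speeds(lam, mu, rho)
print(f"c_L = {c_L:.0f} m/s, c_S = {c_S:.0f} m/s, ratio = {c_L / c_S:.3f}")
```

For an isotropic solid the ratio $c_L/c_S$ depends only on Poisson's ratio, so the longitudinal wave is always the faster of the two.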
To find the longitudinal mode solutions, classic separation of variables is performed using the expressions for the potentials presented in [24], where the continuity condition in the circumferential direction is applied. For torsional or transverse mode solutions, alternative solutions are presented in [25]. The last step is the application of boundary conditions, and the resulting eigenvalues can be used to trace dispersion curves in general. For any solid that is not 1D, the total displacement vector can be expressed as shown in Equation (3.11).

$\mathbf{u} = \mathbf{u}_L + \mathbf{u}_S + \mathbf{u}_{SH}$ (3.11)

Where $\mathbf{u}_L$ is the displacement due to the longitudinal or pressure wave (P wave). The particle motion is parallel to the direction of wave propagation in this case. $\mathbf{u}_S$ and $\mathbf{u}_{SH}$ are the displacements due to the transverse (S wave) and the Shear Horizontal modes, where the particle motion is normal to the direction of wave propagation. The interaction of shear and pressure waves in thin plates and shells gives rise to guided waves. In principle, there exist infinitely many guided wave modes. The two fundamental modes are the Symmetric (S0) mode and the Anti-Symmetric (A0) mode. In the low frequency range, where the wavelength is greater than the plate thickness, these modes are called the extensional mode and the flexural mode respectively. The names are derived from the particle motion and the elastic properties that govern the propagation of these waves. These characteristics change as frequency increases. For simplicity, we refer to these modes as A0 and S0 in this work. These modes are particularly important as (a) they exist at all frequencies and (b) in most cases they carry more energy than the higher order modes. Point (a) is particularly important, as they are the only modes to exist at lower frequencies, up to a certain cut-off frequency.
Therefore, these modes can be exploited for various applications in SHM. If higher modes are used, multiple reflections from different wave modes will bury the defect signature, thereby making the analysis of data more cumbersome and error prone. In the following section, we briefly show the symmetric and anti-symmetric solutions for Lamb wave propagation in two dimensional plates of infinite length.

3.3. Lamb Wave Propagation in 2D Plates

Let us consider an infinite 2D plate as shown in Figure 3.2. For simplicity of analysis, it is possible to assume that the wave potentials are invariant in the z-direction along the wave front [23]. The direction of propagation of the wave is considered to be the x direction. The plate here is considered to be isotropic in nature, as only then can the decomposition principle be applied to decouple the governing equation into its constituent longitudinal and transverse wave components.

Figure 3.2 2D infinite plate of thickness 2d

In the above case, since the wave potentials are assumed to be invariant along the z direction, $\partial/\partial z = 0$ and $\mathbf{u}_{SH}$ has only the shear displacement component $u_z$. The displacement vectors for the longitudinal and transverse modes, $\mathbf{u}_L$ and $\mathbf{u}_S$, have both $u_x$ and $u_y$ components. These displacements depend only on the scalar potential $\phi$ and the z-component of the vector potential, $H_z$. Therefore the equations can be rewritten as:

(3.12)

(3.13)

By substituting simple notations for $\phi$ and $H_z$, and assuming plane wave solutions, we can get a classic formulation for Lamb wave propagation in the potential form:

(3.14)

(3.15)

Assuming a harmonic solution, the above equations reduce to:

(3.16)

(3.17)

In Equations (3.16) and (3.17), $k = \omega/c$ is defined as the wavenumber.
To simplify the analysis further and express the equations in a more compact form, the following substitutions are made:

(3.18)

(3.19)

Therefore, Equations (3.16) and (3.17) can be expressed as:

(3.20)

(3.21)

A general solution for the above governing equations can be expressed as shown below:

(3.22)

(3.23)

Where $A_1$, $A_2$, $B_1$ and $B_2$ can be established by applying the appropriate boundary conditions. Applying symmetric or antisymmetric boundary conditions gives rise to the symmetric and antisymmetric mode solutions respectively. Equations (3.24) and (3.25) show the symmetric and antisymmetric mode solutions respectively. An in depth derivation can be found in Appendix A and B, based on [25].

(3.24)

(3.25)

Where $p$ and $q$ are described in Equations (3.18) and (3.19) respectively. Using the above relations, one can trace dispersion curves.

3.4. Dispersion Principles

Dispersion in general signifies a changing relationship between velocity and frequency. Bulk waves are non-dispersive, i.e. the velocity of the wave does not change with frequency. The concepts of phase and group velocity are critical to explaining dispersion. A wave packet typically contains multiple sine waves of different frequency content packed within. The velocity of the individual sine waves that make up the envelope or wave mode is the phase velocity, while the velocity of the whole packet itself is the group velocity.

(3.26)

Since both $p$ and $q$ depend on $\omega$, the phase and group velocities should be evaluated numerically at each frequency step as a multiple of plate thickness. Figures 3.3 and 3.4 below show the phase and group velocity plots of a 1 mm steel plate respectively. It follows that the solution is not unique, and at higher frequencies, multiple higher order Symmetric and Anti-Symmetric modes exist.
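The relation between the two velocities can be illustrated numerically. The sketch below assumes only the standard definitions $c_p = \omega/k$ and $c_g = d\omega/dk$, and estimates the group velocity from a sampled phase velocity curve by finite differences; it does not reproduce the specific form of Equation (3.26). The function name, the constant `a` and the test curve $\omega = a k^2$ (the low-frequency behaviour of a flexural wave in elementary beam theory) are hypothetical choices made for illustration.

```python
import math

def group_velocity(omega, c_p):
    """Estimate c_g = d(omega)/dk from a sampled phase velocity curve.

    omega : angular frequencies in rad/s, strictly increasing
    c_p   : phase velocities in m/s at those frequencies
    """
    # Wavenumber from the definition of phase velocity, k = omega / c_p
    k = [w / c for w, c in zip(omega, c_p)]
    n = len(omega)
    c_g = []
    for i in range(n):
        # Central difference in the interior, one-sided at the two ends
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        c_g.append((omega[hi] - omega[lo]) / (k[hi] - k[lo]))
    return c_g

# Hypothetical dispersive curve omega = a * k**2, so c_p = sqrt(a * omega)
a = 2.0  # assumed constant in m^2/s, for illustration only
omega = [2.0 * math.pi * (50e3 + 225.0 * i) for i in range(2001)]  # 50 to 500 kHz
c_p = [math.sqrt(a * w) for w in omega]
c_g = group_velocity(omega, c_p)

mid = len(omega) // 2
print(f"c_g / c_p at mid band = {c_g[mid] / c_p[mid]:.4f}")
```

For this test curve the wave packet travels at close to twice the phase velocity, whereas for a non-dispersive wave (constant $c_p$) the same function returns $c_g = c_p$.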
Group velocity dispersion plots are more vital, for they determine how fast the wave front propagates, which helps us establish the Time of Flight (TOF). This information is extremely useful for damage detection and damage characterization in structures. The group velocities are also useful in establishing the material properties of a structure. Group velocity is derived from phase velocity, and the mathematical expression for it is given in Equation (3.26) [17]. Dispersion curves are critical in guided wave inspection. Depending on the wave mode that is suitable for testing, they help in finalizing an operating frequency based on the requirements. They also give an engineer the cut-off frequency up to which one can operate without involving higher order modes, which invariably complicate the structural analysis. The first step of any guided wave experiment or simulation is to plot the dispersion curves. Performing any guided wave setup without doing so would result in wasted effort, time and energy.

Figure 3.3 Phase Velocity Dispersion Plot for 1 mm thick Steel Plate¹

Figure 3.4 Group Velocity Dispersion Plot for 1 mm thick Steel Plate¹

¹ Dispersion plots are plotted with the help of the Dispersion Calculator (DC) software. Link: https://www.dlr.de/zlp/en/desktopdefault.aspx/tabid-14332/24874_read-61142/#/gallery/33485

Metals are mostly isotropic in nature. Though they contain a very small degree of anisotropy, it is neglected for most analyses in wave mechanics. The Helmholtz decomposition principle can be used to solve the governing wave equations only when the material is isotropic. By applying it to any solid that is not 1D, we can decouple the equation into its constituent longitudinal and transverse modes. Composites, however, are highly anisotropic in nature. There is no true P wave or true S wave.
Hence, anisotropy cannot be neglected in the analysis, and the Helmholtz principle can no longer be used to solve a system that has more than one dimension. The analysis of 2D composites is achieved using the Partial Wave Theory, which is explained in depth in [19]. Though this concept is critical to understand in SHM, the scope of this thesis is to establish a reliable framework to estimate and identify material properties. Therefore, we consider a 1D composite, and the next section briefly outlines guided wave propagation in a 1D laminated composite.

It is well known that higher order 1-D theories can reproduce, to a great extent, the dispersion relations obtained from 2-D Lamb wave theory [19][49]. For example, using the First Order Shear Deformation Theory (Timoshenko theory) and the higher order Mindlin-Herrmann theory [19], we can accurately recover the dispersion relations obtained from the 2-D Lamb wave equations. Hence, we pursue this approach in this thesis. A key point to note is that in a simple higher order 1D model, only one wave mode is present at a time, i.e. the axial and flexural modes are not coupled. The type of mode generated depends on the direction of application of the incident tone burst signal.

3.5. Guided Waves in 1D Laminated Composite

In this derivation, we consider a 1D composite waveguide that satisfies both elementary rod and beam theories. We begin by defining the displacement fields.

(3.27)

(3.28)

where $u_o$ is the axial displacement along the mid plane, w is the transverse displacement, and z is measured from the mid plane as shown in Figure 3.5.

Figure 3.5 Elementary Laminated Composite Waveguide with co-ordinate system

A constitutive 3D model for laminated composites was explained in Chapter 2 (Equations 2.18 - 2.23).
The analysis of 1D composites can be achieved if we consider the composite waveguide to be in a 1-D state of stress; the layer-wise constitutive law is then expressed in Equation (3.29).

(3.29)

where $\sigma_{xx}$ and $\varepsilon_{xx}$ are the stress and strain in the x direction. The expression for $\bar{Q}_{11}$ as a function of ply angle is shown below.

(3.30)

$Q_{ij}$ are the orthotropic elastic coefficients of the individual lamina, and their constitutive relations can be found in Chapter 2. The strain energy is defined in Equation (3.31).

(3.31)

The kinetic energy is then defined as,

(3.32)

where $\rho$ is the layer-wise density. From these energy expressions, the governing differential equations are obtained, and they can be expressed as

(3.33)

(3.34)

The corresponding boundary conditions are,

(3.35)

(3.36)

(3.37)

where $A_{11}$ is the axial stiffness, $B_{11}$ the axial-bending coupling stiffness and $D_{11}$ the bending stiffness. They are mathematically expressed in Equation (3.38).

(3.38)

where h is the depth of the beam and b is the layer width. A is the cross-sectional area, while $d^2u_o/dt^2$ and $d^2w/dt^2$ are the longitudinal and transverse accelerations. $N_x$ is the force acting in the axial direction, $V_x$ is the shear force and $M_x$ is the bending moment.

In our studies here, we primarily consider only symmetric laminates. Symmetric laminates are composite laminates where the layup above the mid plane is the mirror image of the layup below the mid plane. For a symmetric laminate, the axial-bending coupling stiffness $B_{11}$ is equal to 0. As a consequence, the axial and flexural modes become uncoupled. Therefore, for a symmetric laminate, we can rewrite Equations (3.33) to (3.37) as

(3.39)

(3.40)

The boundary conditions can be rewritten as,

(3.41)

(3.42)

(3.43)

From Equations (3.39) and (3.40), it is evident that the governing equations are uncoupled, i.e. the solution to Equation (3.39) gives rise to the axial mode, while the solution to Equation (3.40) gives rise to the flexural mode.
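The vanishing of $B_{11}$ for a symmetric layup can be checked numerically with a short sketch. It assumes the classical lamination-theory forms, namely $\bar{Q}_{11} = Q_{11}\cos^4\theta + 2(Q_{12} + 2Q_{66})\sin^2\theta\cos^2\theta + Q_{22}\sin^4\theta$ and $[A_{11}, B_{11}, D_{11}] = \int \bar{Q}_{11}\,[1, z, z^2]\, b\, dz$; the notation of Equations (3.30) and (3.38) in the thesis may differ. The function names, layups and material constants below are hypothetical.

```python
import numpy as np

def q11_bar(Q11, Q12, Q22, Q66, theta_deg):
    """Transformed axial stiffness of one ply (assumed classical lamination formula)."""
    m = np.cos(np.radians(theta_deg))
    n = np.sin(np.radians(theta_deg))
    return Q11 * m**4 + 2.0 * (Q12 + 2.0 * Q66) * m**2 * n**2 + Q22 * n**4

def beam_stiffness(ply_angles_deg, ply_thickness, width, Q):
    """Return (A11, B11, D11) by integrating Qbar11 * [1, z, z^2] * b over each ply."""
    n_plies = len(ply_angles_deg)
    h = n_plies * ply_thickness
    z = np.linspace(-h / 2.0, h / 2.0, n_plies + 1)   # ply interfaces, mid plane at z = 0
    A11 = B11 = D11 = 0.0
    for angle, z_lo, z_hi in zip(ply_angles_deg, z[:-1], z[1:]):
        qb = q11_bar(*Q, angle)
        A11 += qb * width * (z_hi - z_lo)
        B11 += qb * width * (z_hi**2 - z_lo**2) / 2.0
        D11 += qb * width * (z_hi**3 - z_lo**3) / 3.0
    return A11, B11, D11

# Illustrative (made-up) lamina constants in Pa: Q11, Q12, Q22, Q66
Q = (140e9, 3e9, 10e9, 5e9)
t_ply, b = 0.25e-3, 0.01                      # ply thickness and width in metres

A_sym, B_sym, D_sym = beam_stiffness([0, 45, 45, 0], t_ply, b, Q)   # symmetric layup
A_uns, B_uns, D_uns = beam_stiffness([0, 0, 45, 45], t_ply, b, Q)   # unsymmetric layup
```

With these inputs `B_sym` is zero up to round-off, while `B_uns` is not, which is the coupling term that ties the axial and flexural equations together for an unsymmetric laminate.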
An in-depth derivation of the solution based on the computation of wavenumbers is available in [19].

CHAPTER 4 OVERVIEW OF MACHINE LEARNING

Deep Neural Networks are an important subgroup of Machine Learning. Machine Learning (ML) is considered to be the natural evolution of statistical learning. It is one of the dominant subsets of Artificial Intelligence (AI). To summarize in a single sentence, ML is the science of developing statistical models that learn with time and can perform a function or task of interest without the need for human intervention. Previously, the field of AI relied heavily on hard-coded rules and pre-fixed algorithms. This required enormous computational resources. Also, the systems depended on programmed intelligence, and were not capable of learning on their own. ML, in contrast, relies on learning from real data instead of relying on hard-coded rules. This is done by building a model that best describes the patterns between the input and output data. In a nutshell, ML models can be seen as function approximators that fit the best possible function to a set of input and output vectors [26][27]. ML algorithms have the ability to predict the output even on unseen data. This greatly reduces the computational effort, while the accuracy keeps improving as more real data becomes available for training. There are different ML algorithms, such as Decision Trees, Support Vector Machines, Radial Basis Functions etc.

Deep Learning algorithms, meanwhile, are a subset of ML techniques that loosely mimic the human brain. A deep neural network contains multiple layers of nodes that resemble neurons in the brain, each with associated weights. The weights can be learned using many optimization algorithms to predict the desired output. Such networks are very flexible, and can learn complex and often nonlinear relationships. Though the amount of training data required is large, the range of tasks they can achieve is unparalleled, and they are pretty much driving the field of AI these days.
Figure 4.1 summarizes how these different techniques fit in. To summarize, Deep Learning techniques are one of the many different ML techniques, which are in turn part of the broader AI field.

Figure 4.1 Machine Learning and Deep Learning

In this chapter, we introduce the fundamentals of an Artificial Neural Network (ANN), including the basic theory, math and the algorithms behind it. Following this, more advanced deep learning models like Convolutional Neural Networks (CNN) are discussed in detail.

4.1 Artificial Neural Networks

An artificial neuron was initially formalized as the Threshold Logic Unit (TLU) by McCulloch and Pitts in 1943 [28]. For a single neuron shown in Figure 4.2, the output signal can be written mathematically as:

$$ y_j = f\left( \sum_i W_{ji}\, x_i + b \right) \tag{4.1} $$

where $x_i$ is the input to a neuron and $y_j$ is the output from the neuron. $W_{ji}$ is the associated weight matrix and b is the bias term, which is generally used to shift the decision boundary line, or any hyper-plane in multidimensional problems [29]. The function f is the activation function, which decides in which manner the neuron will be excited or fired for a particular input signal. As the name suggests, a Heaviside step function is used in the TLU. The function is expressed mathematically in Equation (4.2).

$$ f(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{4.2} $$

Figure 4.2 Pipeline of a Neuron

This activation is not used these days because its derivative is zero everywhere except at x = 0, where it is not defined, so gradient-based learning cannot make use of it. It must be noted that it is similar to how a biological neuron works. Another important point to note is that the weight matrix initially designed in the TLU does not have the capability to learn.

The TLU then evolved into the famous perceptron networks. Perceptron networks were similar to the TLU, but the major difference was that there was now a learning rule to adjust the weight matrix. The delta rule, which uses the vanilla gradient descent approach to minimize the error, is used as the learning rule.
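A single threshold neuron trained with an error-driven update of the form $\Delta w = \eta\,(t - y)\,x$ can be sketched as below. This is an illustrative toy, not code from the thesis: the logical AND data set, the function names and the convention that the step function returns 1 at zero are all assumptions.

```python
import numpy as np

def heaviside(z):
    """Step activation: 1 where z >= 0, else 0 (value at z = 0 is a convention)."""
    return np.where(z >= 0, 1.0, 0.0)

def train_perceptron(X, t, lr=1.0, epochs=20):
    """Adjust weights and bias sample by sample using the error (t - y)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x_i, t_i in zip(X, t):
            y_i = heaviside(np.dot(w, x_i) + b)   # forward pass of one neuron
            err = t_i - y_i                       # zero when the sample is classified correctly
            w += lr * err * x_i                   # weight update
            b += lr * err                         # bias update
    return w, b

# Logical AND: linearly separable, so the rule converges
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)

w, b = train_perceptron(X, t)
predictions = heaviside(X @ w + b)
```

A single neuron of this kind can only separate classes with a straight line (or hyper-plane), which is one motivation for stacking neurons into layers.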
Multiple neurons stacked together in layers are now commonly known as Multi-Layer Perceptron (MLP) networks or Artificial Neural Networks (ANN). Since each neuron is fully connected with the neurons of the next layer, they are also referred to as Fully Connected Networks (FCN) or Dense Neural Networks (DNN). Perceptron networks were first reported by Rosenblatt in 1958 [30].

4.2 Multi Layered Perceptron (MLP)

Figure 4.3 shows a typical MLP network with one input layer, one hidden layer and one output layer. The MLP is a feed forward network, where the flow of information is forward, while the flow of error information is backward. Each node i is connected to the nodes j in the previous and following layers with an associated weight $w_{ij}$. In layer k, a weighted sum of all the signals $x_j^{(k-1)}$ from the preceding layer k-1 is performed at each node i, giving the sum $z_i^{(k)}$ of the node. The sum is passed through a nonlinear activation function f, which gives the output of the node. Mathematically, it can be expressed as,

(4.3)

Figure 4.3 Three Layered MLP Network

There are different types of activation functions. Each one has its own advantages and disadvantages, which are not discussed in this thesis. The main nonlinear activation functions extensively used are the sigmoid, tanh, softmax and ReLU activation functions. The ReLU function, i.e. the Rectified Linear Unit, is computationally very efficient due to its simplicity. Equations (4.4) and (4.5) describe the ReLU function. The ReLU function introduces nonlinearity in the system, which greatly improves the performance in most cases. One disadvantage of ReLU is that when the inputs are zero or negative, the gradient of the function becomes zero; the affected neurons then receive no updates during backpropagation and stop learning. This is known as the Dying ReLU problem. It can be mitigated by introducing a small positive slope in the negative region, a variant known as the Leaky ReLU.
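The difference between the two activations and their gradients can be seen in a few lines. The sketch below uses the common definitions ReLU(x) = max(0, x) and Leaky ReLU(x) = x for x > 0, alpha * x otherwise; the slope `alpha = 0.01` is an assumed, commonly used default rather than a value taken from the thesis, and the gradient at exactly x = 0 is set by convention.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Zero gradient for x <= 0: these inputs pass no error signal back
    return (x > 0).astype(float)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    # Small but non-zero gradient for x <= 0 keeps the neuron trainable
    return np.where(x > 0, 1.0, alpha)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
relu_out, relu_g = relu(x), relu_grad(x)
leaky_out, leaky_g = leaky_relu(x), leaky_relu_grad(x)
```

For the negative inputs, `relu_g` is exactly zero while `leaky_g` equals `alpha`, which is the mechanism by which the leaky variant avoids dead neurons.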
(4.4)

(4.5)

Machine Learning can be broadly classified into three categories. Supervised Learning is where the model is task driven in order to predict or classify a desired output. Unsupervised Learning is data driven and includes clustering algorithms. The final type is Reinforcement Learning, where the model learns to react to an environment. In this work, we predominantly use Supervised Learning algorithms.

The core concept of Supervised Learning involves two passes. The first is the forward pass, where an input signal is propagated through multiple layers to obtain a predicted output. The second is the backward pass, where the error between the predicted and true outputs is propagated back up to the first hidden layer. Using the error information, the weights are adjusted based on some rules to move the predicted output close to the true output, i.e. to minimize the error. This is often achieved through the backpropagation algorithm. The beauty of the algorithm is that each weight is changed according to how much it contributes to the final error. Suppose we consider a loss function as defined in Equation (4.6), where n is the number of samples, and $y$ and $\hat{y}$ are the true and predicted outputs.

(4.6)

A point to note is that evaluating the loss using all samples at the end of every epoch is called batch training, in contrast to online training, where the weights are updated after every sample. An epoch is one pass in which all of the training vectors are used once to update the weights. By back propagating through the network, we can use the chain rule to obtain information on how the previous outputs influence the error. Assuming there are k hidden layers, for any particular neuron, we compute:

(4.7)

where $net_{ik}$ is the total value entering the activation function in Equation (4.7). By looking at the individual derivatives, it becomes clear how to calculate all of them from the final output layer back to the first input layer.
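The layer-by-layer use of the chain rule can be made concrete with a small network whose gradients are derived by hand and then checked against a finite-difference estimate. This is a sketch under stated assumptions: one tanh hidden layer, a linear output and a mean squared error loss are illustrative choices, and all names and sizes are hypothetical rather than taken from the thesis.

```python
import numpy as np

def forward(params, X):
    W1, b1, W2, b2 = params
    net1 = X @ W1 + b1          # total input entering the hidden activation
    a1 = np.tanh(net1)          # hidden layer output
    y_hat = a1 @ W2 + b2        # linear output layer
    return net1, a1, y_hat

def mse_loss(params, X, y):
    return np.mean((forward(params, X)[2] - y) ** 2)

def backprop(params, X, y):
    """Gradients of the MSE loss, computed from the output layer backwards."""
    W1, b1, W2, b2 = params
    net1, a1, y_hat = forward(params, X)
    d_yhat = 2.0 * (y_hat - y) / y.size        # dL/dy_hat
    dW2 = a1.T @ d_yhat                        # dL/dW2
    db2 = d_yhat.sum(axis=0)
    d_a1 = d_yhat @ W2.T                       # error passed back to the hidden layer
    d_net1 = d_a1 * (1.0 - a1 ** 2)            # multiply by tanh'(net1)
    dW1 = X.T @ d_net1                         # dL/dW1
    db1 = d_net1.sum(axis=0)
    return dW1, db1, dW2, db2

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.normal(size=(8, 1))
params = [rng.normal(size=(3, 4)), np.zeros(4), rng.normal(size=(4, 1)), np.zeros(1)]
dW1, db1, dW2, db2 = backprop(params, X, y)

# Central finite difference on one weight as an independent check
eps = 1e-6
W1_plus, W1_minus = params[0].copy(), params[0].copy()
W1_plus[0, 0] += eps
W1_minus[0, 0] -= eps
numeric = (mse_loss([W1_plus] + params[1:], X, y)
           - mse_loss([W1_minus] + params[1:], X, y)) / (2.0 * eps)
```

Agreement between `numeric` and `dW1[0, 0]` is a standard sanity check that the hand-derived chain rule terms are consistent with the loss.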
By repeating the chain rule process, we have all the partial derivatives needed to update the weights according to Equation (4.8). This is the vanilla version of backpropagation. Other versions have been developed which are more advanced and computationally more efficient. The overall process of driving the loss to a minimum by means of backpropagation is known as gradient descent, which is given by

(4.8)

Here, the scalar multiplying the gradient is called the learning rate, which regulates the rate of the learning process. Finding an optimum learning rate is not a well-posed problem with a general methodology; rather, a suitable learning rate is mostly established by trial and error. However, it is to be noted that establishing a suitable learning rate is fundamental to any ML study. Choosing too small a value leads to slow convergence and can leave the optimizer stuck prematurely in a poor minimum, while too large a value can cause oscillations around the minimum, which is commonly known as overshooting.

The main purpose of a neural network is to classify or predict on unseen data, after being trained on seen data. In order to do that, a dataset is broken down into three main subsets.

Training dataset: This is the dataset used to train the model, i.e. to fit all the weights to get the desired output.

Validation dataset: In order to verify how well the network is training, some part of the dataset is fed in at pre-fixed intervals and the validation loss is monitored. If the validation loss increases during training, the model has begun to overfit, and it is generally advisable to stop training.

Test dataset: Once training is completed, the updated weights are stored as a model. This model is used on a dataset of previously unseen data to predict or classify, depending on the task.

Typically, there is again no pre-set rule to determine the ratio in which these datasets are split from the main dataset.
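A random three-way split of a dataset can be sketched as follows. The 70/15/15 ratio, the function name `split_dataset` and the toy arrays are assumptions made for illustration only; as noted above, there is no pre-set rule for the ratio.

```python
import numpy as np

def split_dataset(X, y, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle once, then cut into disjoint training, validation and test subsets."""
    n = len(X)
    idx = np.random.default_rng(seed).permutation(n)
    n_train = int(round(train_frac * n))
    n_val = int(round(val_frac * n))
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]               # whatever remains is held out for testing
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])

# Toy data: 100 samples with 2 features each, labelled 0..99 so overlap is easy to detect
X = np.arange(200, dtype=float).reshape(100, 2)
y = np.arange(100)
(train_X, train_y), (val_X, val_y), (test_X, test_y) = split_dataset(X, y)
```

Shuffling before cutting avoids any ordering in the raw data leaking into one subset, and fixing the seed keeps the split reproducible between runs.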
Assigning too large a share of the data to training leaves too little for validation and testing, while a small training dataset will result in insufficient learning.

Another important concept to consider is the problem of overfitting and underfitting. If the model is complex with multiple layers and fits a curve that passes through all the training samples, the model generalizes poorly: it has learnt the training data so well that it fails to predict on any unobserved data. This is known as overfitting. The exact opposite occurs when the model is too simple: the model suffers from high bias, and no matter how many observations are fed into it, it always produces similar results. This is known as underfitting. A training loss far lower than the validation loss indicates overfitting, while high training and validation losses together indicate underfitting. Therefore, achieving the right trade-off is one of the most challenging tasks in ML.

Figure 4.4 Overfitting and Underfitting [31]

By nature, deep neural networks are complex models. Hence the problem of underfitting occurs much less often than that of overfitting, and there is a need to avoid overfitting when designing deep neural networks. This is typically achieved by several different approaches. The main methods use regularizers, which restrict large weights by adding a penalty term to the loss function. Another way to tackle overfitting is by adding dropout layers, which randomly drop some neurons in the hidden layers during training so that the network does not rely too heavily on any particular feature. This greatly helps in making sure the model does not just fit the training dataset. Another commonly exploited technique is data augmentation. This simply refers to distorting some features of the data to create more data that can be fed into the model to achieve better performance.
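One simple form of augmentation for signal data is to add noise at controlled levels. The sketch below is only one hypothetical choice of distortion: additive Gaussian noise at a prescribed signal-to-noise ratio (SNR) in dB. The function names, SNR levels and sine-wave test signals are assumptions and are not taken from the thesis.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Return the signal plus white Gaussian noise at the requested SNR (dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def augment(signals, snr_levels_db, seed=0):
    """Stack the clean signals with one noisy copy of each per SNR level."""
    rng = np.random.default_rng(seed)
    copies = [signals]
    for snr_db in snr_levels_db:
        copies.append(np.stack([add_noise(s, snr_db, rng) for s in signals]))
    return np.concatenate(copies, axis=0)

t = np.linspace(0.0, 1.0, 1000)
clean = np.stack([np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 9 * t)])
augmented = augment(clean, snr_levels_db=[30, 20, 10])   # 2 clean + 6 noisy signals
```

The targets associated with each clean signal would be repeated for its noisy copies, so the model sees several slightly different inputs that map to the same output.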
Deep Neural Networks are always data hungry, and the more data there is, the better the performance in most cases.

Typically, the MLP/DNNs presented in this section involve a time consuming feature selection process. The features are then fed to the neural network to perform an associated task. Deeper networks like Convolutional Neural Networks and Long Short Term Memory networks eliminate this step by incorporating the feature selection process within the network itself, i.e. one just has to feed in the raw signal, and the network picks out the features that best describe the input signals and completes the associated task.

Figure 4.5 Example of Data Augmentation [32]

4.3 Convolutional Neural Networks (CNN)

CNNs take biological inspiration from the visual cortex. The visual cortex has small regions of cells that are sensitive to specific regions of the visual field. This idea was expanded upon by a fascinating experiment by Hubel and Wiesel in 1962, where they showed that some individual neuronal cells in the brain responded (or fired) only in the presence of edges of a certain orientation [33]. A CNN is a network that performs the convolution operation (or rather, correlation) in at least one of its layers instead of the general matrix multiplication, and is typically trained on images. Some of the most salient features of a CNN are spatial down sampling, shared weights and local receptive fields. A typical CNN has convolutional layers, pooling layers and fully connected layers.

The convolutional layer is the first layer, and it is used to extract the features that best describe the input data. This is done by convolving the input image with convolutional filters, also called convolutional kernels. The resulting outputs are typically known as feature maps. Figure 4.6 shows how a convolutional layer works.

Figure 4.6 Working of a Convolutional Layer

Assuming an input image that is a 6 x 6 matrix, a convolutional filter of size 2 is applied to the input matrix.
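The sliding-window operation for a 6 x 6 input and a size-2 filter can be sketched in plain NumPy. This is an illustrative implementation, not the one used in the thesis: it computes the correlation form (no kernel flip) without padding, and the pixel values, kernel values and the `stride` argument (the step between successive window positions) are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel, stride=1):
    """Slide the kernel over the image; each output entry is the sum of
    element-wise products between the kernel and the patch it covers."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for m in range(out_h):
        for n in range(out_w):
            patch = image[m * stride:m * stride + kh, n * stride:n * stride + kw]
            out[m, n] = np.sum(patch * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)      # made-up 6 x 6 input
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])          # made-up 2 x 2 filter
feature_map = conv2d_valid(image, kernel, stride=1)   # 5 x 5 feature map
```

The same four kernel values are reused at every window position, so the layer has only four learnable weights regardless of the image size; with `stride=2` the feature map shrinks to 3 x 3.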
An element-wise multiplication of the filter with the patch of the input image it covers, followed by a summation, establishes the first entry, at (1, 1), of the feature matrix. The size of the output feature map depends on a quantity called the stride. The stride gives the spatial distance (both vertically and horizontally) between the central pixels of successive positions at which the convolution operation takes place. The feature map in Figure 4.6 is computed with a stride of 1. Mathematically, the convolution operation can be defined as:

$$ G[m,n] = (f * h)[m,n] = \sum_j \sum_k h[j,k]\, f[m-j,\, n-k] \tag{4.9} $$

The output G is the feature map produced by the convolution operation, while f is the input and h is the kernel/filter. m and n are the indices of the row and column of the output feature map.

A convolutional layer enforces the idea of weight sharing. If every pixel in the image is imagined to be a neuron in the first layer, and every pixel of the output feature map is treated as a neuron in the second layer, it is evident that in any single computation only a few neurons from the first (input) layer are connected to a neuron in the second (output) layer. As the kernel is moved across the image, different neurons are connected through the same kernel weights, which ensures shared weights. Shared weights reduce the number of parameters to be learnt, making the layer computationally more efficient.

Figure 4.7 Example of weight sharing in Convolutional Layers

The next building block of a CNN is the pooling layer. Pooling layers are used to sub-sample the feature maps spatially. They reduce the dimensions that we are working with and make the network computationally more efficient. Similar to convolutional layers, a kernel is strided over the input, and operations like max pooling or average pooling take place within it. A point to note is that pooling layers have no learnable parameters. Figure 4.8 below shows the working of a max pooling layer for a stride length of 2.

Figure 4.8 Working of a Max Pooling Layer (stride=2)

The final architectural idea involved in a CNN is the fully connected layer (FCN).
This layer is very similar to the MLP networks discussed before, and is basically a classifier/predictor model that uses the features extracted by the kernels in the previous layers. All the parameters of this layer are learnable, and it is generally the last layer of a CNN.

Convolutional Neural Networks are generally used on images, but a lot of recent research has gone into using one-dimensional CNNs (1D-CNN) for time series data. 1D-CNNs can extract inherent features from long time signals. 1D-CNNs work in the same way as their 2D counterparts, the only difference being that the operations are done in a single dimension. The choice of kernel size is very critical and can be treated as a hyperparameter.

4.4 Recurrent Neural Networks (RNN)

The traditional neural networks and convolutional neural networks all work with fixed input and output lengths. But that is often not the case in practical applications (for example, finding the number of vowels in a sentence). Problems like speech recognition and time series prediction or forecasting require a system to store and use context information. Moreover, if we consider the human brain, one of its most salient features is persistence, i.e. we remember something we learnt or saw days or weeks before. A Recurrent Neural Network (RNN) brings this persistence to a traditional neural network. Recurrent Neural Networks take the previous output, or in most cases the hidden state, as an input for the current computation [44]. The composite input at time t has some historical information about the happening at time T