DATA-DRIVEN MULTI-SCALE MODELING, ANALYSIS AND SIMULATION OF MATERIAL FAILURE By Eduardo Augusto Barros de Moraes A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Mechanical Engineering – Doctor of Philosophy Computational Mathematics, Science and Engineering – Dual Major 2022

ABSTRACT

DATA-DRIVEN MULTI-SCALE MODELING, ANALYSIS AND SIMULATION OF MATERIAL FAILURE By Eduardo Augusto Barros de Moraes

Material failure processes are inherently stochastic and anomalous, occurring across a wide span of length and time scales, from dislocation motion at the micro-scale, to the formation of micro-cracks, up to crack propagation and aging mechanisms at the macro-scale and cascading failure at the system level. Anomalies such as intermittent signals in Acoustic Emission experiments, power-law distribution of the energy spectrum, crackling noise, and dislocation avalanches, among other indicators, occur even in standard, ordered, crystalline materials. Modeling and simulation of failure must take into account parametric and model-form uncertainties that propagate across the scales, where seemingly unimportant material properties or loading conditions could cause catastrophic failure at the component level. The pursuit of a unified framework for quantitative and qualitative failure prediction that can bridge the multiple scales while incorporating the material's underlying stochastic processes remains a challenge, and requires a new modeling paradigm that combines robustness and simplicity. In this work, we propose a data-driven methodology for multi-scale, statistically consistent modeling of anomalous failure processes. At the micro-scale, the goal is to study the dynamics of dislocations, which play a vital role in plasticity and crack nucleation mechanisms and show anomalous features across different time and length-scales. We start by investigating the dislocation mobility properties at the nano-scale and propose a surrogate model for dislocation motion based on a Kinetic Monte Carlo method, where the dislocation motion is emulated as a random walk on a network following a Poisson process. The surrogate learns the rates of the corresponding Poisson process directly from high-fidelity Molecular Dynamics (MD) simulations. The surrogate is capable of efficiently obtaining uncertainty measures for the mobility parameter, which can then be propagated to more complex simulations at upper scales. At the meso-scale, the collective behavior of dislocation dynamics leads to avalanches, strain bursts, intermittent energy spikes, and nonlocal interactions. We develop a probabilistic model for dislocation motion constructed directly from trajectory data from Discrete Dislocation Dynamics (DDD). We obtain the corresponding Probability Density Function for the dislocation position, and propose a nonlocal transport model for the PDF. We use a bi-level Machine Learning framework to learn the parameters of the nonlocal operator and the coefficients of the PDF evolution equation, facilitating a continuum representation of the anomalous phenomena. At the macro-scale, parametric material uncertainties substantially affect the predictability of failure at the component level.
We develop an Uncertainty Quantification (UQ) and Sensitivity Analysis (SA) framework for propagation of parametric uncertainties in a stochastic phase-field model of damage and fatigue, and we use the Probabilistic Collocation Method (PCM) as a building block. A Global SA indicates the most influential parameters in solution uncertainty and shows that damage initiation is sensitive to parameters associated with classical free-energy potential definitions, providing another motivation to incorporate the heavy-tail processes observed at the meso-scale. We extend the framework and develop a Machine Learning (ML) framework for failure prediction with phase-field models of brittle materials. We combine a classification algorithm with a pattern recognition scheme using virtual nodes from the phase-field damage model to generate patterns of material softening at each time-step. The framework identifies the presence and location of cracks and is robust even under noisy data, whether from model, parametric, or experimental uncertainties.

Copyright by EDUARDO AUGUSTO BARROS DE MORAES 2022

I dedicate this work to my wife, Julia, and my children, Lucas and Olivia.

ACKNOWLEDGEMENTS

I would like to thank Dr. Mohsen Zayernouri, my PhD advisor, for the wonderful opportunity to join the FMATH group and engage in such interesting and productive research. I am grateful for all the discussions, exchange of ideas, and thorough help during this journey. I extend my gratitude to the Committee members, Dr. Shanker Balasubramaniam, Dr. Tony Gao, Dr. Sara Roccabianca, and Dr. Hui-Chia Yu, for the valuable feedback and support for the completion of this PhD project. I would like to thank the external collaborators who worked with me over the past five years and helped this project make substantial advances: Dr. Mark Meerschaert, Dr. Hadi Salehi, and Dr. Marta D’Elia. I would also like to thank all the current and former members of the FMATH group with whom I had the privilege to share this amazing experience of being a Spartan. Thank you Dr. Ehzan Kharazmi, Dr. Mehdi Samiee, Dr. Jorge Suzuki, Pegah Varghaei, Ali Akhvan-Safaei, Hadi Seyedi, and Dr. Yongtao Zhou for the collaborations, discussions, and help through the years. I would like to send a special thanks to my family and friends. To my parents, Ângela and Eduardo, and my sister Isabela, thank you for believing in me since the beginning and for giving me all the love and support that helped me reach this stage. To my in-laws Vicente and Eliane, and my sister-in-law Bruna, thank you for your support and the trust you had in me through all these years. This work would not have been possible without the incredible support of my wife, Julia. Thank you for staying by my side through the most challenging times of our lives, thank you for caring and providing all the support that our family needs, and for giving me the strength to keep improving and going after new challenges. To my children, Lucas and Olivia, thank you for all the patience you showed in the most difficult times, and for the kindness and pure joy you give to those around you every day. May this work and all our efforts be a door to a lifetime of the happiness you deserve. This work was supported by the Department of Mechanical Engineering and the Department of Computational Mathematics, Science and Engineering at Michigan State University. The high-performance computational resources were provided by the Institute for Cyber-Enabled Research (ICER) at Michigan State University.

TABLE OF CONTENTS LIST OF TABLES . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii LIST OF ALGORITHMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix CHAPTER 1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Failure as a Stochastic Anomalous Process . . . . . . . . . . . . . . . . . . . . . . 1 1.1.1 Fractography Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1.2 Acoustic Emission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Multi-scale Modeling of Material Failure . . . . . . . . . . . . . . . . . . . . . . . 5 1.2.1 Fundamentals of Dislocation Theory . . . . . . . . . . . . . . . . . . . . . 6 1.2.2 From Nano to Meso-scale . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.2.3 From Meso to Macro-scale . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.2.4 Nonlocal Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 1.2.5 Phase-field Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 1.2.6 Uncertainty Quantification . . . . . . . . . . . . . . . . . . . . . . . . . . 12 1.3 Towards a Predictive Multi-Scale Failure Model . . . . . . . . . . . . . . . . . . . 15 1.4 Outline of this Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 CHAPTER 2 ATOMISTIC-TO-MESO MULTI-SCALE DATA-DRIVEN GRAPH SUR- ROGATE MODELING OF DISLOCATION GLIDE . . . . . . . . . . . . . 23 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.2 Data-Driven Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 2.2.1 Molecular Dynamics Simulation of Edge Dislocation Glide . . . . . . . . . 26 2.2.2 Graph-theoretical Coarse-graining . . . . . . . . . . . . . . . . . . . . . . 28 2.2.3 Construction of the Random Walk . . . . . . . . . . . . . . . . . . . . . . 30 2.2.4 Empirical Computation of Rate Constants . . . . . . . . . . . . . . . . . . 33 2.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 2.3.1 Convergence of Rate Constant Estimation . . . . . . . . . . . . . . . . . . 35 2.3.1.1 Uncertainty quantification of rate estimation . . . . . . . . . . . 35 2.3.2 Dislocation Mobility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 2.3.2.1 Rate estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 2.3.2.2 Surrogate results . . . . . . . . . . . . . . . . . . . . . . . . . . 40 2.3.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 CHAPTER 3 DATA-DRIVEN LEARNING OF CONTINUUM NONLOCAL EQUA- TIONS FOR DISLOCATION DYNAMICS . . . . . . . . . . . . . . . . . . 44 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 3.2 Two-Dimensional Discrete Dislocation Dynamics . . . . . . . . . . . . . . . . . . 47 3.2.1 Representative Example: Single Crystal Under Creep . . . . . . . . . . . . 49 3.3 Data Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 viii 3.3.1 Obtaining Data of Shifted Positions . . . . . . . . . . . . . . . . . . . . . 53 3.3.2 Density Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 3.3.2.1 Kernel Density Estimation (KDE) . . . . . . . . . . . . . . . . . 55 3.3.2.2 Adaptive Kernel Density Estimation (AKDE) . . . . . . . . . . . 56 3.4 Nonlocal Transport Models . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . 58 3.5 Machine Learning of Nonlocal Kernels for Dislocation Dynamics . . . . . . . . . . 59 3.5.1 A Bi-level Machine Learning Framework . . . . . . . . . . . . . . . . . . 59 3.5.1.1 Level 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 3.5.1.2 Level 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.6 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.6.1 Method of Manufactured Solution . . . . . . . . . . . . . . . . . . . . . . 63 3.6.2 DDD-Driven Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 3.6.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 CHAPTER 4 AN INTEGRATED SENSITIVITY-UNCERTAINTY QUANTIFICATION FRAMEWORK FOR STOCHASTIC PHASE-FIELD MODELING OF MATERIAL DAMAGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 4.2 A Stochastic Damage and Fatigue Phase-Field Framework . . . . . . . . . . . . . . 75 4.2.1 Stochastic Discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 4.2.1.1 Probabilistic collocation method . . . . . . . . . . . . . . . . . . 79 4.3 Sensitivity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 4.3.1 Stochastic Sensitivity Analysis . . . . . . . . . . . . . . . . . . . . . . . . 82 4.3.2 Global Sensitivity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 84 4.3.3 Integrated Sensitivity-Uncertainty Quantification Framework . . . . . . . . 85 4.4 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 4.4.1 Single-Edge Notched Tensile Test . . . . . . . . . . . . . . . . . . . . . . 87 4.4.1.1 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 4.4.1.2 Uncertainty and sensitivity analyses . . . . . . . . . . . . . . . . 89 4.4.2 Tensile Test Specimen . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 CHAPTER 5 DATA-DRIVEN FAILURE PREDICTION IN BRITTLE MATERIALS: A PHASE-FIELD BASED MACHINE LEARNING FRAMEWORK . . . . 101 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 5.2 Damage and Fatigue Phase-Field Model . . . . . . . . . . . . . . . . . . . . . . . 103 5.2.1 Governing Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 5.2.2 Discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 5.3 Data Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 5.3.1 Time-Series Data Generation . . . . . . . . . . . . . . . . . . . . . . . . . 106 5.3.2 Label Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 5.3.2.1 Label definition according to load-displacement curve . . . . . . 108 5.3.2.2 Label definition according to damage threshold concept . . . . . 109 5.4 ML Algorithmic Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 5.5 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 5.5.1 Results with 𝑘-NN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 ix 5.5.1.1 Detection of the Presence of Failure . . . . . . . . . . . . . . . . 113 5.5.1.2 Detection of the Location/Pattern of Failure . . . . . . . . . . . . 116 5.5.2 Results with ANN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 5.5.2.1 Detection of the Presence of Failure . . . . . . . . . . 
. . . . . . 117 5.5.2.2 Detection of the Location/Pattern of Failure . . . . . . . . . . . 119 5.5.3 Discussion of Deterministic Results . . . . . . . . . . . . . . . . . . . . . 120 5.5.4 Uncertainty Quantification . . . . . . . . . . . . . . . . . . . . . . . . . . 121 5.5.4.1 Algorithmic randomness . . . . . . . . . . . . . . . . . . . . . . 121 5.5.4.2 Noisy data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 CHAPTER 6 SUMMARY AND FUTURE WORKS . . . . . . . . . . . . . . . . . . . . 124 6.1 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 6.2 Future Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 x LIST OF TABLES Table 2.1 True rates: 200 (forward) and 1 (backward), in units of 𝑠−1 . . . . . . . . . . . . . 36 Table 2.2 True rates: 100 (forward) and 100 (backward), in units of 𝑠−1 . . . . . . . . . . . 36 Table 2.3 Rate estimates from MD data for different values of shear stress, using Eq. (2.10), Eq. (2.13) and MLE fit. . . . . . . . . . . . . . . . . . . . . . . . . . 39 Table 3.1 Parameter and algorithm errors for the manufactured solution . . . . . . . . . . . 64 Table 3.2 Machine Learning results for two train-test split solutions. . . . . . . . . . . . . 66 Table 3.3 Machine Learning results for different initial guess combinations of 𝛼 and 𝛿. . . 67 Table 4.1 Expected value of stochastic parameters for single-edge notched tensile test. . . . 87 Table 4.2 Expected value of stochastic parameters for tensile test specimen. . . . . . . . . 94 Table 5.1 Parameters used in the representative cases. . . . . . . . . . . . . . . . . . . . . 107 Table 5.2 Illustration of label/class definition for detection of location/pattern of failure. . . 117 Table 5.3 Total classification accuracy mean and standard deviation from algorithmic randomness (%). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 xi LIST OF FIGURES Figure 1.1 Micrograph of a fractured surface highlighting the different phases of alu- minium and nickel, used in fractography analysis to compute the fractal dimension. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Figure 1.2 Acoustic emission signals from single crystal ice under creep. The energy bursts caused by the collective dislocation motion appear in the form of intermittent signals that scale as a power-law of exponent 𝜏 = 1.6, independent of the applied stress. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Figure 1.3 (a) Polycrystalline microstructure with different grain sizes, averages of ⟨𝑑⟩ = 0.26 𝑚𝑚 (top-left), 0.87 𝑚𝑚 (top-right), 1.92 𝑚𝑚 (bottom-left), and 5.02 𝑚𝑚 (bottom-right). (b) Distribution of avalanche size from dislocation motion in polycrystals for crystals with different grain sizes. A tempered power-law −𝛽 fit of the form 𝑃(> 𝐴0 ) = 𝐴0 exp(−𝐴0 /𝐴𝑐 ) is used to estimate the cut-off amplitude 𝐴𝑐 . The power-law exponent across all samples is similar. At the bottom, the relation for coarse-grained samples. . . . . . . . . . . . . . . . . . 4 Figure 1.4 Time-series of spatially averaged crack-front velocity (a), and the respective scalings of burst duration distribution (b), size distribution (c), and their mutual scaling (d). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
4 Figure 1.5 Random-fuse model (a), and the fractured samples under different values of 𝛽, for percolated, disordered media (b), critical dynamics (c), and finally leading to nucleated cracks for large 𝛽 (d). . . . . . . . . . . . . . . . . . . . . 5 Figure 1.6 An edge dislocation characterized by an extra half-plane of atoms (a), and a screw dislocation obtained by a twisted displacement (b) [1]. . . . . . . . . . . 6 Figure 1.7 Example of construction of Burgers circuit, and Burgers vector definition for an edge dislocation [1]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Figure 1.8 Illustration of the multi-scale failure analysis framework proposed. At each scale we learn the physics of their underlying processes, and obtain their corresponding stochastic models. At the macro-scale, robust uncertainty quantification and machine learning algorithms will detect the presence of damage and predict the material failure. . . . . . . . . . . . . . . . . . . . . . 18 Figure 2.1 Framework for constructing a network-based KMC surrogate model for dislo- cation glide. The surrogate is then employed for fast and accurate simulations of dislocation motion, obtaining velocity data at different stress levels, leading to the estimation of the dislocation mobility. . . . . . . . . . . . . . . . . . . . 27 xii Figure 2.2 MD domain of the dislocation mobility test. (a) 𝑥 − 𝑦 plane, illustrating the edge dislocation core as the lattice perturbation at the center. (b) 3D view of the MD domain with the BCC lattice removed, showing the dislocation line along the 𝑧-axis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 Figure 2.3 (a) Temperature and total energy for the equilibration step, (b) Edge dislo- cation position x 𝑀 𝐷 (𝑡) and (c) mobility through MD simulations for distinct values of applied shear stresses 𝜏 under 𝑇 = 750 [𝐾]. We observe an over- damped motion for the applied shear stress range and a linear mobility relationship. 29 Figure 2.4 Convergence to true rates (y-axis) as a function of number of realizations with fixed time-steps (a), or number of time-steps with fixed realizations (b). Dashed lines are the true rates (200 and 1), solid lines are the expected rates, and the shaded areas are the regions of uncertainty based on standard deviation. 37 Figure 2.5 Normalized histograms of waiting times between forward (a) and (d), back- ward (b) and (e), and any jump (c) and (f), along an exponential fitted curve resulted from MLE parameter estimation for 𝜏 = 25 𝑀 𝑃𝑎 (top row) and 𝜏 = 100 𝑀 𝑃𝑎 (bottom row). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Figure 2.6 Convergence in the jump rates from MD time-series data for different stress levels. We observe a more steady and monotonic trend with higher stress levels. 40 Figure 2.7 Position versus time of edge dislocation, comparison between MD results from LAMMPS and one realization of surrogate model through the random walk on a network. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 Figure 2.8 Normalized histograms of velocity estimates from different applied shear stresses. Gaussian fit is plotted after computation of expectation E [𝑣] and standard deviation 𝜎 2 [𝑣] from 1000 MC realizations. . . . . . . . . . . . . . . 41 Figure 2.9 Velocity versus stress plot, comparison between MD results of dislocation glide from LAMMPS, and surrogate model simulations using a random walk in a network under two different system temperatures. 
The surrogate model accurately estimates the mobility with 1.29% relative error. . . . . . . . . . . . 42 Figure 3.1 Dislocation distribution at the beginning of the simulation (a), and after the relaxation (b) in a metastable structure for Case 1. Red and blue markers correspond to dislocations with positive and negative Burgers, respectively. . . . 51 Figure 3.2 Time-series plots of collective velocity, number of dislocations in the system, and plastic strain during the relaxation steps to show the system’s stabilization. 51 xiii Figure 3.3 Time-series plots of collective velocity, accumulated plastic strain, and num- ber of dislocations in the system for a single realization of creep test for all three cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 Figure 3.4 Probability Density Function of dislocation velocity for Cases 1, 2, and 3. We observe a power-law scaling of order 𝛽 = 2.4 for Cases 2 and 3 with multiplication, and a sharper decay for Case 1. . . . . . . . . . . . . . . . . . . 52 Figure 3.5 Time-series of the total number of dislocations across the 𝑛𝑟 = 2000 realiza- tions of DDD for Cases 1, 2, and 3. We highlight the selected data for training and testing the ML algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 Figure 3.6 Evolution of mean and skewness factor of 𝑝ˆ1 (𝑥). The evolution of mean (a) before, and (b) after the symmetrization and re-normalization. The evolution of skewness factor (c) before, and (d) after symmetrization and re-normalization. 58 Figure 3.7 Final shape of dislocation shifted position PDF from AKDE with symmetriza- tion and re-normalization at selected time-steps. . . . . . . . . . . . . . . . . . 58 Figure 3.8 Iteration errors (training and testing), and the solution path of the ML algo- rithm when solving the inverse problem of a manufactured solution with a known kernel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 Figure 3.9 Evolution of training and testing MLAE values computed over the NM iterations. 66 Figure 3.10 Solution path from different combinations of initial guess. . . . . . . . . . . . . 67 Figure 3.11 Final kernel shapes from optimized parameters obtained trough the ML algo- rithm, scaled by 𝛿. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 Figure 3.12 Simulation of the nonlocal model for the whole time-interval of available data, highlighting the initial PDF, and the final distributions of the true data and the nonlocal prediction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 Figure 4.1 Left: Geometry and boundary conditions for single-edge notched tensile test. Right: Finite element mesh. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 Figure 4.2 Univariate simulations: expectation (a) and standard deviation (b) error be- tween MC results with 104 samples and solutions from PCM with different number of collocation points. We observe that with 4 points we obtain com- parable accuracy in PCM. Expectation (c) and standard deviation (d) PCM convergence, where the reference for error computation is a solution with 100 collocation points in the univariate case. The convergence rate is close to linear. 88 xiv Figure 4.3 (a) and (b): Comparison between MC results with 104 samples, and solutions from PCM with different number of realizations in the multivariate case. 
With higher dimensions the advantage of PCM over MC becomes more evident, with only 3 points needed in each dimension to stabilize the error in both expectation (a) and standard deviation (b). (c) and (d): Convergence of damage field on multivariate PCM simulations of notched geometry, where the reference for error computation is a solution with 6 collocation points in each dimension. We obtain linear convergence for both expectation (c) and standard deviation (d). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 Figure 4.4 Damage phase-field expectation (top) and standard deviation (bottom) after crack propagation taking 𝛾 as random input. From tensile load the crack propagates in Mode-I as expected. 𝛾 has influence around the crack path, because it controls the diffusion of damage. Once the crack propagates and the expected value is 1 in the crack path, the uncertainty vanishes in the cracked region. However, the deviation around the crack tip grows with time. . . 90 Figure 4.5 Time evolution of damage phase-field expectation and standard deviation profiles at the crack path line taking 𝛾 as random input. From damage expectation we observe the crack tip as a moving interface. The standard deviation peak follows the advecting boundary and grows in time, which makes the expected interface less sharp. . . . . . . . . . . . . . . . . . . . . . 91 Figure 4.6 Damage phase-field standard deviation after crack propagation in univariate uncertainty quantification. Fatigue parameter 𝑎 and viscous damping 𝑏 do not propagate uncertainty as much as Griffith energy 𝑔𝑐 and rate of change of damage parameter 𝑐. Since the crack path is defined by geometry, the majority of uncertainty is related to the speed of crack propagation, controlled mostly by 𝑔𝑐 and 𝑐. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 Figure 4.7 Expected sensitivity fields with respect to each input parameter. Similarly to standard deviation fields, local sensitivity results also point to 𝑐 and 𝑔𝑐 , related to propagation speed, as parameters with more sensitive output, since we have a specific crack initiation location and path. . . . . . . . . . . . . . . . 92 Figure 4.8 Time evolution of damage phase-field expectation and standard deviation profiles at the crack path line when propagating the uncertainty of all 5 random parameters. We observe that the combined effect of all parameters results in a larger standard deviation around the crack tip at final time, comparable to peak values of 𝑐 and 𝑔𝑐 uncertainties. . . . . . . . . . . . . . . . . . . . . . . . 93 xv Figure 4.9 Notched tensile total damage deviation field and sensitivity indices (𝑆 𝑗 ) fields for all parameters using 6 points in each dimension at final time 𝑇 = 0.5 𝑠. Ahead of the crack, 𝑔𝑐 and 𝑐 are the most influential parameters to total damage field variance. The remaining parameters have little participation at the most uncertain region of the geometry. . . . . . . . . . . . . . . . . . . . . 94 Figure 4.10 Notched tensile total damage deviation field and total effect sensitivity indices 𝑗 (𝑆𝑇 ) fields for all parameters using 6 points in each dimension at final time 𝑇 = 0.5 𝑠. When we combine parameter effects and include their interactions the dominant sensitivity at the crack tip gets carried out to all parameter indices. In the remaining regions, the sensitivity index is uniform except for 𝛾: the diffusion coefficient is more influential throughout the specimen. . . . . . 
95 Figure 4.11 Top: Geometry and boundary conditions for tensile test specimen. Bottom: finite element mesh. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 Figure 4.12 Convergence of damage field in multivariate PCM simulations for tensile test specimen, where the reference for error computation is a solution with 6 collocation points in each dimension. We have lower convergence rates when compared to notched geometry due to due to more uncertainty of crack location. 96 Figure 4.13 Damage phase-field expectation taking all parameters in 𝜉 as random inputs. From tensile load we see the appearance of 4 possible crack initiation points, based on the stress concentration profile from the geometry. The expected solution at final time gives a curved crack path at both sides of the geometry. . . 97 Figure 4.14 Damage phase-field deviation taking taking all parameters in 𝜉 as random inputs. We have regions of uncertainty around all 4 points of possible crack initiation. At final time, the the uncertainty vanished where the crack prop- agated, and the maximum deviation around the crack is more than 30% of maximum damage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 Figure 4.15 Local sensitivity expectation fields with respect to each input parameter.𝛾, 𝑔𝑐 and 𝑎 are the most sensitive parameters, with the same absolute range. 𝑏 is not sensitive in the range considered and 𝑐 is less sensitive than in the notched case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 Figure 4.16 Tensile test specimen total damage deviation field and sensitivity indices (𝑆 𝑗 ) fields for all parameters using 6 points in each dimension at final time 𝑇 = 0.5 𝑠. Differently than in the notched case, here 𝛾 and 𝑔𝑐 are the most influential parameters in the region of higher uncertainty. . . . . . . . . . . . . 98 xvi Figure 4.17 Tensile test specimen total damage deviation field and total effect sensitivity 𝑗 indices (𝑆𝑇 ) fields for all parameters using 6 points in each dimension at final time 𝑇 = 0.5 𝑠. With the combined effect of all parameters, we still have 𝛾 and 𝑔𝑐 as having most influence in the uncertainty regions. . . . . . . . . . . . 99 Figure 5.1 Description of geometry and boundary conditions for the tensile test speci- men, along with finite element mesh and sensor layout for time-series gener- ation. We highlight two sensor nodes that show different time-series behaviors. . 106 Figure 5.2 Damage phase-field for each representative failure case. By changing the parameters 𝛾, 𝑔𝑐 , and 𝑐, we observe different failure types (distinct crack positions and paths), as well as varying dynamics. . . . . . . . . . . . . . . . . 107 Figure 5.3 Damage phase-field time-series data for three cases, showing the different evolution of 𝜑 depending on the virtual sensor node position. . . . . . . . . . . 108 Figure 5.4 (Left) Load-displacement curve for case 1, where we identify the three points where the labels change from 0 to 1, according to different criteria. (Right) Respective damage phase-fields corresponding to the positions indicated in the curve. We note that Label type 3, based on a threshold of 90% of maximum force, lies between the first two criteria. In Label 1, damage field is still too smooth, while in Label 2, failure is far too advanced. . . . . . . . . . . . . . . . 110 Figure 5.5 Schematic illustration of the proposed ML framework. 
A pattern recognition scheme is introduced to represent time-series data of the damage degradation function 𝑔(𝜑) = (1 − 𝜑)² extracted at sensing nodes as a pattern. The 𝑘-NN and ANN algorithms are employed for failure classification using recognized patterns. In 𝑘-NN analysis the classification is performed by determining the 𝑘-nearest vote vector. An ANN provides a map between the inputs and outputs through determination of the weights using input and output patterns. . . . 111 Figure 5.6 𝐾-NN classification accuracy with different number of 𝑘: (a) Accuracy based on multiple label Type 3, (b) Accuracy based on multiple label Type 4. . . . 114 Figure 5.7 𝐾-NN classification results for failure case 3 with different size of data subsets and multiple label Types 3 & 4. . . . 114 Figure 5.8 𝐾-NN classification results for different failure cases based on label Types 3 & 4. 115 Figure 5.9 Confusion matrix on test data with 𝑘-NN: (a) Case 5 and multiple label Type 3, (b) Case 1 and multiple label Type 4. . . . 115 Figure 5.10 𝐾-NN classification accuracy with different number of 𝑘 based on multiple label Types 3 and 4. . . . 116 xvii Figure 5.11 Confusion matrix with 𝑘-NN classification results for detection of location/pattern of failure based on multiple label Type 4. . . . 117 Figure 5.12 ANN classification results for different failure cases based on multiple label Types 3 & 4. . . . 118 Figure 5.13 Confusion matrix on test data with ANN: (a) Case 5 and multiple label Type 3, (b) Case 1 and multiple label Type 4. . . . 118 Figure 5.14 Confusion matrix with ANN classification results for detection of location/pattern of failure based on multiple label Type 4. . . . 119 Figure 5.15 Mean total classification accuracy and standard deviation for (a) Detection of failure presence, case 3, and (b) Detection of failure location. . . . 123 xviii LIST OF ALGORITHMS Algorithm 2.1 Kinetic Monte Carlo method for Dislocation Glide as a Random Walk on a Graph . . . 33 Algorithm 3.1 Bi-Level Machine Learning Algorithm with Nelder-Mead Minimization . . . 62 Algorithm 4.1 Stochastic Sensitivity Analysis . . . 86 Algorithm 4.2 Global Sensitivity Analysis . . . 86 Algorithm A.1 Semi-implicit time integration scheme . . . 136 xix

CHAPTER 1 INTRODUCTION

1.1 Failure as a Stochastic Anomalous Process

The initial works on fracture mechanics by Griffith [2] laid the groundwork for many years of development of failure analysis through the theory of Linear Elastic Fracture Mechanics (LEFM). Although descriptive, Griffith's theory relied heavily on the assumption of homogeneous materials, which, consequently, led to smooth and continuous descriptions of propagating cracks [3]. In reality, however, failure processes are not deterministic and fracture surfaces are naturally rough, which limits the extent of failure prediction based solely on LEFM models.
Over the last three decades, a solid understanding of the nature of failure in heterogeneous media (a more realistic assumption for material models) has been developed and matured with the aid of fractography analysis and Acoustic Emission (AE) techniques.

1.1.1 Fractography Analysis

The onset of quantitative studies of failure in heterogeneous and disordered media can be traced back to Mandelbrot [4], where the analysis of fractured surfaces revealed an inherent fractal dimension. The fractal dimension is an index that characterizes the complexity of patterns by measuring the ratio of change in detail due to the change in scale. In their study, Mandelbrot et al. showed that the same fractal dimension increments are found when computing the fractal dimension of fracture boundaries and surfaces. Further fractography studies corroborated quantitatively the fractal characteristics of fracture surfaces [5–10]. In [5], experimental computation of the fractal dimension of brittle fracture surfaces showed a direct correspondence between fractal dimension and toughness. Universality of the fractal dimension of fractured surfaces was investigated in [6] (Fig. 1.1). After applying different heat treatments to aluminum alloys that lead to distinct fracture characteristics and fracture toughness, they showed, within experimental error, that the fractal dimension was independent of the heat treatment or fracture mode.

Figure 1.1 Micrograph of a fractured surface highlighting the different phases of aluminium and nickel, used in fractography analysis to compute the fractal dimension.

1.1.2 Acoustic Emission

During an Acoustic Emission (AE) test, irreversible changes in the material's microstructure undergoing a failure process cause the emission of acoustic (elastic) waves that can be recorded through sensors. The use of AE in fracture experiments showed interesting phenomena that could not be accounted for in classical LEFM models, indicating an anomalous process was taking place inside the material. For instance, intermittent energy signals, power-law distributions of the energy spectrum, and crackling noise can all be observed during an AE test, all of which are indicators that avalanches and anomalous dynamics are occurring. Such anomalous statistics can be traced back to two main contributors: dislocation motion and fracture.

Evidence of dislocation avalanches in crystal plasticity has been widely studied through AE experiments [11–15]. In [11], creep experiments in single ice crystals led to intermittent energy signals that follow a power-law scaling due to the collective motion of dislocation populations (Fig. 1.2). They compared the statistical results from a numerical model of dislocation dynamics with the experimental acoustic data. Similar power-law decay of dislocation activity has been observed through phase-field models of dislocation dynamics [12].

Figure 1.2 Acoustic emission signals from single crystal ice under creep. The energy bursts caused by the collective dislocation motion appear in the form of intermittent signals that scale as a power-law of exponent 𝜏 = 1.6, independent of the applied stress.

Dislocation avalanches in polycrystalline materials were studied extensively in [13]. They show that avalanche intermittency is also present in polycrystals, where avalanche sizes are bounded by the grain sizes (Fig. 1.3). Yet, they propose that the accumulation of internal stresses near the grain boundaries due to dislocation activity triggers further avalanches in neighboring grains.
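The power-law exponents quoted above (e.g., 𝜏 = 1.6 for the AE burst energies) are typically estimated directly from the recorded burst statistics. As a minimal illustration of how such an exponent can be extracted, the sketch below applies the standard maximum-likelihood estimator for a continuous power-law tail to synthetic burst energies; the data, cutoff, and exponent are made up for this example and are not the experimental records of [11, 13].

```python
import numpy as np

def powerlaw_mle(samples, x_min):
    """Maximum-likelihood estimate of the exponent tau of a continuous
    power-law tail p(x) ~ x**(-tau) for x >= x_min (Hill-type estimator)."""
    x = np.asarray(samples)
    x = x[x >= x_min]
    return 1.0 + x.size / np.sum(np.log(x / x_min))

# Synthetic avalanche energies drawn from a power law with tau = 1.6
# (illustrative stand-in for AE burst energies, not measured data).
rng = np.random.default_rng(0)
tau_true, e_min = 1.6, 1.0
u = rng.random(50_000)
energies = e_min * (1.0 - u) ** (-1.0 / (tau_true - 1.0))  # inverse-CDF sampling

print(f"estimated exponent: {powerlaw_mle(energies, e_min):.3f}")  # close to 1.6
```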
From a different perspective, AE tests also showed anomalous features in brittle fracture [16–20], which encourages the search for a unified model that can effectively incorporate the heavy-tail statistics into a predictive model. Ref. [16] showed that energy bursts from AE tests of microfracture follow a power-law scaling, corroborating the observations also made in the time correlation function. Bonamy et al. [18] studied crack growth in heterogeneous materials through a stochastic LEFM model to account for the material heterogeneity. They observed that the crack propagation velocity is intermittent, with activity burst duration and size scaling as power-laws (Fig. 1.4). The jerky dynamics observed in the simulations are referred to as the crackling noise from experiments.

Figure 1.3 (a) Polycrystalline microstructure with different grain sizes, averages of ⟨𝑑⟩ = 0.26 𝑚𝑚 (top-left), 0.87 𝑚𝑚 (top-right), 1.92 𝑚𝑚 (bottom-left), and 5.02 𝑚𝑚 (bottom-right). (b) Distribution of avalanche size from dislocation motion in polycrystals for crystals with different grain sizes. A tempered power-law fit of the form $P(>A_0) = A_0^{-\beta}\exp(-A_0/A_c)$ is used to estimate the cut-off amplitude $A_c$. The power-law exponent across all samples is similar. At the bottom, the relation for coarse-grained samples.

Figure 1.4 Time-series of spatially averaged crack-front velocity (a), and the respective scalings of burst duration distribution (b), size distribution (c), and their mutual scaling (d).

Figure 1.5 Random-fuse model (a), and the fractured samples under different values of 𝛽, for percolated, disordered media (b), critical dynamics (c), and finally leading to nucleated cracks for large 𝛽 (d).

A random-fuse model was used in [19] to study the transition from percolation, to avalanche, to brittle failure in disordered media. The fuse model was parameterized by specifying the failure resistance following a power-law probability $F(x) = x^{-\beta}$, where 𝛽 is the disorder parameter. The system transitions from completely disordered (small 𝛽) and subject to percolation, to a critical avalanche regime, up to a brittle failure region under low disorder (high 𝛽) and larger system size (Fig. 1.5). They point out that critical phenomena lead to nucleation in the limit of long length scales.

1.2 Multi-scale Modeling of Material Failure

Multi-scale modeling of failure and fracture has been a topic of interest for many decades. Several complementary methodologies have been presented for the coupling between the scales, given the wide range of time and length scales involved during the failure process. A major difficulty is to properly address the stochastic nature of failure and to account for anomalous phenomena, such as nonlocality, in the upper scales without losing the physics from the microstructure. We briefly describe major advances in the literature on the multi-scale modeling of material failure, starting from the basic definitions of dislocation theory.

Figure 1.6 An edge dislocation characterized by an extra half-plane of atoms (a), and a screw dislocation obtained by a twisted displacement (b) [1].

1.2.1 Fundamentals of Dislocation Theory

We present some definitions based on [1]. There are two canonical types of dislocations:
• Edge dislocations: can be seen as a half-plane of atoms being removed from the crystalline structure, displacing the atoms along the missing plane. This creates a linear defect, called an edge dislocation, with lattice distortion and stresses that become stronger closer to the dislocation line.

• Screw dislocations: obtained by sliding the two halves of the crystal in opposite directions with respect to the center.

Fig. 1.6 illustrates the concepts of edge and screw dislocations. In practice, dislocations inside crystalline materials have mixed character between the edge and screw types. An important characterization of dislocations is the Burgers vector, 𝒃. The Burgers circuit in a crystal is an atom-to-atom path forming a closed loop. However, around dislocations the same path will not close the loop, and 𝒃 characterizes the missing vector needed to complete the Burgers circuit around the dislocation. The magnitude 𝑏 of Burgers vectors in cubic crystals is measured with respect to the atomic spacing 𝑎, such that
$$b = \frac{a\sqrt{3}}{2}. \tag{1.1}$$

Figure 1.7 Example of construction of Burgers circuit, and Burgers vector definition for an edge dislocation [1].

Dislocations must always form a closed loop, end at a free surface, or merge with other dislocations, but never end inside the crystal. This condition leads to Burgers vector conservation, such that $\sum_i \boldsymbol{b}_i = 0$. The Burgers vector of an edge dislocation is illustrated in Fig. 1.7. For edge dislocations, the Burgers vector is perpendicular to the dislocation line tangent vector, while the two are parallel in screw dislocations. Finally, the force acting on a dislocation line with tangent direction 𝝃, under a stress field 𝝈, is called the Peach-Koehler force 𝒇𝑃𝐾, given by the expression
$$\boldsymbol{f}_{PK} = (\boldsymbol{\sigma} \cdot \boldsymbol{b}) \times \boldsymbol{\xi}. \tag{1.2}$$

1.2.2 From Nano to Meso-scale

At the nano-scale, atomistic simulations are usually carried out through Molecular Dynamics (MD) simulations. Specifically concerning failure problems, MD simulations have been used to simulate dislocation motion and interaction, and fracture problems with dislocation emission. Three-dimensional MD simulations of Mode-I crack propagation were performed in [21], providing visual evidence of dislocation emission from the crack tip. Recent advances in the literature provided ways to understand the dislocation behavior at the atomistic scale and connect it to more complex dislocation interactions. In [22], the formation of dislocation junctions was studied and the junction stress quantified to feed a meso-scale dislocation dynamics model. Further works on atomistic and meso-scale coupling were developed to quantify the elasticity parameters, mobility, and core energy of dislocations [23]. The issue of quantifying the uncertainties from measurements at the atomistic scale has recently been getting attention [24–26], but the problem of effectively upscaling material properties and propagating the associated uncertainties remains a challenge.

1.2.3 From Meso to Macro-scale

At the meso-scale, collective dislocation motion gives rise to anomalous features in the form of dislocation avalanches and power-law energy bursts. Discrete Dislocation Dynamics (DDD) [11, 27] models have successfully captured many anomalous characteristics at the meso-scale, while a meaningful connection of dislocation dynamics to macro-scale mechanical failure is still missing. Recent attempts have used machine learning to provide a connection from micro-structure to the continuum through the computation of free-energy [28–30].
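To make concrete how the Peach-Koehler force of Eq. (1.2) drives dislocation motion in DDD-type models, the sketch below evaluates its glide component for a set of parallel edge dislocations under an applied shear stress. The isotropic-elasticity interaction stress used here is the standard textbook expression for a straight edge dislocation, and the material constants are placeholder values, not parameters calibrated in this work.

```python
import numpy as np

# Illustrative isotropic elastic constants (placeholders, not fitted values).
MU = 50e9      # shear modulus [Pa]
NU = 0.3       # Poisson ratio
B = 2.5e-10    # Burgers vector magnitude [m]

def sigma_xy_edge(dx, dy, b_signed):
    """Shear stress at (dx, dy) produced by a straight edge dislocation at the
    origin, with Burgers vector b_signed along x and line direction along z."""
    r2 = dx**2 + dy**2
    pref = MU * b_signed / (2.0 * np.pi * (1.0 - NU))
    return pref * dx * (dx**2 - dy**2) / r2**2

def glide_forces(positions, signs, applied_xy=0.0):
    """Peach-Koehler glide force per unit length on each parallel edge
    dislocation: f_glide = b * sigma_xy, where sigma_xy sums the applied shear
    stress and the interaction stresses from all other dislocations."""
    n = len(positions)
    forces = np.zeros(n)
    for i in range(n):
        stress = applied_xy
        for j in range(n):
            if i == j:
                continue
            dx = positions[i, 0] - positions[j, 0]
            dy = positions[i, 1] - positions[j, 1]
            stress += sigma_xy_edge(dx, dy, signs[j] * B)
        forces[i] = signs[i] * B * stress
    return forces

# Two edge dislocations of opposite sign on the same slip plane attract.
pos = np.array([[0.0, 0.0], [5.0e-8, 0.0]])
print(glide_forces(pos, signs=np.array([+1, -1]), applied_xy=10.0e6))
```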
At the macro-scale, state-of-the-art phase-field models efficiently simulate qualitative aspects of brittle [31], ductile [32], dynamic [33], and fatigue [34] failure in complex geometries and under arbitrary loading conditions. However, such models fail to incorporate the anomalies associated with failure processes. On the other hand, the statistics of fracture can be quantitatively recovered in lattice models, such as Random Fuse Models (RFM) [35], percolation theory [36], and stochastic LEFM [3]. Despite the quantitative agreement, those models are often limited to simple geometries and boundary conditions. The pursuit of a unified framework for quantitative and qualitative failure prediction is a remaining challenge.

We now discuss in more detail three components that form a major part of the proposed framework when considering predictive failure modeling and coupling between the meso- and macro-scales. In the context of modeling of collective dislocation dynamics, nonlocality is a dominant effect observed experimentally, yet not studied under formal nonlocal calculus definitions. Therefore, we comment on basic principles of nonlocal calculus and its applications to nonlocal differential equations. Moving to the topic of damage modeling, we discuss some key aspects of phase-field models and why they are attractive. Last, we comment on methodologies to perform forward uncertainty propagation that will later be applied in all major components of the multi-scale framework.

1.2.4 Nonlocal Models

Nonlocal operators are an elegant alternative for continuum modeling of complex media, such as heterogeneous materials, anomalous diffusion, and fracture, due to their ability to capture long-range interactions. In contrast to their local counterparts, nonlocal models are based on integral operators, which makes their numerical implementation challenging. Recent works greatly advanced the field of nonlocal modeling towards a formal and rigorous nonlocal vector calculus [37, 38]. We can formulate a general nonlocal problem as follows. Let $\Omega \subset \mathbb{R}^n$ denote a bounded domain with non-zero volume. The nonlocal diffusion operator $\mathcal{L}$ acting on a function $u(\boldsymbol{x}) : \Omega \to \mathbb{R}$ is defined as [39]:
$$\mathcal{L}u(\boldsymbol{x}) = 2\int_{\mathbb{R}^n} \left( u(\boldsymbol{y}) - u(\boldsymbol{x}) \right) \gamma(\boldsymbol{x},\boldsymbol{y})\, d\boldsymbol{y} \quad \forall \boldsymbol{x} \in \Omega \subseteq \mathbb{R}^n.$$
The kernel $\gamma(\boldsymbol{x},\boldsymbol{y})$ is a non-negative symmetric mapping. The general nonlocal diffusion equation is then written as
$$\begin{cases} \dfrac{du}{dt} - \mathcal{L}u = f & \text{on } \Omega, \\ u = u_d & \text{on } \Omega_l, \\ u(0) = u_0 & \text{on } \Omega, \end{cases}$$
where $\Omega_l$ denotes an interaction domain disjoint from $\Omega$, parameterized by a horizon of interactions of length $\delta$. Under specific choices of kernel and horizon, the general nonlocal operator may take the form of the fractional Laplacian. The fractional Laplacian of order $s$ on a bounded domain $\Omega$ is defined through the Riesz representation as
$$(-\Delta)^s u = c_{n,s} \int_{\mathbb{R}^n} \frac{u(\boldsymbol{x}) - u(\boldsymbol{y})}{|\boldsymbol{y} - \boldsymbol{x}|^{n+2s}}\, d\boldsymbol{y}, \quad 0 < s < 1,$$
where $c_{n,s}$ is a normalizing constant such that
$$c_{n,s} = \frac{s\, 2^{2s}\, \Gamma\!\left(\frac{n+2s}{2}\right)}{\Gamma\!\left(\frac{1}{2}\right)\Gamma(1-s)}.$$
If the kernel $\gamma(\boldsymbol{x},\boldsymbol{y})$ of the nonlocal diffusion operator is chosen to be [39]
$$\gamma(\boldsymbol{x},\boldsymbol{y}) = \frac{c_{n,s}}{2\,|\boldsymbol{y} - \boldsymbol{x}|^{n+2s}}, \tag{1.3}$$
then $-\mathcal{L} = (-\Delta)^s$, $0 < s < 1$.
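As a minimal, self-contained illustration of how the nonlocal operator $\mathcal{L}$ can be evaluated numerically, the sketch below applies a midpoint-rule quadrature on a uniform 1-D grid with a constant kernel supported on a finite horizon $\delta$. This kernel is an assumption made purely for illustration (it recovers the classical Laplacian as $\delta \to 0$); the singular fractional kernel of Eq. (1.3) would require dedicated quadrature near the singularity and is not treated here.

```python
import numpy as np

def nonlocal_diffusion(u, x, delta):
    """Midpoint-rule evaluation of L u(x) = 2 * int (u(y) - u(x)) gamma(x,y) dy
    on a uniform 1-D grid, with the constant kernel gamma = 3 / (2 delta^3)
    supported on |y - x| <= delta (this choice recovers u'' as delta -> 0)."""
    dx = x[1] - x[0]
    gamma = 3.0 / (2.0 * delta**3)
    Lu = np.zeros_like(u)
    for i, xi in enumerate(x):
        mask = np.abs(x - xi) <= delta          # interaction neighborhood
        Lu[i] = 2.0 * gamma * np.sum(u[mask] - u[i]) * dx
    return Lu

# Smoke test on a smooth field: L u should approach u'' for a small horizon.
x = np.linspace(-1.0, 1.0, 2001)
u = np.sin(np.pi * x)
Lu = nonlocal_diffusion(u, x, delta=0.05)
interior = np.abs(x) < 0.9                      # stay away from the boundary layer
err = np.max(np.abs(Lu[interior] + np.pi**2 * np.sin(np.pi * x[interior])))
print(f"max deviation from the local Laplacian in the interior: {err:.3e}")
```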
Nonlocal and fractional models have been successfully used in material failure simulation over the last decades. Peridynamic (PD) theory [40] has been introduced as an alternative to classical continuum mechanics modeling of solid structures and failure. Due to the nonlocal character of PD, the problem of discontinuities in crack representation was overcome, and PD has successfully represented many classes of material failure problems, including visco-plasticity [41], fatigue [42], composites [43], polymers [44], and multi-physics problems [45], among many others. Fractional models have been used extensively in modeling visco-elastic and visco-elasto-plastic material responses [46, 47].

With the advance of Machine Learning techniques and computational resources, the field of nonlocal modeling has seen increasing efforts in the parameter identification of nonlocal operators, especially in the data-driven construction of nonlocal kernels, which often cannot be defined beforehand. In [48], nonlocal Physics Informed Neural Networks, nPINNs (the nonlocal version of PINNs [49]), were used to solve the forward nonlocal Poisson problem and the inverse parameter identification of the fractional order and horizon of the nonlocal kernel given available data. This approach combines the robustness of Deep Neural Networks with physics-based constraints from the nonlocal problem. Other works employed alternative approaches for learning nonlocal kernels, with notable examples in peridynamics [50], homogenization [51], constitutive laws [52], and molecular dynamics [53].

1.2.5 Phase-field Models

Phase-field models were first developed to solve fluid separation problems [54]. The success of the Cahn-Hilliard equation in interfacial dynamics allowed for its expansion into other research areas that model sharp interfaces and moving boundaries, including solidification [55], tumor growth [56], two-phase complex fluid flow [57], and fluid-structure interaction [58]. The core idea of phase-field modeling is to define the phase variable (sometimes called order parameter) $\eta$ to be a smoothly-varying indicator function of any particular field quantity of interest, such as concentration, damage, composition, or a classical phase definition, as observed in phase transformation problems. Evolution equations for the phase-field variable are usually derived following variational principles that promote thermodynamic consistency through the definition of free-energy potentials. In general form, the free-energy functional for a phase-field problem may read:
$$F[\varphi, \Gamma] = \int_{\Omega} K|\nabla\varphi|^2 + f(\varphi, \Gamma), \tag{1.4}$$
where $\varphi$ denotes the phase-field variable, and $\Gamma$ represents any other variables that influence the free-energy. We denote $K$ as the diffusion constant, and $f(\varphi,\Gamma)$ is a function that describes the evolution of free-energy in the bulk of each material phase. The interpretation of Eq. (1.4) is a competition between the bulk energy from $f$ and a mixing energy from the gradient term. The most famous phase-field models, the Cahn-Hilliard [54] and Allen-Cahn [55] equations, follow the variational principles described above to obtain evolution equations for phase variables. Both equations use a double-well potential for $f$, while the major difference between the two is that the Cahn-Hilliard equation is conservative and the Allen-Cahn equation is non-conservative. As a consequence, the Cahn-Hilliard equation is fourth-order,
$$\frac{d\varphi(x,t)}{dt} = \nabla \cdot \left( M(x,t)\nabla\mu(x,t) \right), \tag{1.5}$$
where $\mu(x,t)$ denotes the chemical potential
$$\mu(x,t) = f'(\varphi, \Gamma) - \epsilon^2 \Delta\varphi(x,t). \tag{1.6}$$
On the other hand, the Allen-Cahn equation is second-order,
$$\frac{d\varphi(x,t)}{dt} = \nabla \cdot \left( M(x,t)\nabla\varphi(x,t) \right) + f'(\varphi, \Gamma). \tag{1.7}$$
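As a minimal sketch of how an equation of the form (1.7) is advanced in time, the code below integrates the 1-D Allen-Cahn equation with the classical double-well potential and a constant mobility, treating the diffusion term implicitly and the bulk term explicitly. The parameter values and zero-flux boundary conditions are illustrative assumptions; the damage model discussed next uses its own potentials and a semi-implicit scheme of this type (cf. Algorithm A.1).

```python
import numpy as np

def allen_cahn_1d(phi0, mobility=1.0, dt=1e-3, dx=1e-2, steps=500):
    """Semi-implicit time stepping for the 1-D Allen-Cahn equation
        d(phi)/dt = M * d2(phi)/dx2 + phi - phi**3,
    where phi - phi**3 derives from the double-well potential (1 - phi^2)^2 / 4.
    The Laplacian is treated implicitly, the bulk term explicitly, and
    zero-flux (Neumann) boundary conditions are applied."""
    n = phi0.size
    lap = (np.diag(-2.0 * np.ones(n)) +
           np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1))
    lap[0, 1] = lap[-1, -2] = 2.0                # ghost-node Neumann closure
    lap /= dx**2
    A = np.eye(n) - dt * mobility * lap          # implicit operator
    phi = phi0.copy()
    for _ in range(steps):
        rhs = phi + dt * (phi - phi**3)          # explicit double-well force
        phi = np.linalg.solve(A, rhs)
    return phi

# A sharp step relaxes toward a smooth, tanh-like diffuse interface.
x = np.linspace(-1.0, 1.0, 201)
print(allen_cahn_1d(np.tanh(5.0 * x))[::50])
```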
The non-conservative characteristics of the Allen-Cahn equation are especially interesting in the development of damage models, since damage itself is non-conservative and appears in virgin portions of bulk materials as they undergo mechanical stresses and strains. The development of Allen-Cahn type phase-field equations for the evolution of material damage has gained much attention recently, with notable applications in brittle [31], ductile [32], and dynamic [33] fracture. Although they originated from crack regularization techniques, they obey the same variational principles used to derive the original phase-field equations, such as [34]. The Allen-Cahn equation for damage evolution in this case, in one dimension, originates from the free-energy functional
$$\Psi(\nabla u, \varphi, \nabla\varphi, \mathcal{F}) = d(\varphi)\, Y (\nabla u)^2 + g_c \frac{\gamma}{2} (\nabla\varphi)^2 + \mathcal{K}(\varphi, \mathcal{F}), \tag{1.8}$$
where $Y$ denotes the Young modulus, $u$ represents the displacements, $g_c$ is the fracture energy release rate, $\gamma > 0$ represents the phase-field layer width parameter, $d(\varphi)$ is a degradation function acting on the bulk of the material, and $\mathcal{K}(\varphi,\mathcal{F})$ models the evolution of material damage when fatigue $\mathcal{F}$ evolves from 0 to the Griffith energy $g_c$. The thermodynamically consistent formulation therefore provides the evolution equation as
$$\lambda \frac{d\varphi}{dt} = \gamma g_c \Delta\varphi + (1-\varphi)(\nabla u)^T Y (\nabla u) - \frac{1}{\gamma}\left[ g_c \mathcal{H}'(\varphi) + \mathcal{F} \mathcal{H}'_f(\varphi) \right], \tag{1.9}$$
where $\mathcal{H}$ and $\mathcal{H}_f$ are potentials that govern the evolution of damage associated with the evolution of material fatigue, and $\lambda$ is a positive coefficient. We see the presence of the interfacial term, in the form of the Laplacian, acting as a smoothing agent on the damage profile, and the bulk term responsible for the internal creation of material damage from mechanical stress and fatigue. The main advantage of this formulation in contrast with traditional fracture mechanics modeling is the automatic crack profile characterization by the phase-field variable, without the need to explicitly track the crack path.

1.2.6 Uncertainty Quantification

The goal of Uncertainty Quantification (UQ) in materials science is to study the propagation of uncertainties from inputs to outputs or vice-versa. The former is known as Forward UQ, while the latter is named Inverse UQ, concerned with quantifying the probability distribution associated with parameter values when one has output data. Forward UQ aims to propagate uncertainty from inputs to outputs, with the objective of computing the moments (mean, variance) of quantities of interest (QoI), their probability distribution, or the reliability of QoI predictions. There are two basic types of uncertainty: aleatoric and epistemic [59]. Aleatoric uncertainty concerns stochastic or probabilistic uncertainties that are intrinsic to the underlying problem and which cannot be further understood by physical or experimental knowledge, yet could benefit from extra characterization, such as in the case of non-physical parameters. Aleatoric uncertainty can be naturally incorporated into a UQ framework. Epistemic uncertainty, on the other hand, originates from simplifying assumptions or missing physics. Modeling and numerical errors are typically understood as being epistemic. For forward uncertainty quantification, several methods can be applied. The Monte Carlo (MC) method consists of computing the moments of the output through random sampling of the stochastic space. It provides an unbiased estimate of the moments of the distribution, yet the estimates converge with order $\mathcal{O}(N^{-1/2})$, where $N$ is the number of realizations.
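To illustrate the $\mathcal{O}(N^{-1/2})$ behavior, the short sketch below estimates the mean of a toy quantity of interest by plain Monte Carlo sampling at increasing sample sizes; the QoI is a made-up closed-form function, chosen only because its exact mean is known.

```python
import numpy as np

rng = np.random.default_rng(1)
exact = np.exp(0.5)                       # E[exp(xi)] for xi ~ N(0, 1)

for n in [10**2, 10**3, 10**4, 10**5, 10**6]:
    xi = rng.standard_normal(n)           # independent input samples
    estimate = np.exp(xi).mean()          # Monte Carlo estimate of E[Q]
    print(f"N = {n:>7d}   |error| = {abs(estimate - exact):.3e}")
# The error decays roughly as N**-0.5, regardless of the smoothness of Q.
```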
Perturbation methods are also popular in forward UQ, consisting of writing expressions for the moments of the output distribution through a Taylor expansion around the expected parameter values [60, 61]. One could also construct surrogate models as an alternative to expensive methods such as Monte Carlo. The method of Polynomial Chaos Expansion (PCE) [62, 63] consists of expressing the output of the stochastic model as a series expansion of input parameters. More promising methods for higher-dimensional stochastic spaces have been proposed, such as stochastic Galerkin methods [64, 65]. However, one of the most efficient approaches proposed recently involves non-intrusive collocation methods, such as the Probabilistic Collocation Method (PCM) [66, 67]. PCM consists of approximating the solution in the stochastic space by polynomial expansion, such that the moments of the QoI can be efficiently computed by sampling at the collocation/integration points. Therefore, one keeps the independent sampling typical of methods such as MC, yet with much faster convergence rates. In the event of a high-dimensional stochastic space, one could use Sparse Grids [68] to reduce the computational load.

Let $(\Omega_s, \mathcal{F}, P)$ be a complete probability space, where $\Omega_s$ is the space of outcomes $\omega$, $\mathcal{F}$ is the $\sigma$-algebra, and $P$ is a probability measure, $P : \mathcal{F} \to [0,1]$. In a general sense, we can formulate PCM for the QoI $Q$ and write its mathematical expectation $\mathbb{E}[Q(x,t;\xi)]$ in a one-dimensional stochastic space as
$$\mathbb{E}[Q(x,t;\xi)] = \int_a^b Q(x,t;\xi)\, \rho(\xi)\, d\xi, \tag{1.10}$$
where $\rho(\xi)$ is the PDF of the random input variable $\xi$. We evaluate the integral using Gauss quadrature, mapping the physical parametric domain to the standard domain $[-1,1]$. The integral is then written as
$$\mathbb{E}[Q(x,t;\xi)] = \int_{-1}^{1} Q(x,t;\xi(\eta))\, \rho(\xi(\eta))\, J\, d\eta, \tag{1.11}$$
where $J = d\xi/d\eta$ represents the Jacobian of the transformation. We approximate the expectation by introducing a polynomial interpolation of the exact solution in the stochastic space, $\hat{Q}(x,t;\xi)$:
$$\mathbb{E}[Q(x,t;\xi)] \approx \int_{-1}^{1} \hat{Q}(x,t;\xi(\eta))\, \rho(\xi(\eta))\, J\, d\eta. \tag{1.12}$$
We interpolate the solution in the stochastic space using Lagrange polynomials $L_i(\xi)$:
$$\hat{Q}(x,t;\xi) = \sum_{i=1}^{I} Q(x,t;\xi_i)\, L_i(\xi), \tag{1.13}$$
which satisfy the Kronecker delta property at the interpolation points:
$$L_i(\xi_j) = \delta_{ij}. \tag{1.14}$$
Substituting Eq. (1.13) into (1.12), we approximate the integral using the quadrature rule and evaluate the expectation as
$$\mathbb{E}[Q(x,t;\xi)] \approx \sum_{p=1}^{P} w_p\, \rho(\xi(\eta_p))\, J \sum_{i=1}^{I} Q(x,t;\xi_i)\, L_i(\xi(\eta_p)), \tag{1.15}$$
where we compute the coordinates $\eta_p$ and weights $w_p$ for each integration point $p = 1, 2, \ldots, P$. We choose the collocation points to be the same as the integration points on the parametric space; then, by the Kronecker property of the Lagrange polynomials, Eq. (1.14), the approximation from Eq. (1.15) simplifies to a single summation:
$$\mathbb{E}[Q(x,t;\xi)] = \sum_{p=1}^{P} w_p\, \rho(\xi_p(\eta_p))\, J\, Q(x,t;\xi_p(\eta_p)). \tag{1.16}$$
Through Eq. (1.16), we have an efficient integration scheme through Gauss quadrature, while we retain the property of independent realizations that allows for efficient implementation through parallel computing. The formulation is also open to other forms of polynomial expansion and choices of Gauss quadrature that are most suitable to each specific problem, based on the definition of the parametric probability distribution. The PCM formulation presented here can be further applied to the computation of higher moments of the QoI, and in higher-dimensional spaces through the use of tensor products.
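A minimal sketch of Eqs. (1.10)–(1.16) for a single random parameter: assuming, for illustration, a uniformly distributed input and a closed-form toy QoI, the expectation is obtained by evaluating the QoI only at the mapped Gauss-Legendre collocation points and weighting by the quadrature rule, as in Eq. (1.16).

```python
import numpy as np

def pcm_expectation(qoi, a, b, n_points):
    """Probabilistic Collocation estimate of E[Q(xi)] for a single random input
    xi ~ Uniform(a, b): Gauss-Legendre nodes on the standard domain [-1, 1]
    are mapped to [a, b] with Jacobian J = (b - a)/2, following Eq. (1.16)."""
    eta, w = np.polynomial.legendre.leggauss(n_points)   # nodes and weights
    xi = a + 0.5 * (b - a) * (eta + 1.0)                  # mapped collocation points
    rho = 1.0 / (b - a)                                   # uniform input density
    jac = 0.5 * (b - a)
    # Each collocation point is an independent forward-model run (parallelizable).
    return np.sum(w * rho * jac * qoi(xi))

# Toy "forward model": a smooth QoI whose exact mean is known.
qoi = lambda xi: np.exp(xi)
a, b = 0.0, 1.0
exact = np.exp(1.0) - 1.0                                 # E[exp(xi)], xi ~ U(0, 1)
for p in [2, 3, 4, 5]:
    err = abs(pcm_expectation(qoi, a, b, p) - exact)
    print(f"{p} collocation points: |error| = {err:.2e}")
```

For a smooth QoI the error decays much faster with the number of model evaluations than the Monte Carlo rate illustrated earlier, which is the efficiency that PCM exploits in the chapters that follow.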
1.3 Towards a Predictive Multi-Scale Failure Model As observed in experimental data, failure is inherently a stochastic and anomalous process. High- fidelity simulations of micro-structural physics come with the burden of high computational cost, which impedes the analysis from a probabilistic point-of-view. Therefore, detailed material models at this scale are often deterministic and do not provide sufficient information to the upper scales. Moreover, despite the existence of extensive literature on avalanche and crackling noise, and the use of mathematical models that accurately describe nonlocality at the micro-scale, there is still a gap in the incorporation of the sub-grid dynamics into continuum, macro-scale failure models. The use of nonlocal/fractional models has certainly advanced our understanding of complex phenomena in a tractable manner, relying on fewer inputs for a generalized representation. However, their numerical challenges, along with the necessity of pre-defining the nature of the interaction kernel still hinder their applicability in most practical problems. At all scales, accurate and efficient propagation of uncertainty from input parameters to output solutions is another challenge. Intrusive methods require direct coding of stochastic effects into the physics solver. Alternatively, standard non-intrusive methods such as Monte Carlo suffer from slow 15 convergence and high computational costs. Therefore, efficient uncertainty quantification methods are paramount for failure predictability. With the goal of providing a robust and predictive failure model at the continuum level, we propose the implementation of a probabilistic data-driven framework for failure analysis, which learns lower-scale parameter distributions to be upscaled to the continuum level. At the macro- scale, robust stochastic models and machine learning methods incorporate the underlying statistics to promote accurate predictions of material and component failure. The framework incorporates the following multi-disciplinary components: • Stochastic Modeling: We incorporate the lower-scale stochastic processes to generate pa- rameter distributions to be upscaled. Furthermore, we construct fast surrogate models to replace expensive high-fidelity solvers by exploiting the statistics of the underlying physi- cal processes, leading to the construction of the corresponding stochastic model based on simulation data. • Uncertainty Quantification: Given the stochastic nature of the coupling between different time and length-scales, we need proper UQ and sensitivity analysis (SA) tools to propagate the uncertainties across the scales. The curse of dimensionality in UQ is an issue that needs to be treated with fast and accurate methods in forward uncertainty propagation, such as Probabilistic Collocation Method (PCM) and Sparse Grids. We explore the capabilities of PCM to develop simple and cost-efficient reliability analysis methods. • Machine Learning: Learning the nature of underlying anomalous physics in the form of nonlocal/fractional kernels and operators requires robust machine learning algorithms, where data mining of large-scale high-fidelity simulations is used to construct such representative stochastic models. Finally, Machine Learning is a robust tool to detect the presence of failure in macro-scale applications, even when the available data contains noise originated from experimental or simulation-based uncertainties. 
1.4 Outline of this Work

This research work has the objective of developing a robust framework for multi-scale failure analysis. The ability to consider the evolution of defect networks, and how they affect the ultimate material behavior at the continuum, operational component level, is paramount for a predictive failure analysis. Moreover, we construct a framework capable of obtaining material parameter uncertainty at the micro-scale and of simulating failure at the macro-scale while considering parametric uncertainty. To that end, we show the development of stochastic surrogate models at the nano-scale which efficiently provide measures of parameter uncertainty. We demonstrate how the anomalous effects from collective dislocation dynamics in the micro-structure can be modeled probabilistically with nonlocal models. We propose a framework for incorporating parametric uncertainty in damage models at the continuum level and identifying the sensitive parameters. Furthermore, we develop a Machine Learning framework for failure detection at the continuum level that is robust to noise, whether coming from experimental or modeling/parametric uncertainties. We illustrate the framework in Fig. 1.8. This work is divided into five chapters, which are summarized below.

Figure 1.8 Illustration of the proposed multi-scale failure analysis framework: at the nano/micro-scale, high-fidelity MD simulations/experiments feed a data-driven surrogate model of dislocation motion (stochastic modeling, uncertainty estimates); at the meso-scale, DDD simulations/experiments of the collective dynamics feed a data-driven nonlocal model of dislocation dynamics (anomalous transport, machine learning of the nonlocal kernel); at the macro-scale, uncertainty quantification, sensitivity analysis, and machine learning act on a 2-D damage phase-field model (failure detection and prediction, incorporation of parametric/model uncertainty or experimental noise). At each scale we learn the physics of the underlying processes and obtain the corresponding stochastic models. At the macro-scale, robust uncertainty quantification and machine learning algorithms detect the presence of damage and predict material failure.

Chapter 2: Understanding, modeling, and real-time simulation of the underlying stochastic micro-structure defect evolution is vital towards multi-scale coupling and propagating numerous sources of uncertainty from the atomistic scale to, eventually, aging continuum mechanics. In particular, dislocation dynamics is directly connected to macro-scale plasticity and to void and crack nucleation, and occurs across different time and length-scales. We aim to study the evolution of dislocation dynamics from the atomistic all the way to the continuum level, and to incorporate the anomalous dynamics usually disregarded in continuum, macro-scale failure models. In this Chapter, we start by studying the dislocation mobility, which is a constitutive relation between dislocation velocity and effective shear stress. The Molecular Dynamics (MD) simulations often used to estimate mobility are computationally expensive, so with the goal of accelerating the computation of dislocation mobility, and of providing estimates of mobility uncertainty, we developed a graph-based surrogate model of dislocation glide at the atomistic level. Therefore, the main contributions of this Chapter are:

• We develop a graph-based surrogate model of dislocation glide for computation of stochastic dislocation mobility.
We model an edge dislocation as a random walker, jumping between neighboring nodes of a graph following a Poisson stochastic process.

• The network representation functions as a coarse-graining of an MD simulation that provides dislocation trajectories for an empirical computation of jump rates, such as forward and backward jump statistics and waiting-time distributions, which we then use to parameterize the Poisson process on the surrogate model.

• We simulate the dislocation movement on the surrogate model by employing a Kinetic Monte Carlo method, and recover the original atomistic mobility estimates with remarkable computational speed-up and accuracy.

• Furthermore, the underlying stochastic process provides the statistics of dislocation mobility associated with the original molecular dynamics simulation, allowing an efficient propagation of material parameters and uncertainties across the scales and establishing a meaningful link for predictive multi-scale failure modeling.

Chapter 3: The collective motion of dislocations at the meso-scale leads to interesting phenomena observed experimentally, such as intermittent energy signals, power-law distribution of the energy spectrum, and crackling noise, all of which are indicators that avalanches and scale-free dynamics occur during the failure process [11–15]. Experimental evidence suggests that at the meso-scale, dislocation dynamics has a nonlocal character due to the collective dynamics. In this Chapter, we investigate the nonlocal dynamics of collective dislocation motion from a probabilistic perspective, in which the evolution of the probability density function of dislocation position is governed by a nonlocal transport model. Specifically, the main contributions of this Chapter are:

• We simulate a 2-dimensional discrete dislocation dynamics (DDD) problem under periodic boundary conditions, where the dislocations move in a single glide plane.

• The collective dynamics of the dislocation population give rise to Lagrangian trajectories that encode the underlying stochastic process directly. We collect all the trajectories of dislocation motion to generate the PDFs of dislocation position.

• We use a Machine Learning framework to solve the inverse problem of parameterizing the nonlocal operator that governs the PDF evolution. We employ a bi-level learning framework that takes the time-series evolution of PDFs and learns the power-law exponent α and horizon size δ.

• We identify the shapes of the nonlocal kernel and observe that the presence of dislocation multiplication induces kernels associated with super-diffusive processes.

Chapter 4: Failure analysis often relies on uncertain data, whether coming from experiments, where measurements contain significant levels of noise, or from the mathematical models themselves, in the form of uncertain mathematical operators or parameter definitions. Therefore, in order to develop a robust and predictive failure analysis framework, we need a systematic way of analyzing the model uncertainties. In particular, uncertainty quantification also points to the operators that could be improved with the goal of mitigating uncertainty. In this Chapter, we motivate the necessity of different modeling approaches that include the anomalous effects that occur in the material during failure. We consider a stochastic phase-field model at the continuum level, simulating failure through the introduction of damage and fatigue variables.
The damage phase-field is introduced as a continuous dynamical variable representing the volumetric portion of fractured material, and fatigue is treated as a continuous internal field variable to model the effects of micro-cracks arising from energy accumulation. We formulate a computational-mathematical framework for quantifying the corresponding model uncertainties and sensitivities in order to unfold and mitigate the salient sources of unpredictability in the model, hence leading to possible new modeling paradigms. The contributions in this Chapter are:

• We consider 5 parameters related to damage and fatigue to be random variables with a specific range. We solve the equations with the Finite Element Method in space and a semi-implicit method in time. In the stochastic space, we compared the solution statistics of the Monte Carlo Method (MC) and the Probabilistic Collocation Method (PCM). We chose PCM since it discretizes the stochastic space using a Lagrange interpolation of the solution, giving exact integrations in the stochastic space with just a few realizations, and in our results we found that PCM indeed has faster convergence than MC.

• We used PCM as a building block for the uncertainty and sensitivity analyses of the damage model. We performed a Local Stochastic Sensitivity Analysis for the 5 parameters by taking the expectation of the local sensitivity over the parametric domain range, where the local sensitivity at each point was computed by Complex-Step Differentiation, which is faster and more accurate than Finite Differences. Then, we performed a Global Sensitivity Analysis to quantify the contribution of each parameter to the total variance of the solution. An Analysis of Variance (ANOVA) based on Sobol indices was performed using PCM, where we post-process the realizations in the tensor product of collocation points, which is fast and accurate, contrary to MC.

• In the end, we analyzed two representative problems, a single-edge notched tensile test and an I-shaped specimen. For the notched case, since we have a pre-existing crack with a known propagation direction due to the loading conditions, we observed that the parameters that contribute most to the solution uncertainty are related to the damage evolution rate. For the I-shaped specimen, we found that the parameters with the largest contribution were related to the damage free-energy potential from which the classical Laplacian operator originates. The results point to the most sensitive parameters, indicating where we can modify the model to mitigate uncertainty. Our results suggest a revision of the classical free-energy, with the inclusion of neglected nonlocal terms that upscale the anomalous effects disregarded in current models.

Chapter 5: Failure in brittle materials, led by the evolution of micro- to macro-cracks under repetitive or increasing loads, is often catastrophic, with no significant plasticity to avert the onset of fracture. Early detection of failure and of its location are critically important features in any practical application, both of which can be effectively addressed using artificial intelligence. In this Chapter, we developed a Machine Learning framework for failure prediction of brittle materials. We combine a classification algorithm with a pattern recognition scheme, where we select virtual nodes from Finite Element results of the phase-field damage model to construct a degradation function and generate patterns of material softening.
From each sensing node we extract a time-series, such that we obtain a pattern at each time-step. We investigated the spatial location (by analyzing different patterns) and the time of occurrence (by defining criteria for the onset of failure and the failure state) of the fracture, using k-Nearest Neighbors (k-NN) and Artificial Neural Networks (ANN) algorithms to analyze and classify the patterns. The major contributions of the Chapter are:

• Time-series data of the phase-field model is extracted from virtual sensing nodes at different locations of the geometry.

• A pattern recognition scheme is introduced to represent time-series data/sensor node responses as a pattern with a corresponding label, integrated with ML algorithms (k-NN and ANN) used for damage classification with the identified patterns.

• We perform an uncertainty analysis by superposing random noise on the time-series data to assess the robustness of the framework with noise-polluted data.

• Results indicate that the proposed framework is capable of predicting failure with acceptable accuracy even in the presence of high noise levels. The findings demonstrate satisfactory performance of the supervised ML framework, and the applicability of artificial intelligence and ML to a practical engineering problem, i.e., data-driven failure prediction in brittle materials.

Chapter 6: We discuss the main conclusions of this work, and comment on future research steps.

CHAPTER 2
ATOMISTIC-TO-MESO MULTI-SCALE DATA-DRIVEN GRAPH SURROGATE MODELING OF DISLOCATION GLIDE

2.1 Introduction

Multi-scale materials modeling and simulation is a rapidly growing scientific field, where it is critical to propagate uncertainties to accurately and efficiently bridge material properties between adjacent length- and time-scales. Among the several types of material imperfections that cause disturbances in crystal structures, dislocations are line defects [1] that are naturally present from manufacturing until failure of crystalline materials. Describing the small-scale buildup and dynamics of dislocations can provide important insight into early fatigue precursors [69, 70], which are beyond the resolution of existing continuum models of high-cycle fatigue damage. In order to accurately propagate such early statistics of failure to the continuum for large-scale applications, consistent, robust, and efficient coupling frameworks between the atomistic and meso-scales are fundamental.

Molecular dynamics (MD) is a first-principles theory that explicitly describes the motion of individual atoms at small scales based on Newton's second law. In the context of dislocations, MD has been employed as an effective tool for the atomistic understanding of canonical types of dislocation motion for diverse crystal structures and their corresponding mobility/drag coefficients [71–75], as well as the estimation of core energies, responsible for dislocation self-interactions [23, 76]. In order to describe the complex arrangements and mechanics of dislocation networks at the intermediate scale of scanning electron microscopy [77], discrete dislocation dynamics (DDD) has become a practical computational tool [27] that allowed the discovery of new physics, such as dislocation multi-junctions [78]. Accurate DDD simulations require precise experimental properties of dislocations and the corresponding medium, which can be obtained through MD experiments.
However, the large number of degrees of freedom required for robust MD simulations may render such experiments prohibitive, especially when a large number of realizations is needed to propagate the statistical properties from small to large scales.

Aiming to simulate processes at longer time-scales, while still respecting the intrinsic physics of lower-scale dynamics, different approaches have emerged. Kinetic Monte Carlo (KMC) methods became popular in recent decades in a myriad of materials science applications. KMC is a type of continuous-time Markov process [79, 80], where the process rates should be known in advance. This method appeared originally for the simulation of vacancies [81] and Ising spin systems [82], gaining popularity in a variety of applications, including crystal growth [83], visco-elasticity [84], and surface kinetics [85]. Researchers have also used KMC methods to construct low-fidelity models for dislocation motion in materials ranging from bcc metals [86] to Silicon [87–89], where temperature, size, and stress effects are investigated. More recently, [90, 91] used KMC to study the interaction between solute atoms and screw dislocations in bcc metals. This approach has the advantage of capturing rare thermally-activated motions, which is not possible in MD simulations [92]. However, such models are limited due to uncertainties in the atomistic estimation of parameters used in the computation of rate constants, commonly obtained from activation energies derived from transition state theory [93]. Phase Field Crystal (PFC) is another fast-growing method for the simulation of crystalline structures with atomistic detail while reaching diffusive time-scales, and it has been used to model dislocation dynamics [94–97]. More recently, graph theory [98] has also presented itself as a robust approach in the field of materials science, with applications in coarse-graining [99] and in chemical kinetics in combination with the KMC method [100]. Graph theory has strong potential to provide efficient coarse-graining of micro-scale dynamics, furnishing suitable ground for stochastic simulations of the underlying dislocation dynamics through a random walk over a network. For an extensive review of random walks on networks, we refer the reader to [101] and references therein.

In this work, we propose a data-driven framework for the construction of a surrogate model of edge dislocation glide at the atomistic level, where dislocation position time-series data are collected from high-fidelity MD simulations to train the model. We first perform a coarse-graining of the atomistic domain through a graph-theoretical formulation. In the case of dislocation glide in a periodic domain, a ring graph provides an accurate representation. However, the general construction of the network and associated operators allows further enhancements for more complex dynamics in a direct way. We model dislocation motion as a random walker jumping between neighboring nodes on the network, following a continuous-time, Markovian stochastic process. The waiting times for forward or backward jumps between neighboring nodes are exponentially distributed, with rate parameters directly computed from the MD time-series data. We supply a KMC algorithm with the estimated rate constants to simulate the dislocation motion under different applied shear stresses, providing fast and accurate calculations of dislocation velocity and mobility.
Ultimately, beyond the efficient estimates of material properties at the atomistic-level, the proposed framework allows the propagation of uncertainties across the scales. With the stochastic description of dislocation motion through a random walk over a network, governed by a Markov jump process, we can compute statistics associated to the dislocation motion that are intrinsically attached to the original atomistic setup. In that sense, we provide a mobility experiment of similar nature to the associated MD simulation, with the advantage of quantifying parametric uncertainty, which would be prohibitively costly through high-fidelity MD simulations. This approach differs from existing KMC dislocation models that simulate dislocations at the meso- scale [87], and expands over current atomistic coarse-graining methods for dislocations [102], which do not estimate uncertainties associated with the high-fidelity simulations. To the best of our knowledge, this is the first computational effort in providing uncertainty estimates of dislocation mobility properties from the atomistic level. Mobility estimates and associated uncertainties provided by the surrogate model can later be upscaled to meso-scale dislocation simulations, such as DDD. At that stage, the collective behavior of dislocations would intrinsically incorporate stochastic effects of lower scales that would be propagated to the continuum (i.e., through dislocation density and plastic strains), therefore providing efficient multi-scale coupling starting in the MD domain. This feature is essential to the development of predictive models at the component level, whether the interest is on visco-elasto-plasticity [46, 47, 103], fracture [104, 105] or fatigue [34]. 25 2.2 Data-Driven Framework We develop a surrogate model for dislocation glide parameterized by MD data to quickly obtain estimates of dislocation mobility in a short time-frame. The numerical framework for model construction and simulation is illustrated in Fig. 2.1. To construct the surrogate, the atomistic domain is coarse-grained and idealized as a periodic line graph (a ring graph), where nodes correspond to the sub-domains inside the crystal. From the coarse-grained description, we represent the dislocation as a random walker that jumps between neighboring nodes following a Poisson stochastic process. The rate constants that parameterize the process are obtained directly from MD simulation data of an edge dislocation gliding under shear stress, allowing the reconstruction and simulation of the stochastic dislocation motion through KMC method. KMC and MD are independent techniques for dislocation motion, yet here we combine both, leading to a fast computation of dislocation mobility using KMC, in which the parameters come from high-fidelity, costly, MD simulations. We start by discussing the methodology of dislocation simulation through MD. Then, we describe the coarse-graining of the physical domain as a ring graph, and construct the dislocation random walker based on Poisson processes. Computing the rate constants from MD simulations, we ensure that sequences of States coming from KMC converge in distribution with MD trajectories [79], yet using far less computation time, allowing for longer simulation times that are not achievable in MD. 
Figure 2.1 Framework for constructing a network-based KMC surrogate model for dislocation glide. The surrogate is then employed for fast and accurate simulations of dislocation motion, obtaining velocity data at different stress levels, leading to the estimation of the dislocation mobility.

2.2.1 Molecular Dynamics Simulation of Edge Dislocation Glide

Following body-centered-cubic Fe-C simulations from [23], we generate synthetic dislocation motion data in a pure Fe system and estimate the edge mobility property through MD simulations utilizing the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) [106]. All the MD simulations in this work are run on 80 Intel Xeon Gold 6148 CPUs at 2.40 GHz. The MD system under consideration is illustrated in Fig. 2.2, consisting of a simulation box of 61 × 40 × 20 α-Fe unit cells with dimensions 25.14 × 26.96 × 24.06 [nm] in the x, y, z directions. A straight edge dislocation with Burgers' vector b = 1/2 [1, 1, 1] is generated by removing a (1, 1, 1) half-plane of atoms from the center of the box. The MD domain consists of 1 353 132 atoms, with periodic boundary conditions applied in the x and z directions, and shrink-wrapped boundary conditions applied to the unit cells in the top- and bottom-planes along the y-direction. We perform an NVE time-integration, where the system's temperature is relaxed to T = 750 [K] through velocity-rescaling for 100 [ps] (see Fig. 2.3a). We utilize a combined Tersoff bond-order and repulsive Ziegler-Biersack-Littmark (ZBL) interatomic potential, with corresponding parameters from [107]. We apply shear stress values in the range τ ∈ [15, 100] MPa to the top layer in Fig. 2.2a, parallel to b, which induces a glide motion in the x-direction on the (1, 1, 0) plane. No temperature control is enforced in this stage and we run the simulation over 1 [ns] with time-step size Δt_MD = 2 [fs].

The MD time-series data is saved every 100 time-steps and the atom positions are post-processed utilizing the Polyhedral Template Matching (PTM) method [108] implemented in OVITO (https://www.ovito.org/) [109], which allows us to detect and track the lattice disturbance. We define the dislocation position as the average of all x-coordinates of atoms belonging to the disturbed region (dislocation core) in Fig. 2.2a.

Figure 2.2 MD domain of the dislocation mobility test. (a) x–y plane, illustrating the edge dislocation core as the lattice perturbation at the center. (b) 3D view of the MD domain with the BCC lattice removed, showing the dislocation line along the z-axis.

Therefore, for every applied shear stress τ, we obtain a position vector x_MD(t) with 5000 data-points (see Fig. 2.3b) of size Δt_MD = 2 [fs], from which we compute the corresponding velocity v_MD through a linear fit. The velocity obtained from the post-processed MD simulation can be related to the one-dimensional solution from dislocation dynamics, denoted by v_x, given by the following relationship:

v_x = M · b · τ,   (2.1)

where M denotes the edge dislocation mobility, and b = \sqrt{3}\, a / 2 represents the magnitude of b. Equation (2.1) is obtained from the balance between the applied Peach-Koehler force induced by the shear stress τ and the dislocation drag force.
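A rough sketch of this post-processing step is shown below: the glide velocity comes from a linear fit of the position series, and Eq. (2.1) then gives a single-stress mobility estimate. The file name, column layout, and lattice parameter are assumptions for illustration only.

```python
import numpy as np

# Assumed two-column file: time [ps] and dislocation position [nm] from the PTM post-processing
t, x = np.loadtxt("dislocation_position.txt", usecols=(0, 1), unpack=True)

# Velocity v_MD from the slope of a linear fit of position versus time
slope, _ = np.polyfit(t, x, deg=1)
v_md = slope * 1.0e3                 # nm/ps -> m/s

# Eq. (2.1): v_x = M * b * tau, so a single-stress mobility estimate reads
a0 = 0.2866e-9                       # assumed alpha-Fe lattice parameter [m]
b = np.sqrt(3.0) * a0 / 2.0          # Burgers vector magnitude [m]
tau = 25.0e6                         # applied shear stress [Pa]
M = v_md / (b * tau)                 # mobility [1/(Pa s)]
print(v_md, M)
```

In the actual procedure, described next, the mobility is estimated from the slope of the velocity-stress curve over the full stress sweep rather than from a single stress value.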
Therefore, setting v_MD = v_x and using the slope m = M · b of the velocity versus stress curve in Fig. 2.3c, we estimate the edge dislocation glide mobility as M = m/b ≈ 5931.3 [(Pa·s)^{-1}], which is in good quantitative agreement (1.73% difference) with the results obtained by Lehtinen et al. [23].

Figure 2.3 (a) Temperature and total energy for the equilibration step, (b) edge dislocation position x_MD(t), and (c) mobility through MD simulations for distinct values of applied shear stresses τ under T = 750 [K]. We observe an overdamped motion for the applied shear stress range and a linear mobility relationship.

2.2.2 Graph-theoretical Coarse-graining

We begin the surrogate framework by idealizing a coarse-grained version of the atomistic domain as a graph G(V, E), with a set of vertices (or nodes) V connected by edges E. In this representation, each node on the network represents a sub-domain from the original MD system. In the case of dislocation glide along a single slip plane, a one-dimensional ring graph is an adequate simplification of the dislocation movement, also ensuring the periodicity present in the MD domain. The coarse-graining is achieved by dividing the domain into n sub-domains, or bins, such that

n = \left\lceil \frac{L}{\max(\boldsymbol{d})} \right\rceil,   (2.2)

where L is the size of the domain (in the x–y plane), and d is the vector containing the distances traveled by the dislocation between each MD time-step, with entries d_i = x^{MD}_{i+1} − x^{MD}_i. We choose this upper bound to ensure that the dislocation only travels to neighboring nodes. In this sense, we identify the dislocation as corresponding to node i of the graph if the dislocation position in the MD simulation lies between the bounds of bin i of width Δx = L/n.

We now define the standard operators for a continuous-time random walk on a network. The adjacency matrix A has elements A_{ij} = 1 if there is a link between nodes i and j, and A_{ij} = 0 otherwise, for i, j = 1, 2, ..., N. The degree matrix K represents the number of edges attached to each node, computed as K_{ii} = \sum_{j=1}^{N} A_{ij}, and K_{ij} = 0 for i ≠ j. From A and K we define the transition matrix W, with elements w_{i \to j} = A_{ij} / K_{ii}, representing the probability of the random walker transitioning from node i to node j.

Specifically for the ring graph considered for the surrogate model, every node is attached to two other nodes, which makes the entries A_{i,i+1} = A_{i,i-1} = 1, except when i = 1 or i = N. In those cases, A_{1,N} and A_{N,1} are set to one to ensure periodicity. As a consequence, the degree matrix K has all diagonal entries K_{ii} = 2. The transition matrix is finally computed with elements w_{i \to i+1} = w_{i \to i-1} = 1/2. Again, the exception is for nodes i = 1 and i = N, where we obtain w_{1 \to N} = 1/2 and w_{N \to 1} = 1/2, respectively, due to periodicity. At this point we use the transition matrix W as a building block for introducing the dynamics of dislocation motion. Its purpose is to initially restrict the movement of the random walker to the neighboring nodes with equal probability, later modulated by empirical rates computed from MD simulations.

2.2.3 Construction of the Random Walk

The construction of the random walk representative of dislocation glide depends on two main aspects: first, on the simplification of dislocation motion and its coarse-graining through a graph-theoretical framework, as discussed before; second, on the statistical representation of dislocation mobility through a Poisson process that naturally leads to the use of the KMC method.
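The first ingredient, the ring-graph coarse-graining of Section 2.2.2, can be sketched with NetworkX as follows; the file name, domain size, and the assumption that positions are measured from the left edge of the box are placeholders for illustration.

```python
import numpy as np
import networkx as nx

# Hypothetical MD position series x_md [nm] from the PTM post-processing
x_md = np.loadtxt("dislocation_position.txt", usecols=1)
L = 25.14                                   # domain size along the glide direction [nm]

# Eq. (2.2): bin count chosen so that a single MD step never crosses more than one bin
d = np.abs(np.diff(x_md))
n = int(np.ceil(L / d.max()))
dx = L / n

# Ring graph: cycle_graph connects node n-1 back to node 0, giving the periodicity
G = nx.cycle_graph(n)
A = nx.to_numpy_array(G)                    # adjacency matrix
K = np.diag(A.sum(axis=1))                  # degree matrix (all diagonal entries 2)
W = A / A.sum(axis=1, keepdims=True)        # transition matrix, 1/2 to each neighbor

# Map each MD position to its node (bin) index, assuming positions start at the left edge
nodes = np.clip((x_md // dx).astype(int), 0, n - 1)
```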
We now discuss the formulation of the random walk, where we follow closely the ideas in [79]. The main attractiveness of KMC is the simplification of complex dynamics into a counting process, where the entire system moves from state to state. For each possible escape path from the current state, there is an associated rate constant q_{ij}, which is the probability per unit time to transition to state j from state i. For modeling the dislocation motion through a random walk, we first assume that the transition probabilities for dislocation motion are independent of history, and therefore characteristic of a Markov process. Second, for systems such as the pure Fe system studied in this work, there is no evident acceleration of the dislocation in the long run. Therefore, we assume that the underlying process is stationary with independent increments.

Let (Ω, F, P) be a complete probability space, where Ω is the space of outcomes ω, F is the σ-algebra, and P is a probability measure, P : F → [0, 1]. From the assumptions, we model the total number of jumps N_t(t) between states over time t ∈ [0, ∞) as a Poisson process with total rate Q, such that for any t, N_t(t) ∼ Poisson(Qt). For an arbitrary process with several possible states j reachable from the current state i, each with rate q_{ij}, the total rate Q is Q = \sum_j q_{ij}, following the assumption that the different processes are independent and non-overlapping. In the dislocation motion studied here, there are only two possible escape paths from any current state, a forward or a backward jump, with respective rates q_f and q_b. Therefore, we have

Q = q_f + q_b.   (2.3)

Furthermore, let X : Ω → R be a random variable that represents the waiting times between jumps over the graph G. Then, X ∼ Exponential(Q), meaning that the process is first-order with exponential decay statistics, i.e., memoryless. The probability of the random walker not performing any jump, therefore staying on the current node, is given by

p_{stay}(t) = e^{-Qt},   (2.4)

leading to the standard computation of time increments Δt in KMC algorithms,

\Delta t = -\frac{\ln(r)}{Q},   (2.5)

where r is a random number sampled from the uniform distribution U(0, 1). After each time-step with size given by Eq. (2.5), the system will evolve to a new state with probability proportional to q_f and q_b. In general, this is accomplished by recomputing the elements of W as p_{ij}, representing the probability of a jump per unit of time, in units of s^{-1}. Probabilities are obtained through

p_{ij} = \frac{w_{i \to j}\, q_j}{\sum_j w_{i \to j}\, q_j},   (2.6)

where p_{ij} is now the walker's probability, per unit time, to go from node i to node j. The result is normalized so that \sum_j p_{ij} = 1. Equivalently, we may simply take

p_{ij} = \frac{q_j}{\sum_j q_j} = \frac{q_j}{Q}   (2.7)

for q_j ∈ {q_f, q_b}.

Remark 2.2.1. Note that the increment in time and the selection of the next state are independent of each other. First, the system waits for any jump with probability related to the total jump rate Q. Then, in a separate drawing, the next state is chosen with probabilities proportional to q_f and q_b.

Remark 2.2.2. The general graph-theoretical description of the physical system allows flexibility and future incorporation of more complex cases, beyond the ring graph currently adopted for the case of dislocation glide. The inclusion of inhomogeneous Poisson processes (either in time or space), dislocation climb, or even non-Markovian network dynamics as in the case of Lévy flights [110] can be built on top of this fundamental framework in a straightforward fashion.
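Under the two-path assumption above, a single KMC step combines Eqs. (2.3), (2.5), and (2.7); the sketch below uses placeholder rates for illustration.

```python
import numpy as np

rng = np.random.default_rng()

def kmc_step(q_f, q_b):
    """One Kinetic Monte Carlo step: returns (dt, jump), jump = +1 forward or -1 backward."""
    Q = q_f + q_b                                  # Eq. (2.3): total escape rate
    dt = -np.log(rng.uniform()) / Q                # Eq. (2.5): exponential waiting time
    jump = 1 if rng.uniform() < q_f / Q else -1    # Eq. (2.7): select the next state
    return dt, jump

dt, jump = kmc_step(q_f=0.63, q_b=0.35)            # placeholder rates in 1/ps
```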
Since the graph nodes are positioned in the center of each bin, as illustrated in Fig. 2.1, we have an approximation for the distance traveled by the dislocation from the internodal distance Δx. Then, at each time-step, the dislocation spatial position is updated by

x^{n+1} = \begin{cases} x^n + \Delta x, & \text{if the dislocation jumps forward} \\ x^n - \Delta x, & \text{if the dislocation jumps backward} \end{cases}   (2.8)

where x^{n+1} is the new dislocation position at time-step t^{n+1}. In that sense, this model is still a discrete-space random walk, which calls for extra care when computing the dislocation velocity. One possibility is to mimic the procedure from MD simulations, plotting the dislocation distance as a function of time and performing a linear regression to obtain the dislocation velocity v. We run the simulation for each stress level, and plot the dislocation velocity as a function of stress. Again, a linear regression is used to obtain the slope of the curve m for a linear mobility rule as in MD, and the dislocation mobility from the network dynamics is estimated through Eq. (2.1).

Algorithm 2.1 summarizes the procedure of running a KMC simulation of dislocation glide through a random walk on a network for a total of M time-steps when we know the rates of forward q_f and backward q_b jumps. We implemented Algorithm 2.1 in Python 3.7, in addition to a routine for the computation of empirical jump rates from MD. The ring graph, corresponding matrices, and operators are constructed using the NetworkX Python package [111].

Algorithm 2.1 Kinetic Monte Carlo method for Dislocation Glide as a Random Walk on a Graph
1: Given: rates for jump forward q_f and jump backward q_b, compute the total rate through Eq. (2.3).
2: Given: number of nodes n through Eq. (2.2), and the distance between nodes Δx, compute the transition matrix W.
3: for Time-steps m = 0 → M − 1 do
4:   Given the current node position i, get the corresponding i-th line of W.
5:   Update line W_i as in Eq. (2.7).
6:   Choose the next position j based on the pdf given by W_i.
7:   Generate a random number r ∼ U(0, 1).
8:   Advance time by a time-step Δt from Eq. (2.5).
9:   Update the dislocation's spatial position by Δx using Eq. (2.8).
10: end for

2.2.4 Empirical Computation of Rate Constants

One of the major drawbacks of KMC methods is the required knowledge of the process rates as inputs to the method, which is not always a trivial task; traditional approaches involve the computation of rates through physical principles [79, 93]. In this work, we propose a data-driven approach for the computation of jump rates from dislocation position data obtained in MD simulations. In this way, the atomistic, high-fidelity simulation with observable dislocation motion parameterizes the surrogate model through the rate constants.

From the coarse-graining procedure, at each time-step we can identify and track the node associated with the dislocation position in MD. With this information, we are able to compute the waiting times between two consecutive jumps, classified into three main groups of events: forward, backward, or any jump. We also compute the total number of jump events in each of the three groups, respectively N_f, N_b, and N_t = N_f + N_b. Both groups of data can be used to estimate the rate constants. We model N_t(t) following a Poisson distribution, and given that the expectation of a Poisson random variable with parameter λ = Qt [112] is

E[N_t(t)] = Qt,   (2.9)

we may infer the rate parameter Q from empirical data by taking

Q = \frac{E[N_t(t)]}{t}.   (2.10)
The expected number of jumps E[N_t(t)] is taken here to be the number of jumps that occurred in the MD simulation during simulation time t. Equivalently, we can replace N_t(t) by N_f and N_b, to respectively compute q_f and q_b.

Alternatively, we can look at the probability that a jump happened by time t′, which is the integral of the probability of the first jump p(t), and is given by

\int_0^{t'} p(t)\, dt = 1 - p_{stay}(t').   (2.11)

It follows that p(t) can be obtained by taking p(t) = -\partial p_{stay}(t) / \partial t, so that

p(t) = Q e^{-Qt},   (2.12)

which is an exponential distribution of waiting times. Taking the first moment of Eq. (2.12) gives the average waiting time for a jump μ as

\mu = \int_0^{\infty} t\, p(t)\, dt = \frac{1}{Q}.   (2.13)

Note that again we may generalize the result from Eq. (2.13) to the average waiting time between two consecutive forward or backward jumps exclusively, μ_f and μ_b, just by isolating those events from the complete time-series of waiting times. In that case, we can also obtain q_f and q_b from waiting-time distributions.

The last method we may use to compute the rate constants is also through distributions of waiting times. Yet, this time we fit an exponential function to the histogram of waiting times using Maximum Likelihood Estimation (MLE). The MLE estimator for an exponential fit is equivalent to the reciprocal of the sample mean, i.e., 1/μ, therefore we can expect identical results when using both methods [113]. We compare the accuracy of all three methods in the following section by using user-defined true rates as the reference solution.

2.3 Results and Discussion

We now present the numerical results from the surrogate model simulations. We start by investigating the accuracy of the rate estimation algorithm, and its convergence as a function of the number of time-steps in the original data-set, using manufactured known process rates. Then, we apply the framework to real MD simulation data of dislocation glide and compute the mobility using the surrogate, comparing the results with the MD mobility computations.

2.3.1 Convergence of Rate Constant Estimation

We investigate the accuracy and convergence of the rate estimation algorithm through KMC simulation of a single random walker on a ring graph, with manufactured true rates q_true for forward and backward jumps. We test different rate combinations for the jumps, and apply Eqs. (2.10) and (2.13), and MLE, to estimate the original rates in one realization of the stochastic process. We check the convergence of the rate estimate with different numbers of time-steps, which in this case is exactly the number of total jumps N_t(t). We consider a graph with n = 20 nodes.

We show results in Tables 2.1 and 2.2 for the estimation through Eq. (2.10). The other two methods yield identical results for the manufactured solution, and are omitted. We present the estimated rates q_est, and the relative error to the true rates, computed as

error = \frac{|q_{true} - q_{est}|}{|q_{true}|}.   (2.14)

We observe that accuracy is dependent on the number of time-steps, which is natural, since more time-steps provide more data for a reliable statistical representation of the true rates. Second, the estimate is more accurate for higher rates, relative to lower ones, as in Table 2.1, where the ratio between the rates is large. For rates of similar magnitude, error levels are comparable, since there is sufficient data for both estimates.
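For concreteness, the estimators compared in this section can be sketched as follows, assuming a coarse-grained node trajectory (one node index per saved MD frame) and a placeholder frame spacing; wrap-around jumps across the periodic boundary are ignored for brevity.

```python
import numpy as np

# Hypothetical coarse-grained trajectory: node index of the dislocation at each saved frame
nodes = np.load("nodes.npy")          # assumed output of the binning step
dt_frame = 0.2                        # time between saved frames [ps], placeholder value

jumps = np.diff(nodes)                # +1 forward, -1 backward, 0 no jump
t_total = dt_frame * (len(nodes) - 1)

# Estimator 1, Eq. (2.10): event counts per unit time
q_f = np.sum(jumps == 1) / t_total
q_b = np.sum(jumps == -1) / t_total

# Estimator 2, Eq. (2.13): reciprocal of the mean waiting time between forward jumps;
# this equals the exponential MLE estimate, so the third method gives the same value
t_fwd = dt_frame * np.flatnonzero(jumps == 1)
q_f_wait = 1.0 / np.mean(np.diff(t_fwd))

print(q_f, q_b, q_f_wait)
```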
2.3.1.1 Uncertainty quantification of rate estimation

Due to the probabilistic nature of this framework, results from Tables 2.1 and 2.2 show oscillations in the error measures, which only represent the accuracy of a single realization of the problem in the stochastic space. This motivates an Uncertainty Quantification (UQ) analysis, where we use the Monte Carlo method to quantify the level of uncertainty in the rate estimation for data with different numbers of time-steps.

Table 2.1 True rates: 200 (forward) and 1 (backward), in units of s^{-1}.

Number of time-steps | Forward Rate | Error  | Backward Rate | Error
10^1                 | 191.9520     | 4.02%  | 0.0000        | 100.00%
10^2                 | 202.8340     | 1.42%  | 0.0000        | 100.00%
10^3                 | 214.0373     | 7.02%  | 1.9438        | 94.38%
10^4                 | 197.1265     | 1.44%  | 0.9309        | 6.91%
10^5                 | 199.2291     | 0.39%  | 0.9066        | 9.34%
10^6                 | 200.0162     | 0.01%  | 0.9272        | 7.28%

Table 2.2 True rates: 100 (forward) and 100 (backward), in units of s^{-1}.

Number of time-steps | Forward Rate | Error  | Backward Rate | Error
10^1                 | 51.5011      | 48.50% | 51.5011       | 48.50%
10^2                 | 124.4437     | 24.44% | 101.8176      | 1.82%
10^3                 | 108.6405     | 8.64%  | 96.3416       | 3.66%
10^4                 | 98.6444      | 1.36%  | 100.3960      | 0.40%
10^5                 | 100.3814     | 0.38%  | 99.6812       | 0.32%
10^6                 | 99.8083      | 0.19%  | 100.0381      | 0.04%

Two types of analysis were performed. First, for a fixed number of 1000 time-steps, the expectation and standard deviation were obtained for different numbers of MC realizations. Second, for a fixed number of 1000 realizations, we obtained the expectation and standard deviation for different numbers of time-steps, i.e., by considering different final simulation times from the time-series data. To show this result, we average the number of statistical events (total jumps) from each realization to construct the x-axis. For the same network of n = 20 nodes as before, and true rates of q_{f,true} = 200 s^{-1} and q_{b,true} = 1 s^{-1} as in Table 2.1, we plot the UQ results in Fig. 2.4.

From Fig. 2.4 we see that the precise computation of rates from data is almost exclusively dependent on the number of statistical events, and therefore on the length of the simulation. Increasing the number of realizations does not increase the accuracy of the recovered rates, and the uncertainty region is kept constant. However, increasing the number of time-steps by considering longer simulation times leads the expected value to converge to the true rate, and shrinks the uncertainty region.

Figure 2.4 Convergence to the true rates (y-axis) as a function of the number of realizations with fixed time-steps (a), or the number of time-steps with fixed realizations (b). Dashed lines are the true rates (200 and 1), solid lines are the expected rates, and the shaded areas are the regions of uncertainty based on the standard deviation.

2.3.2 Dislocation Mobility

Here we present numerical results for one complete cycle of the framework, from MD simulation of dislocation glide, to rate estimation and final surrogate simulation through a random walk on the constructed network.

2.3.2.1 Rate estimation

From the raw data of dislocation position and time obtained from LAMMPS and OVITO, we apply the domain decomposition into bins, equivalent to graph nodes, and track the current node over time. We count the number of jumps forward and backward between two nodes, as well as the waiting times between events. This also allows us to compute the waiting times between two forward or backward jumps. Now we show the rate estimation procedure.
First, we compile the waiting-time statistics in histograms, and plot the normalized histograms with a corresponding exponential fit in Fig. 2.5 for two values of shear stress, τ = 25 MPa and τ = 100 MPa.

Figure 2.5 Normalized histograms of waiting times between forward (a) and (d), backward (b) and (e), and any (c) and (f) jumps, along with exponential curves fitted by MLE parameter estimation, for τ = 25 MPa (top row) and τ = 100 MPa (bottom row).

Observe that the distributions of waiting times can be approximated by an exponential decay parameterized by their mean value, given the assumption made in the random walk construction. We also point out that for the lower stress (top row), the distribution of backward waiting times, Fig. 2.5 (c), is closer to the forward case, when compared to a higher stress level (bottom row), Fig. 2.5 (f), which is a direct translation of physical effects that occur at the atomic level into a statistical description of dislocation motion. Furthermore, waiting times for backward jumps at τ = 100 MPa are longer than at τ = 25 MPa, since higher stresses hinder the backward dislocation motion.

From the statistical description of waiting times, we compute the rate constants for forward, backward, and total jumps using the expectation of the number of events, Eq. (2.10), the average waiting times, Eq. (2.13), and the parameter of the exponential fit in Fig. 2.5, obtained by MLE. Again, we compare results for τ = 25 MPa and τ = 100 MPa, and construct Table 2.3.

Table 2.3 Rate estimates from MD data for different values of shear stress, using Eq. (2.10), Eq. (2.13), and the MLE fit.

τ          |        25 MPa           |        100 MPa
Method     | E[N(t)]/t | 1/μ  | MLE  | E[N(t)]/t | 1/μ  | MLE
q_f        | 0.633     | 0.634| 0.634| 0.625     | 0.626| 0.626
q_b        | 0.352     | 0.353| 0.353| 0.060     | 0.062| 0.062
Q          | 0.985     | 0.987| 0.987| 0.685     | 0.686| 0.686
q_f + q_b  | 0.985     | 0.987| 0.987| 0.685     | 0.688| 0.688
Error (%)  | 0.00      | 0.00 | 0.00 | 0.00      | 0.29 | 0.29

Table 2.3 shows the estimates of q_f, q_b, and Q directly. We also compute the quantity q_f + q_b and compare it with the total rate Q through a relative error measure. We assume that Q is the reference value since it comes directly from the data. We observe that all methods yield nearly identical results, especially for q_f, which has more available data points. For q_b, the difference is greater in the τ = 100 MPa case due to the lower number of data points. We also observe a greater error between q_f + q_b and Q for τ = 100 MPa, for the same reason. Nevertheless, the three methods are equivalent, and the differences between their results are negligible, so the choice of any particular method yields nearly identical results in the stochastic simulation. The MLE estimate and the 1/μ result are identical, as expected for the exponential fit.
The sample mean estimation from 1/μ should converge to the first case, E[N(t)]/t, as t → ∞ or as N → ∞, since the computation of μ involves the summation of waiting times, which will approach the total simulation time when t or N are large. For simplification purposes, for the remaining simulations we will use only the expectation estimate, Eq. (2.10), due to the agreement between q_f + q_b and the total rate Q obtained directly from the data points.

We also check the convergence of the estimated rates as in the example with manufactured true rates. Here, we do not know the exact rates, so we observe the trend of the forward and backward rates as we increase the number of observations. Similarly to the manufactured case, each data point in the plot is generated by considering a truncated time-series, until the final data point, which includes the whole time series. Fig. 2.6 shows the results of the rate estimation, where the x-axis again shows the number of statistical events (total number of jumps N_t(t)).

Figure 2.6 Convergence in the jump rates from MD time-series data for different stress levels: (a) τ = 25 MPa, (b) τ = 50 MPa, (c) τ = 100 MPa. We observe a more steady and monotonic trend with higher stress levels.

We observe that the higher the stress level, the smoother the curve, which is physically consistent. Higher stresses make the forward rates much larger than the backward rates, and the dislocation movement in the MD simulation flows with less noise, so the rate estimates tend towards a final value with fewer oscillations.

2.3.2.2 Surrogate results

For each value of shear stress in the surrogate simulation, we obtain the corresponding rate constants through Eq. (2.10) and simulate the random walk on a ring graph through the KMC framework, Algorithm 2.1. In the end, we are able to plot the distance traveled by the dislocation as a function of time, similar to what is done in MD, by updating the spatial position using Eq. (2.8). We plot the position-time evolution of one realization of the random walk under three different shear stresses, in comparison with the MD results, in Fig. 2.7.

From Fig. 2.7 we make some observations. First, under lower stress, MD results are intrinsically noisy, with the dislocation moving more easily under higher stresses, where the MD plot becomes smoother. Those characteristics are manifested in the rate constants as discussed in Table 2.3 and Fig. 2.6, and in the position versus time plots generated from the stochastic process in Fig. 2.7. We also verify from Fig. 2.7 that the position evolution of the random walk closely follows the same trend as in the original data set.

Figure 2.7 Position versus time of the edge dislocation: comparison between MD results from LAMMPS and one realization of the surrogate model through the random walk on a network, for (a) τ = 25 MPa, (b) τ = 50 MPa, and (c) τ = 100 MPa.

We then compute the dislocation velocity by applying a linear
regression model to the plots and computing the slope of the linear fit. We repeat this procedure for a large number of realizations, and run a UQ analysis to obtain the statistics of dislocation mobility. We use a simple MC framework to run several realizations of the surrogate simulation, and we obtain the expectation E[v] and standard deviation σ²[v] of the dislocation velocity under each value of stress. We collect velocity results under τ = 25 MPa, τ = 50 MPa, and τ = 100 MPa, and plot the histograms in Fig. 2.8.

Figure 2.8 Normalized histograms of velocity estimates for different applied shear stresses: (a) τ = 25 MPa, (b) τ = 50 MPa, (c) τ = 100 MPa. The Gaussian fit is plotted after computation of the expectation E[v] and standard deviation σ²[v] from 1000 MC realizations.

Using the estimated values of E[v] and σ²[v] we fit a Gaussian to the velocity distributions, which closely follows the histograms. The agreement between the curve and the histogram comes from the Central Limit Theorem [114], given that the total simulation time of the surrogate is a summation of exponentially distributed random variables X. We plot the results of velocity as a function of applied stress in Fig. 2.9, where we show the expected velocity value and its corresponding uncertainty, represented as error bars, for 1000 MC realizations of the surrogate model. We apply a linear regression model to the velocity-stress plot and obtain the mobility M using the linear fit slope m, as in Eq. (2.1).

Figure 2.9 Velocity versus stress plot: comparison between MD results of dislocation glide from LAMMPS and surrogate model simulations using a random walk on a network under two different system temperatures (plot annotations: LAMMPS mobility = 5937.72, surrogate mobility = 5861.2 [1/(Pa·s)]). The surrogate model accurately estimates the mobility with 1.29% relative error.

By introducing the expected velocity with its corresponding uncertainty, as in Fig. 2.9, we can propagate the uncertainty to the computation of the mobility itself. For the set of 1000 realizations shown in Fig. 2.9, we obtain the corresponding standard deviation for the mobility, σ_M = 137.27 [1/(Pa·s)]. This is an important contribution of this framework, as it allows a multi-scale propagation of uncertainties related to material properties, starting with the mobility estimate through its modeling as a Poisson process.

2.3.3 Discussion

Through the definition of a KMC algorithm for a random walk defined on a ring graph topology, where the jump rates are computed directly from time-series data of dislocation motion from an MD simulation, we successfully reproduced the stochastic motion of dislocation glide in a bcc crystal. The computational advantage of this procedure is two-fold. First, the coarse-graining lumps all the atomic domain information into the network topology, with the dislocation represented as a random walker. The atomistic degrees of freedom are condensed into the n nodes that define the graph.
Second, we are able to reach the same simulation time faster, which allows for longer time integration, due to the computation of waiting-time statistics that feed the KMC algorithm. In the end, 94 hours of one MD simulation with post-processing at a single stress level turns into an average 0.45-second surrogate simulation. If we consider the MC estimation of the mobility with 1000 runs at each stress level, the surrogate takes around 50 minutes.

One important aspect is that the physics of dislocation motion is embedded in the time-series data originating from the MD simulation. Therefore, the computation of the process rates of forward and backward jumps already takes that into account from the data itself. This is evident, for example, in Fig. 2.6, where the effect of higher stress levels applied to the atomistic structure translates into higher forward jump rates and lower backward rates. Much of the physics of dislocation motion is embedded in the jump rates, and it would be natural to expand this reasoning to other physical features beyond stress. The characterization of process rates in this broader parametric space can then be achieved with the use of state-of-the-art machine learning (ML) algorithms, with MD simulations used as training data, for a more effective and robust upscaling of dislocation properties. Furthermore, the mobility uncertainty can be propagated to higher scales to be used as an input with associated error, e.g., in DDD simulations. Later, outputs from stochastic DDD may be used to inform lumped-element models of elasto-visco-plasticity, or even phase-field models of failure. Through the use of this surrogate model, we provide a quick and efficient method for propagation of uncertainties across scales, starting from the uncertainty estimation at the atomistic level.

CHAPTER 3
DATA-DRIVEN LEARNING OF CONTINUUM NONLOCAL EQUATIONS FOR DISLOCATION DYNAMICS

3.1 Introduction

Dislocation dynamics is intrinsically connected to plasticity [22] and material failure, with dislocations being emitted from crack tips [115] and piling up, leading to fatigue crack initiation [116]. The long-range interaction of dislocation stress-fields leads to collective motion characterized by avalanches, intermittency, and power-law scaling in energy and velocity distributions [11, 117, 118]. Numerical simulations have successfully reproduced those features from discrete dislocation dynamics (DDD) models [27]. From a continuum perspective, early attempts at proposing evolution laws led to overly phenomenological models [119, 120]. Stochastic approaches have been proposed to account for uncertainties during dislocation motion [121, 122]. Lately, continuum dislocation dynamics (CDD) has emerged as another alternative for the continuous modeling of dislocation lines [123, 124], yet it is still focused on explicit modeling of dislocation-dislocation interactions. A meaningful representation of the collective dynamics of dislocations that highlights the nonlocal, stochastic, and anomalous behavior of dislocation ensembles in a fluid-limit continuous model is still missing.

The use of nonlocal vector calculus for continuous modeling of dislocation dynamics is a natural, yet novel alternative. Nonlocal models present an alternative to classic differential models where discontinuities are allowed, and long-range interactions are naturally present in an integral formulation.
These features are attractive in the solution of problems involving convection-diffusion [125, 126], heterogeneous media [127], turbulent flows [128–132], anomalous materials [47], and subsurface dispersion [133, 134]. For more applications, please refer to [135] and references therein. The peridynamic theory [40] was proposed as a nonlocal alternative to classical continuum mechanics of solids, with applicability in fracture problems with discontinuities [136, 137]. Over the last decade, a formalization of nonlocal models into a nonlocal vector calculus has been extensively discussed [37, 38], along with advances towards the unification of nonlocal/fractional models [39, 138].

With the popularity of Machine Learning (ML) methods, several disciplines have seen increasing applicability of learning algorithms to enhance the understanding of the physics, to learn parameters of a model, or to construct robust surrogates based on high-fidelity data. Data-driven approaches for dislocation dynamics have lately acquired more interest. In [139], authors used two-dimensional DDD simulations to train an algorithm for prediction of stress-strain curves. A ML approach for prediction of material properties from dislocation pile-ups was presented in [140]. Classification algorithms have also been used in the context of dislocation micro-structures [141, 142]. Data-driven surrogate modeling of dislocation glide for computation of mobility estimates with uncertainty was proposed in [143].

Other ML approaches have also grown in the context of learning the physics in the form of PDEs. We note the contributions of Physics-Informed Neural Networks (PINNs) [49], which enhance deep neural networks with physics-based constraints, and PDE discovery approaches through the use of candidate terms and operators [144–147]. The problem of learning kernels in integral operators has gained attention over the last years, with major contributions in the context of homogenization via nonlocal modeling and, more generally, in nonlocal and fractional diffusion. On one front, nPINNs [48] was introduced as the nonlocal counterpart of the PINNs framework. Here, nonlocal equations are incorporated as constraints while training a deep neural network, for both forward and inverse problems with power-law kernel and finite horizon. The extraction of more complex kernels was investigated, via an operator regression approach, in [51], allowing the possibility of sign-changing kernels by representing the kernel function through a polynomial expansion. This approach was further used in diverse applications such as peridynamics [50], constitutive laws [52], coarse-graining of molecular dynamics simulations [53], and homogenization of subsurface transport through heterogeneous media [134].

In the present work, we use two-dimensional DDD simulations to generate probability distribution functions from shifted dislocation positions obtained from numerous realizations of the DDD problem. This approach gives us directly the Lagrangian dynamics of dislocation position. We transform the particle dynamics into a continuum Probability Density Function (PDF) evolving over time through an Adaptive Kernel Density Estimation method, generating a time-series of dislocation position PDFs.
We propose a nonlocal model defined through a kernel-based integral operator for the evolution of the PDFs as the fluid-limit of the underlying stochastic process, and develop a ML framework to parameterize the nonlocal kernel, learning from the PDF snapshots generated from DDD data. We summarize our main contributions below.

• We obtain the probabilistic particle dynamics directly from DDD simulations. We highlight the effect of external loading and multiplication in the final probability distribution of dislocation position. Such differences are not evident in velocity distributions.

• We propose a general nonlocal equation to model the evolution of dislocation probability distributions in space, establishing the link between the discrete nature of dislocation dynamics at the mesoscale and the origin of its corresponding nonlocal operator at the continuum scale.

• We develop a ML framework to solve the inverse problem of recovering the parameters of the nonlocal equation from high-fidelity data. Specifically, we feed PDFs obtained from DDD simulations into the ML algorithm and obtain the parameters of the nonlocal power-law kernel, in terms of the fractional order 𝛼, horizon 𝛿, and a linear coefficient.

This work establishes, for the first time, a systematic, direct connection between the discrete anomalous dynamics of dislocations at the mesoscale and their ultimate effect in a continuum sense. With this mindset, we obtain a fast alternative to simulate dislocation dynamics through the nonlocal surrogate model, while still maintaining the underlying physics of micro-structural processes. We focus on the collective motion of dislocations as an indication of microstructural evolution and adopt the position PDFs as its measure. When we discover the governing equations for the PDFs, we have a more efficient and fast evaluation of microstructure evolution compared to high-fidelity DDD. This leads to a more efficient connection to macroscale problems such as visco-elasticity [47] and fracture [104, 105].

We also establish an efficient framework for the parameterization of nonlocal kernels, overcoming limitations of existing methods. Existing learning approaches have the disadvantage of minimization in a high-dimensional space [51] with gradient-based optimization. The present work overcomes these challenges by defining a bi-level learning algorithm. Given the linearity of our operator, we separate the learning of coefficients from the nonlocal kernel parameterization. In the first level, trial pairs of kernel parameters are used to obtain the best coefficients through a Least-Squares algorithm. The minimization occurs at the second level, where we define a minimization problem for the kernel parameters only, restricting the optimization to two dimensions and using a gradient-free search method.

3.2 Two-Dimensional Discrete Dislocation Dynamics

The simplified setup of a two-dimensional simulation, although lacking the curvature and natural multiplication mechanisms that a three-dimensional simulation provides, is still a robust and efficient way of observing the collective interactions of dislocation populations in a controlled manner. It allows the extraction of important quantities of interest, such as velocity distributions, stress and plastic strain evolution, and has been adopted in the literature to understand dislocation avalanches and power spectrum time-signals [11, 139].
Therefore, here we adopt a two-dimensional discrete dislocation approach with the goal of learning the main characteristics of collective dislocation dynamics as a first step to translate such effects into a nonlocal continuum model. Particular implementation details will be explained in each section where necessary.

We consider a two-dimensional square domain of size 𝐿, populated with straight edge dislocations with line directions along 𝑧, each of them with a Burgers vector 𝒃 = ±𝑏 along 𝑥, assuming a single-slip system. We assume there is no climb mechanism, so dislocations may only move in the 𝑥 direction. Immersed in an elastic continuum medium, dislocations create a long-range stress field, such that each dislocation is affected by the presence of all other dislocations in the crystal through an interaction stress, as well as any external stress 𝜎𝑒𝑥𝑡. Given the distance between dislocations 𝒓, the dislocation-dislocation interaction stress, 𝜎𝑖(𝒓), is given by [148]

$$\sigma_i(\boldsymbol{r}) = \frac{\mu b}{2\pi(1-\nu)} \, \frac{x(x^2 - y^2)}{(x^2 + y^2)^2}, \qquad (3.1)$$

where 𝑥 and 𝑦 represent the distances between the edge dislocations in the 𝑥 and 𝑦 directions, respectively, 𝜇 is the shear modulus, and 𝜈 is the Poisson ratio. We simulate a domain in the bulk of the material, and assume it to be sufficiently far from any free surface, therefore Periodic Boundary Conditions (PBC) are needed. In order to apply the PBC and take into account all the long-range interactions, we include the forces due to infinite images of the simulation box. The exact form of the interaction stress is [149]

$$\sigma_i(\boldsymbol{r}) = \frac{\mu b}{2(1-\nu)} \frac{1}{L} \, \frac{\sin(X)\left[\cosh(Y) - \cos(X) - Y \sinh(Y)\right]}{\left[\cosh(Y) - \cos(X)\right]^2}, \qquad (3.2)$$

where 𝑋 = 2𝜋𝑥/𝐿 and 𝑌 = 2𝜋𝑦/𝐿. Then, under the assumption that dislocation motion is overdamped under the viscous drag regime, the equation of motion for the 𝑖-th dislocation along the 𝑥 direction, from the single-slip and no-climb assumption, is

$$\frac{1}{M}\frac{dx_i}{dt} = b_i \left( \sum_{m \neq i}^{N} \sigma_i(\boldsymbol{r}_m - \boldsymbol{r}_i) + \sigma_{ext} \right), \qquad (3.3)$$

where 𝑀 is the dislocation mobility. All stress definitions refer to the shear stress component 𝜏𝑥𝑦, such that the combination of 𝜎𝑒𝑥𝑡 and 𝜎𝑖 results in the resolved shear stress acting on the dislocation. The resolved shear stress is the effective driver of motion of the edge dislocation.

We can solve this equation at first using a Forward-Euler scheme. For simplicity, we rescale the units and solve the problem with length in units of 𝑏, stress in units of 𝜎₀ = 𝜇/(2𝜋(1−𝜈)), and time in units of 𝑡₀ = 1/(𝑀𝜎₀). The plastic strain resulting from dislocation motion can be computed following Orowan’s relation

$$\gamma = \frac{1}{L^2} \sum_{i=1}^{N} b_i \, \Delta x_i. \qquad (3.4)$$

Beyond the constitutive relations that govern the dislocation glide velocity due to interactions and external stress, Eq. (3.3), two-dimensional discrete dislocation dynamics simulations also need to consider other phenomenological aspects such as annihilation and multiplication. Results from linear elasticity become invalid near the dislocation core due to nonlinearity in the stress field. Therefore, when two dislocations of opposite Burgers vector are within a distance 𝑑𝑎, they annihilate each other and are removed from the simulation.

Dislocation multiplication does not occur naturally in 2D simulations, as compared to 3D-DDD. In order to mimic the bowing of dislocation curves due to Frank-Read sources, we need to consider a phenomenological model. Here we follow the procedure in [149], where we distribute 𝑁𝑠 dislocation sources randomly in the domain; a minimal sketch of the glide update of Eqs. (3.2)–(3.4) is given below, and the nucleation criterion for these sources is described next.
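To make the glide model above concrete, the following minimal Python sketch advances all dislocations by one explicit Euler step in scaled units (𝑏 = 1, 𝑀 = 1, stress in units of 𝜎₀, time in units of 𝑡₀). It is only an illustration under these assumptions, not the in-house DDD code used for the results in this chapter; annihilation and multiplication are omitted, and the function and variable names are hypothetical.

```python
# Minimal sketch of one forward-Euler glide step in scaled units; not the in-house DDD code.
import numpy as np

def interaction_stress(dx, dy, L):
    """Shear stress (units of sigma_0) at distance (dx, dy) from an edge dislocation,
    including all periodic images of a box of size L, Eq. (3.2)."""
    X, Y = 2.0 * np.pi * dx / L, 2.0 * np.pi * dy / L
    num = np.sin(X) * (np.cosh(Y) - np.cos(X) - Y * np.sinh(Y))
    den = (np.cosh(Y) - np.cos(X)) ** 2
    return (np.pi / L) * num / den   # prefactor mu*b/(2(1-nu)L) equals pi/L for b = 1

def glide_step(x, y, b, sigma_ext, L, dt):
    """Overdamped glide, Eq. (3.3): positions x move along the slip direction only."""
    dx = x[None, :] - x[:, None]              # dx[i, j] = x_j - x_i
    dy = y[None, :] - y[:, None]
    with np.errstate(divide="ignore", invalid="ignore"):
        # sign of the source's Burgers vector enters the interaction (sketch convention)
        tau = b[None, :] * interaction_stress(dx, dy, L)
    np.fill_diagonal(tau, 0.0)                # remove the singular self-interaction term
    v = b * (tau.sum(axis=1) + sigma_ext)     # resolved shear stress drives the glide velocity
    dgamma = b @ (dt * v) / L**2              # Orowan relation, Eq. (3.4), strain increment
    return (x + dt * v) % L, v, dgamma        # periodic wrap along the glide direction

# Example: 64 dislocations of random sign under the Case 1 creep load sigma_ext = 0.0125 sigma_0.
rng = np.random.default_rng(0)
L, N = 300.0, 64
x, y, b = rng.uniform(0, L, N), rng.uniform(0, L, N), rng.choice([-1.0, 1.0], N)
gamma = 0.0
for _ in range(1000):
    x, v, dg = glide_step(x, y, b, sigma_ext=0.0125, L=L, dt=0.01)
    gamma += dg
```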
At each time-step, we check the resulting shear stress at the sources and compare it to a critical stress 𝜏𝑐. If the stress is above 𝜏𝑐 for more than 𝑡𝑛𝑢𝑐 time-steps, we generate a pair of dislocations with distance 𝐿𝑛𝑢𝑐, such that, in scaled units,

$$L_{nuc} = \frac{1}{\tau_c}. \qquad (3.5)$$

3.2.1 Representative Example: Single Crystal Under Creep

We simulate three examples of single crystals under creep loading, following the setup from [11]. We consider a domain of size 𝐿 = 300 𝑏, an initial number of dislocations of 𝑁₀ = 1500, and annihilation with a critical distance of 𝑑𝑎 = 2 𝑏. We distribute 20 dislocation sources randomly throughout the domain, with mean critical nucleation time 𝑡̄𝑛𝑢𝑐 = 10 and critical distance with mean 𝐿̄𝑛𝑢𝑐 = 50, both parameters with a variance of 10% of 𝐿̄𝑛𝑢𝑐.

We test three representative examples to understand the effect of dislocation sources and external load on the form and parameterization of the final nonlocal kernel, where the external shear load is expressed using the definition of rescaled stress units, 𝜎₀ = 𝜇/(2𝜋(1−𝜈)).

• Case 1: load of 𝜎𝑒𝑥𝑡 = 0.0125 𝜎₀ without dislocation multiplication.
• Case 2: load of 𝜎𝑒𝑥𝑡 = 0.0125 𝜎₀ with dislocation multiplication.
• Case 3: load of 𝜎𝑒𝑥𝑡 = 0.0250 𝜎₀ with dislocation multiplication.

Case 1 represents locations inside the mechanical part without the presence of imperfections, impurities, or microcracks, such that dislocations that are present inside the material do not multiply, and hence just glide until occasional annihilation. Therefore, Case 1 is representative of regions with lower internal stresses, less intense dislocation activity and plastic flow, and no evident rapid failure processes. Conversely, Cases 2 and 3 contain dislocation multiplication sources in a phenomenological way, representing regions in a component where we would normally observe higher degradation, under the presence of microcracks, voids, impurities and rough surfaces. Those characteristics are natural dislocation generators and are typically associated with failure regions. Therefore, in Cases 2 and 3 we are observing what happens near failure-inducing locations.

The DDD simulations are executed with an in-house Python code running on Intel Xeon Gold 6148 CPUs at 2.40 GHz. In all cases, we first let the system relax for 10000 time-steps of size Δ𝑡 = 1 𝑡₀ with no external stress. This procedure leads to intense activity and annihilations until the dislocations reach a meta-stable configuration with about half the number of original dislocations. Then, we apply 𝜎𝑒𝑥𝑡 with a time-step of Δ𝑡 = 0.01 𝑡₀ until a final time of 𝑇 = 30000 for Case 1, and 𝑇 = 25000 for Cases 2 and 3.

Fig. 3.1 shows one realization of an initial dislocation configuration and the relaxed metastable configuration for Case 1. The time-series plots of the collective velocity 𝑉 = ∑ᵢ 𝑣ᵢ/𝑁, the number of dislocations, and the plastic strain during the relaxation steps are shown in Fig. 3.2. The collective statistics for the representative realizations discussed in this Section can be seen in Fig. 3.3. We plot the collective velocity, number of dislocations, and accumulated plastic strain during the creep load. The collective velocity signal is intermittent for Cases 2 and 3 with multiplication, where the spikes indicate bursts of activity during an avalanche.
The higher load of Case 3 makes the baseline collective velocity higher than in Case 2, yet the spikes of Case 3 are not as large, since dislocations will tend to move faster, therefore having less time to interact close to the critical regime, as we see in Case 2. The higher baseline velocity also affects the accumulated plastic strain, which is larger for Case 3.

Figure 3.1 Dislocation distribution at the beginning of the simulation (a), and after the relaxation (b) in a metastable structure for Case 1. Red and blue markers correspond to dislocations with positive and negative Burgers, respectively.

Figure 3.2 Time-series plots of (a) collective velocity, (b) number of dislocations in the system, and (c) plastic strain 𝛾(𝑡) during the relaxation steps to show the system’s stabilization.

The number of dislocations is almost stable for Case 1, with Cases 2 and 3 showing more oscillations. This is due to the rearrangements that occur after a new dislocation pair is introduced, which eventually leads to more annihilations than Case 1.

Last, we investigate the velocity statistics from the DDD simulations. Fig. 3.4 shows the PDF of individual dislocation velocity statistics collected throughout the whole simulation time for the single DDD realization of each case discussed in this section. We find that, in accordance with [11], the velocity PDFs show a power-law decay of the form 𝑃(|𝑣|) ∝ |𝑣|⁻ᵝ with exponent around 𝛽 = 2.4 for Cases 2 and 3 with multiplication. For Case 1, we see that the decay is slightly sharper. The velocity PDF has been extensively studied both experimentally and numerically over the past years, and strongly suggests that the nature of dislocation dynamics has anomalous characteristics.

Figure 3.3 Time-series plots of collective velocity, accumulated plastic strain, and number of dislocations in the system for a single realization of the creep test for all three cases.

Figure 3.4 Probability Density Function of dislocation velocity for Cases 1, 2, and 3. We observe a power-law scaling of order 𝛽 = 2.4 for Cases 2 and 3 with multiplication, and a sharper decay for Case 1.

The power-law exponent of 𝛽 = 2.4 is directly associated with intermittent velocity signals typical of avalanches and super-diffusive behavior. However, dislocation dynamics simulations, whether discrete or continuous, are still too expensive to be run over long times with the goal of understanding how the dislocation-network anomalous behavior influences the failure processes at the macroscale.
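For reference, the tail exponent quoted above can be estimated from the pooled speed samples with a log-log fit over logarithmically spaced bins. The helper below is a hypothetical sketch of that procedure, not part of the DDD code used for Fig. 3.4.

```python
# Minimal sketch (hypothetical helper) of estimating beta in P(|v|) ~ |v|^(-beta).
import numpy as np

def tail_exponent(speeds, vmin, nbins=30):
    v = np.abs(np.asarray(speeds, dtype=float))
    v = v[v >= vmin]                                        # keep only the tail of the distribution
    edges = np.logspace(np.log10(v.min()), np.log10(v.max()), nbins + 1)
    hist, _ = np.histogram(v, bins=edges, density=True)     # empirical P(|v|)
    centers = np.sqrt(edges[:-1] * edges[1:])               # geometric bin centers
    keep = hist > 0
    slope, _ = np.polyfit(np.log10(centers[keep]), np.log10(hist[keep]), 1)
    return -slope                                           # beta ~ 2.4 for Cases 2 and 3
```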
One could look at dislocation dynamics from the perspective of particle dynamics, where the dislocations would move under an underlying stochastic process. Ideally, we would analyze the statistics and construct a stochastic process that governs such particle dynamics, as a way of generating infinitely many particle trajectories that lead to the fluid-limit dynamics of the process. However, in the case of dislocation dynamics, even though the velocity distributions give us an idea of the type of stochastic behavior due to the heavy tails with power-law decay, the process cannot be simply described. Some dislocations are stuck, others jiggle around an equilibrium state, and a few others move intermittently with large velocity in a highly correlated motion. Therefore, in order to obtain the fluid-limit dynamics, we still need to obtain statistics from a sufficient number of dislocation particles. The steps we take to this end will be discussed in the next section.

3.3 Data Generation

In this section we describe the methodology to obtain empirical PDFs directly from DDD simulations, without the construction of a stochastic process, as discussed before. On the one hand, this makes the data generation process expensive, as it relies on running multiple DDD simulations, instead of cheaper stochastic process trajectories. On the other hand, this approach benefits from directly using high-fidelity dislocation data and is the most accurate representation of the dynamics we could generate.

3.3.1 Obtaining Data of Shifted Positions

We start by defining the shifted position 𝑋𝑖(𝑡) of a dislocation 𝑖 in a single realization of the DDD simulation. Given the initial position in the absolute frame of reference of the DDD box, 𝑥𝑖(0), and the current absolute position in the simulation frame of reference, 𝑥𝑖(𝑡), the shifted position is a measure of the relative displacement of dislocation 𝑖 with respect to its initial position:

$$X_i(t) = x_i(t) - x_i(0), \quad \text{for } t \in (0, T]. \qquad (3.6)$$

We obtain a statistically significant collection of data-points 𝑋𝑖(𝑡) by considering the DDD simulation as stochastic. We run 𝑛𝑟 = 2000 DDD simulations, each with random initial positions of dislocations and multiplication sources (for Cases 2 and 3). We distribute the execution of realizations among several HPC cores to take advantage of embarrassingly parallel stochastic simulations. With the number of dislocations after the relaxation between 600 and 700, the compilation of all dislocation shifted positions across the 𝑛𝑟 realizations gives the trajectories of about 10⁶ Lagrangian particles that move following an underlying stochastic process starting at 𝑋𝑖(0) = 0.

We take the Lagrangian particle trajectories obtained directly from DDD and translate them into an evolving PDF 𝑝̂(𝑥, 𝑡), defined as the probability of finding a dislocation at a distance 𝑥 from its initial position at the start of the DDD simulation. In the end, we want to construct a model for the time-evolution of 𝑝̂(𝑥, 𝑡), and we propose that its evolution is governed by an integral operator that we model as a nonlocal Laplacian. In the following, we discuss how to transform the DDD position data into density estimates that will be used as training data for the ML algorithm.
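A minimal sketch of Eq. (3.6), pooling shifted positions across realizations, is shown below; the array layout is an assumption made only for illustration.

```python
# Minimal sketch of Eq. (3.6): pool X_i(t) = x_i(t) - x_i(0) from n_r independent realizations.
import numpy as np

def pooled_shifted_positions(trajectories):
    """trajectories: list of arrays, each of shape (n_steps, n_dislocations) holding the
    absolute x positions of one realization. Returns an array (n_steps, total_particles)."""
    return np.concatenate([traj - traj[0] for traj in trajectories], axis=1)
```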
3.3.2 Density Estimation

Under the proposed nonlocal model for the evolution of the dislocation position PDF, we are most interested in the nature of the dynamics of dislocation particles, i.e., how they react when put under creep stress along with an ever-changing stress landscape due to the addition and removal of other dislocations. Given the focus on the dynamics due to load, multiplication, and annihilation mechanisms in a broader sense, in the limit of infinitely many particles, and not attempting to model the creation and destruction themselves, we do not include birth/death terms in the nonlocal formulation. Instead, we only describe the nature of the dislocation motion from the continuum perspective, as a consequence of those mechanisms from the discrete representation at the DDD level.

In this sense, the annihilation and creation of dislocations in DDD induce a level of noise when considering a continuous PDF representation. In the creep regime, we can minimize the interference of such noise by selecting a time domain over which the initial burst of dislocation multiplications and motion, due to the rapid increase in stress up to the creep level, has reached a steady-state regime. For training and testing of the ML framework, we select the last 10000 time-steps from the DDD simulations to generate the PDFs, so as to minimize the effect of applying the creep load, and to obtain a data-set with the least changes in the number of particles. In the selected time-series, we observe only 0.1%, 0.69%, and 0.92% relative difference between the final and initial number of dislocations for Cases 1, 2, and 3, respectively. We can then assume a conservative nonlocal transport model in the continuum, yet annihilation, multiplication, and external load will directly influence the shape and parameterization of the nonlocal operator. We highlight the specific range of selected data over the whole time-series generated from DDD in Fig. 3.5.

Figure 3.5 Time-series of the total number of dislocations across the 𝑛𝑟 = 2000 realizations of DDD for Cases 1, 2, and 3. We highlight the selected data for training and testing the ML algorithm.

Before applying the estimator, we select the central 99.99% of the probability mass, therefore discarding the outermost particles at each time-step. This procedure clearly defines a compact support for the PDF. We quickly summarize the classical and Adaptive Kernel Density Estimation formulations below.

3.3.2.1 Kernel Density Estimation (KDE)

KDE is a non-parametric estimator that does not require assumptions on the form of the sampling distribution. Yet, to use KDE, we need to apply a specific kernel 𝑘(𝑥 − 𝑥𝑖; 𝑤₀) parameterized by a bandwidth 𝑤₀. At this stage, we obtain an initial density estimate 𝑝̂₀(𝑥) from

$$\hat{p}_0(x) = \frac{1}{n} \sum_{i=1}^{n} k(x - x_i; w_0), \qquad (3.7)$$

where 𝑥 is the coordinate over which we wish to evaluate the PDF, and 𝑥𝑖 are the positions of data points 𝑖 = 1, 2, . . . , 𝑛. Kernels are normalized to unity, i.e.,

$$\int_{-\infty}^{\infty} k(x; w_0)\, dx = 1, \qquad (3.8)$$

and take the form

$$k(x - x_i; w_0) = \frac{1}{w_0} K\!\left(\frac{x - x_i}{w_0}\right). \qquad (3.9)$$

Then, the final density estimator is

$$\hat{p}_0(x) = \frac{1}{n w_0} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{w_0}\right). \qquad (3.10)$$

Let 𝑠 be the sample standard deviation. In the limit of the number of data-points 𝑛 → ∞ of normally distributed data, the Mean Integrated Square Error of 𝑝̂₀ is minimized when [150]

$$w_0 = 1.06\, s\, n^{-1/5}. \qquad (3.11)$$
3.3.2.2 Adaptive Kernel Density Estimation (AKDE)

Given the large jumps observed in DDD, we can expect the PDFs to have heavy tails, yet with limited data, while the majority of the mass falls into the central part. A uniform binning method such as KDE would lead to the occurrence of noise in the tails, which we need to avoid as those curves feed the ML algorithm. Since higher density regions need narrower bins than low density tails, we use AKDE to obtain a continuous, smooth function. We run the AKDE starting from an initial classical KDE estimate 𝑝̂₀(𝑥) with a fixed bandwidth 𝑤₀ from Eq. (3.11). Then, the adaptivity takes place in the variable bandwidth for each data point [150]:

$$w_i = w_0 \lambda_i = w_0 \left( \frac{\hat{p}_0(X_i)}{G} \right)^{-\xi}, \qquad (3.12)$$

where 𝐺 is defined as

$$G = \exp\!\left( \frac{1}{n} \sum_{i=1}^{n} \ln \hat{p}_0(X_i) \right). \qquad (3.13)$$

Furthermore, 0 ≤ 𝜉 ≤ 1 is a sensitivity parameter that controls how important the shape of the initial guess is with respect to the second estimation [151]. A theoretical optimal value was found to be 𝜉 = 0.5 [152]. We finally obtain the density from AKDE as

$$\hat{p}_1(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{w_i} K\!\left(\frac{x - x_i}{w_i}\right). \qquad (3.14)$$

We define a domain Ω for the density function by adding a band of zeros beyond the right- and left-most points in the 𝑥 direction, providing some space between the compact support of the PDF and the nonlocal simulation domain. We compute the estimates 𝑝̂₁ at equally spaced points inside Ω, defining a fixed grid size of ℎ = 0.05. In Case 1, we have Ω₁ = [−15, 15] and 𝑚 = 601 points. Cases 2 and 3 are defined with Ω₂ = Ω₃ = [−40, 40], leading to 𝑚 = 1601 points. Computation of 𝑝̂₁(𝑥) for each time-step is also executed in parallel, each core processing a distinct time-step, for a total of 1000 HPC cores used for each case.

Finally, once the final estimates 𝑝̂₁(𝑥) are obtained for all time-steps, in the last operation we enforce symmetry, so as to avoid any inconsistencies in the ML algorithm, as we adopt a symmetric, radial-basis nonlocal kernel in the definition of our operator. We apply the symmetrization by checking the evolution of the mean 𝜇 and skewness factor 𝜇̃₃ of 𝑝̂₁(𝑥), defined as

$$\mu = \mathrm{E}[x] = \int \hat{p}_1(x)\, x\, dx, \qquad (3.15)$$

$$\tilde{\mu}_3 = \mathrm{E}\!\left[ \left( \frac{x - \mu}{\sigma} \right)^3 \right] = \frac{\int (x - \mu)^3\, \hat{p}_1(x)\, dx}{\left( \int (x - \mu)^2\, \hat{p}_1(x)\, dx \right)^{3/2}}. \qquad (3.16)$$

We initially verify that they are sufficiently close to zero, assuming any non-zero measure of those parameters is due to the lack of sufficient data. Then, we take the left side of 𝑝̂₁(𝑥), mirror it to the other side, and re-normalize the PDF. We verify that after this procedure we guarantee a symmetric PDF that will better fit the radial-basis nonlocal kernel. Fig. 3.6 shows the values of 𝜇(𝑡) and 𝜇̃₃(𝑡) before and after the symmetrization.

Fig. 3.7 shows snapshots of the final PDF estimate for all three cases at selected time-steps. We plot the PDFs in logarithmic scale in the 𝑦 direction: we can readily see that the PDFs do not decay as fast as one would expect from a Gaussian process. Instead, we have power-law decaying tails, with seemingly heavier ends for Cases 2 and 3, where multiplication mechanisms activate the collective dynamics more intensively. The heaviness of the tails is accompanied by a larger support of the PDF. As we will see in the following sections, when we feed the PDFs into the learning algorithm, the kernel of the nonlocal operator will reflect those differences, establishing a meaningful link between the discrete and continuous dynamics.
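The estimator of Eqs. (3.7)–(3.14) can be summarized in a few lines. The sketch below assumes a Gaussian kernel 𝐾 and 𝜉 = 0.5; it is only illustrative, since for the roughly 10⁶ samples per time-step used here the pairwise evaluations would be chunked across HPC cores, and the symmetrization step is omitted.

```python
# Minimal sketch of the adaptive KDE, Eqs. (3.7)-(3.14), with a Gaussian kernel and xi = 0.5.
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def akde(samples, grid, xi=0.5):
    x = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = x.size
    w0 = 1.06 * x.std(ddof=1) * n ** (-1.0 / 5.0)                       # pilot bandwidth, Eq. (3.11)
    # pilot (classical) KDE evaluated at the data points, Eq. (3.10)
    p0 = gaussian_kernel((x[:, None] - x[None, :]) / w0).mean(axis=1) / w0
    G = np.exp(np.mean(np.log(p0)))                                     # geometric mean, Eq. (3.13)
    w = w0 * (p0 / G) ** (-xi)                                          # adaptive bandwidths, Eq. (3.12)
    u = (grid[:, None] - x[None, :]) / w[None, :]                       # final estimate, Eq. (3.14)
    return (gaussian_kernel(u) / w[None, :]).mean(axis=1)

# e.g. p_hat = akde(shifted_positions_at_t, np.arange(-15.0, 15.0 + 0.05, 0.05))
```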
Figure 3.6 Evolution of the mean and skewness factor of 𝑝̂₁(𝑥). The evolution of the mean (a) before, and (b) after the symmetrization and re-normalization. The evolution of the skewness factor (c) before, and (d) after symmetrization and re-normalization.

Figure 3.7 Final shape of the dislocation shifted position PDF from AKDE with symmetrization and re-normalization at selected time-steps, for (a) Case 1, (b) Case 2, and (c) Case 3.

3.4 Nonlocal Transport Models

For the evolution of the dislocation position PDFs, we propose a parabolic nonlocal transport model defined through a nonlocal operator characterized by a kernel function, which we aim to determine. We use the data from DDD simulations to train a machine-learned surrogate model, for which we identify the model parameters. We let 𝑝(𝑥, 𝑡) represent the empirical PDF estimate at time 𝑡 ∈ [0, 𝑇], and 𝑥 denote the position over the domain Ω = [−𝐿, 𝐿], where 𝐿 is defined by taking the maximum support at the last time-step and including extra zeros, as seen in Fig. 3.7. We model the evolution of 𝑝 following the nonlocal parabolic equation

$$\begin{cases} \dot{p}(x,t) = \mathcal{L} p, & x \in \Omega, \\ \mathcal{B}_{I}\, p(x) = g(x), & x \in \Omega_{I}, \end{cases} \qquad (3.17)$$

where $\mathcal{L}$ denotes the nonlocal (linear) Laplacian operator defined as

$$\mathcal{L} p = \int_{B_\delta(x)} K(|y - x|)\, \big(p(y) - p(x)\big)\, dy. \qquad (3.18)$$

𝐵𝛿(𝑥) represents the ball centered at 𝑥 of radius 𝛿, also called the horizon, defining the compact support of $\mathcal{L}$. It is relevant to note that for specific choices of kernel functions, $\mathcal{L}$ corresponds to well-known operators such as the fractional Laplacian [39, 138]. In fact, when 𝐾(|𝑦−𝑥|) ∝ |𝑦−𝑥|⁻ᵅ, with 𝛼 = 1 + 2𝑠, 𝑠 ∈ (0, 1), the operator $\mathcal{L}$ corresponds to the one-dimensional fractional Laplacian. Furthermore, when the same kernel is restricted to the compact support 𝐵𝛿(𝑥), $\mathcal{L}$ corresponds to the so-called truncated fractional Laplacian. The latter turns out to be the operator of choice in our framework.

The interaction domain where nonlocal boundary conditions (or volume constraints) are prescribed is defined as

$$\Omega_{I} = \{\, y \in \mathbb{R} \setminus \Omega \ \text{such that} \ |y - x| < \delta \ \text{for some} \ x \in \Omega \,\}. \qquad (3.19)$$

We prescribe nonlocal homogeneous Dirichlet volume constraints, given (in one dimension) by the nonlocal interaction operator $\mathcal{B}_{I} : [-L-\delta, -L) \cup (L, L+\delta] \to \mathbb{R}$, such that 𝑔(𝑥) = 0 for 𝑥 ∈ ΩI. The objective of the proposed ML algorithm is to train the nonlocal model on the basis of the series of PDFs; specifically, we find the best form and parameters of the kernel 𝐾(|𝑦 − 𝑥|) such that Eqs. (3.17)–(3.18) are satisfied and the predicted distributions are as close as possible to the high-fidelity dataset.
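Before describing the learning algorithm, it may help to see how the truncated power-law operator acts on gridded data. The following sketch is a minimal meshless evaluation of Eqs. (3.18) and (3.20) with 𝑃 = 1 on a uniform grid, where the homogeneous volume constraint is mimicked by treating 𝑝 as zero outside the array; the helper name is hypothetical.

```python
# Minimal sketch of the truncated power-law nonlocal Laplacian on a uniform grid.
import numpy as np

def nonlocal_laplacian(p, h, alpha, delta, D):
    """Return (L p)(x_i) for PDF values p on a grid of spacing h; p is assumed zero-padded
    so that the volume constraint g = 0 holds on the interaction domain Omega_I."""
    r = min(int(round(delta / h)), p.size - 1)      # number of neighbours inside the horizon
    Lp = np.zeros_like(p)
    for k in range(1, r + 1):                       # self-term (y = x) excluded
        w = D / (k * h) ** alpha                    # kernel K(|y - x|) = D |y - x|^(-alpha)
        right = np.concatenate([p[k:], np.zeros(k)])    # p(x + k h), zero outside the domain
        left = np.concatenate([np.zeros(k), p[:-k]])    # p(x - k h)
        Lp += w * h * (right - p) + w * h * (left - p)  # Riemann-sum approximation of Eq. (3.18)
    return Lp
```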
3.5 Machine Learning of Nonlocal Kernels for Dislocation Dynamics

3.5.1 A Bi-level Machine Learning Framework

We start the approximation by assuming that the kernel 𝐾(|𝑦 − 𝑥|) is a radial function compactly supported on 𝐵𝛿(𝑥), decaying as an 𝛼-th order power-law, multiplied by a function 𝑃(|𝑦 − 𝑥|) defined over [0, 𝛿]:

$$K(|y - x|) = D\, \frac{P(|y - x|)}{|y - x|^{\alpha}}, \qquad (3.20)$$

where we assume the coefficient 𝐷 ∈ ℝ, 𝐷 > 0. The form of the function 𝑃 is part of the learning problem and strongly depends on the underlying physical system we want to reproduce. In the literature [51–53], the choice of a linear combination of Bernstein polynomials has been particularly successful. However, for the application considered in this work, the employment of Bernstein polynomials does not increase the surrogate’s prediction power. For these reasons, we consider the simplified case of 𝑃(|𝑦 − 𝑥|) = 1, for which the resulting operator corresponds to a truncated fractional Laplacian. Thus, the learning problem consists of finding the parameters 𝛼, 𝛿, and the coefficient 𝐷 that parameterize the kernel.

We adopt a bi-level learning approach to reduce the dimensions of the minimization problem by exploiting the linearity of the nonlocal operator. Level 1 consists of obtaining the best coefficient 𝐷 for a given pair of parameters 𝛼 and 𝛿, while at Level 2 the algorithm iterates over different values of 𝛼 and 𝛿 and minimizes a cost function, each iteration using the optimal 𝐷 found in Level 1.

For the numerical solution of the bi-level optimization problem, we rewrite the nonlocal transport model, Eq. (3.17), in a semi-discrete manner using a meshless approach, i.e.,

$$\dot{p}(x_i, t) = \sum_{j \in \mathcal{H}} K(|x_j - x_i|)\, \big(p(x_j) - p(x_i)\big)\, h, \qquad (3.21)$$

where $\mathcal{H}$ is the family of points 𝑥𝑗 in the neighborhood of point 𝑥𝑖, and ℎ is the distance between the points. By using the power-law definition of the kernel in (3.20), with 𝑃 = 1, we can write the equation as

$$\dot{p}(x_i, t) = \sum_{j \in \mathcal{H}} \frac{D}{|x_j - x_i|^{\alpha}}\, \big(p(x_j) - p(x_i)\big)\, h. \qquad (3.22)$$

3.5.1.1 Level 1

We adapt the ideas presented in [146] for the discovery of PDEs, yet, instead of identification of different PDE terms, our goal is to use the linear structure of Eq. (3.22) to obtain the coefficient 𝐷 given a specific pair of values (𝛼, 𝛿). For given values of 𝛿 and 𝛼, we construct vectors 𝑈 and 𝑈𝑡. 𝑈 contains the RHS of Eq. (3.22), where the spatio-temporal data are reshaped into a single stacked column array. 𝑈𝑡 is the LHS of Eq. (3.22), with the time-derivative computed through a forward Euler method at all space and time points, also transformed into a single column array. Given 𝑛 time-steps and 𝑚 grid points, both 𝑈 and 𝑈𝑡 have size 𝑛𝑚. Our problem 𝑈𝑡 = 𝑈𝐷 reads

$$\begin{bmatrix} \dot{p}(x_0, t_0) \\ \dot{p}(x_1, t_0) \\ \dot{p}(x_2, t_0) \\ \vdots \\ \dot{p}(x_{m-1}, t_n) \\ \dot{p}(x_m, t_n) \end{bmatrix} = D \begin{bmatrix} C(x_0, t_0) \\ C(x_1, t_0) \\ C(x_2, t_0) \\ \vdots \\ C(x_{m-1}, t_n) \\ C(x_m, t_n) \end{bmatrix}, \qquad (3.23)$$

with

$$C(x_i, t_k) = \sum_{j \in \mathcal{H}} \frac{1}{|x_j(t_k) - x_i(t_k)|^{\alpha}}\, \big( p(x_j(t_k)) - p(x_i(t_k)) \big)\, h. \qquad (3.24)$$

Then, for every pair of 𝛼, 𝛿 being considered in the minimization, we use a Least Squares solver to obtain the best 𝐷.
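A minimal sketch of Level 1 follows (function and variable names are illustrative): for a trial pair (𝛼, 𝛿), assemble the forward-difference time derivatives and the 𝐶 column of Eq. (3.24), and solve the one-unknown least-squares problem 𝑈𝑡 = 𝑈𝐷.

```python
# Minimal sketch of Level 1, Eqs. (3.23)-(3.24): least-squares fit of D for trial (alpha, delta).
import numpy as np

def level1_coefficient(p, h, dt, alpha, delta):
    """p: array of shape (n_steps, m) with PDF snapshots on a uniform grid of spacing h."""
    r = min(int(round(delta / h)), p.shape[1] - 1)
    C = np.zeros_like(p)
    for k in range(1, r + 1):                       # nonlocal sum of Eq. (3.24), without D
        w = 1.0 / (k * h) ** alpha
        right = np.pad(p[:, k:], ((0, 0), (0, k)))  # p(x + k h), zero outside the domain
        left = np.pad(p[:, :-k], ((0, 0), (k, 0)))  # p(x - k h)
        C += w * h * (right - p) + w * h * (left - p)
    U_t = (p[1:] - p[:-1]) / dt                     # forward-difference time derivative (LHS)
    U = C[:-1]                                      # RHS evaluated at the same time levels
    D, *_ = np.linalg.lstsq(U.reshape(-1, 1), U_t.reshape(-1), rcond=None)
    return float(D[0])
```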
3.5.1.2 Level 2

We use a minimization algorithm to find 𝛼 and 𝛿 that minimize the Mean Logarithmic Absolute Error (MLAE):

$$\epsilon = \frac{1}{nm} \sum_{l=1}^{nm} \left| \log(p_l) - \log(\tilde{p}_l) \right|, \qquad (3.25)$$

where 𝑝𝑙 represents the true value of the function at a particular 𝑥 and 𝑡, and 𝑝̃𝑙 is the solution of the nonlocal model at 𝑥 and 𝑡 starting from the initial conditions at 𝑡 = 0, for the current trial values 𝛼trial and 𝛿trial, and its corresponding 𝐷. We adopt the MLAE with the goal of giving as much significance to the information in the tails as we give to the central part of the PDF.

Algorithm 3.1 Bi-Level Machine Learning Algorithm with Nelder-Mead Minimization.
1: Choose the initial guess (𝛼₀, 𝛿₀).
2: for each iteration 𝑖 of NM (Level 2) do
3:     Construct the matrix equation, Eq. (3.23), and obtain the coefficient 𝐷 with trial parameters (𝛼ᵢ, 𝛿ᵢ) (Level 1).
4:     Solve the nonlocal model and obtain the trial solutions 𝑝̃ using Eq. (3.26).
5:     Using the true 𝑝 and trial solutions 𝑝̃, compute the error using Eq. (3.25).
6: end for
7: The algorithm returns the optimal parameters (𝛼opt, 𝛿opt) and the associated 𝐷opt.

For all time-steps, we obtain 𝑝̃ at time-step 𝑘 + 1 and grid-point 𝑖, 𝑝̃ᵢᵏ⁺¹, from 𝑝̃ᵢᵏ, using the forward Euler scheme. Thus, Eq. (3.22) becomes

$$\tilde{p}_i^{k+1} = \tilde{p}_i^{k} + \Delta t \sum_{j \in \mathcal{H}} \frac{D}{|x_j - x_i|^{\alpha}}\, (\tilde{p}_j^{k} - \tilde{p}_i^{k})\, h. \qquad (3.26)$$

For the solution of the minimization problem we adopt, among several possible choices, the Nelder-Mead Method (NM), which is a gradient-free, downhill simplex approach that uses a direct search method (based on function comparison). The overall bi-level algorithm for the identification of kernel parameters based on minimization by NM is presented in Algorithm 3.1.

In the solution of the inverse problem, the advantages of high-performance computing become more evident, since we solve a regression problem and simulate 10000 time-steps of a nonlocal diffusion equation at each iteration. Therefore, it is paramount that we exploit parallelism in the solution of Algorithm 3.1. We implement the learning algorithm in Python; we make use of the NumPy library for the Least Squares regression, and SciPy for the minimization, using the built-in Nelder-Mead method. We parallelize both Level 1 and Level 2 using the MPI4Py library. At Level 1, each processor computes a section of the RHS, as these are independent computations from the already available training dataset. The LHS is computed once at the beginning of the algorithm, as it is constant for all iterations. Then, in Level 2, at each time-step of Eq. (3.26), we parallelize the computation of 𝑝̃ᵢᵏ⁺¹. In the end, the parallel implementation speeds up the costly computation of nonlocal operators, and allows the algorithm to converge in less than two hours for Case 1, with 𝑚 = 601 grid points running on 200 cores, and in slightly more than two hours for Cases 2 and 3, with 𝑚 = 1601 grid points using 400 cores.

3.6 Results and Discussion

3.6.1 Method of Manufactured Solution

We assess the proposed learning algorithm and nonlocal model via the Method of Manufactured Solutions, where we produce training data from a known kernel and recover it through the ML algorithm as a necessary consistency check of the proposed ML framework. Starting from an initial condition, we solve the nonlocal diffusion equation, Eq. (3.17), with the kernel parameterized by known 𝛼true, 𝛿true, and 𝐷true, generating the snapshots of 𝑝(𝑥, 𝑡) to be provided to the ML algorithm. We simulate the nonlocal diffusion problem, Eq. (3.22),
in a domain D = [−1, 1] with 𝐿 = 2, and select 𝛼true = 1.5, 𝛿true = 𝐿/2, and 𝐷true = 0.1 as our parameters. For comparison, we simulate the nonlocal diffusion problem with a spatial discretization using 𝑚 = 101 points in space, solving the equation over 𝑛 = 200 time-steps of size Δ𝑡 = 0.01. The initial condition for the nonlocal diffusion problem is a Dirac delta function at 𝑥 = 0 with area equal to 1. Similarly to the DDD dataset, we let the system evolve and only use the last 200 time-steps of the simulation to collect the training and testing sets, using 80% of the time-steps for training, and the rest for testing. We compare the relative errors of 𝛼, 𝛿, and 𝐷 using the following expression,

$$\varepsilon_i = \frac{|\xi_{i,\mathrm{opt}} - \xi_{i,\mathrm{true}}|}{|\xi_{i,\mathrm{true}}|}, \qquad (3.27)$$

where 𝜉𝑖 represents the parameter in consideration, opt corresponds to the optimal value found by the algorithm, and true denotes the true parameter value.

We adopt a parametric space with bounds of 𝛼 = [0, 4] and 𝛿 = [Δ𝑥, 𝐿]. Given the true values of 𝛼true = 1.5 and 𝛿true = 𝐿/2, we take the initial guess to be 𝛼₀ = 2, 𝛿₀ = 𝐿/4. We present the parameter, training, and testing error results in Table 3.1. We verify that the algorithm successfully identifies the parameters within a maximum of 1% error for the horizon 𝛿, while 𝛼 and 𝐷 are within 0.1% error. This example showcases that the decoupling of the learning into two levels leads to the correct kernel. Given the higher number of training points, and the time-dependent dynamics, it is expected to have a lower testing error compared to training.

Table 3.1 Parameter and algorithm errors for the manufactured solution.
𝛼          𝛿          𝐷          Training    Testing
9.81e-4    1.02e-2    6.63e-4    9.50e-4     1.06e-4

Figure 3.8 Iteration errors (training and testing) (a), and the solution path (b) of the ML algorithm when solving the inverse problem of a manufactured solution with a known kernel.

We plot the training and test errors over the number of iterations in Fig. 3.8. We also illustrate the solution path from the initial guess to the final parameter estimates of 𝛼 and 𝛿, explicitly showing the function evaluations driving the iterations of the Nelder-Mead optimization. We further explore the robustness of the algorithm with the DDD-based dataset.

3.6.2 DDD-Driven Results

We now employ the ML framework on the dataset generated by DDD simulations, represented by the PDFs of shifted dislocation positions obtained at the last 10000 time-steps, as highlighted in Fig. 3.5. We expand the robustness assessment of the framework, and we test other critical aspects such as sensitivity to the initial guess and train/test ratios. The AKDE algorithm removes the noise from lack of data-points, especially at the tails, and produces a smooth curve throughout the domain. The intrinsic noise related to the variable number of particles is embedded in the PDF estimation, leading to smooth curves throughout the time-range of our data. For these reasons, we expect the algorithm to perform well with the DDD data.

We start by investigating the solution with different train-test splits. We compare the results of an 80/20 split, as in the manufactured solution, with a 60/40 split. We adopt an initial guess of 𝛼₀ = 2, 𝛿₀ = 𝐿/2, which resides at the center of the same parametric range used before, 𝛼 = [0, 4] and 𝛿 = [Δ𝑥, 𝐿]. A compact sketch of the optimization loop used in these experiments is given below.
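The sketch below illustrates how the Level-2 loop of Algorithm 3.1 can be driven by SciPy's built-in Nelder-Mead method: each cost evaluation repeats the Level-1 least-squares fit for 𝐷, runs the forward Euler solution of Eq. (3.26), and returns the MLAE of Eq. (3.25). It is a serial, simplified stand-in for the parallel implementation described in Section 3.5; the data layout and the values in the usage line are illustrative.

```python
# Minimal sketch of the bi-level loop (Algorithm 3.1) using SciPy's Nelder-Mead method.
import numpy as np
from scipy.optimize import minimize

def nonlocal_sum(p, h, alpha, delta):
    """Sum_j |x_j - x_i|^(-alpha) (p_j - p_i) h for every row of p (snapshots x grid)."""
    r = min(int(round(delta / h)), p.shape[1] - 1)
    out = np.zeros_like(p)
    for k in range(1, r + 1):
        w = h / (k * h) ** alpha
        out += w * (np.pad(p[:, k:], ((0, 0), (0, k))) - p)   # right neighbours
        out += w * (np.pad(p[:, :-k], ((0, 0), (k, 0))) - p)  # left neighbours
    return out

def mlae_cost(params, p_train, h, dt, eps=1e-12):
    alpha, delta = params
    C = nonlocal_sum(p_train, h, alpha, delta)
    U_t = (p_train[1:] - p_train[:-1]).reshape(-1) / dt                 # Level 1: U_t = C D
    D = float(np.linalg.lstsq(C[:-1].reshape(-1, 1), U_t, rcond=None)[0][0])
    p_model = np.empty_like(p_train)                                    # Level 2: forward Euler
    p_model[0] = p_train[0]
    for k in range(p_train.shape[0] - 1):
        p_model[k + 1] = p_model[k] + dt * D * nonlocal_sum(p_model[k:k + 1], h, alpha, delta)[0]
    return np.mean(np.abs(np.log(p_train + eps) - np.log(p_model + eps)))  # MLAE, Eq. (3.25)

# e.g. result = minimize(mlae_cost, x0=[2.0, 20.0], args=(pdf_snapshots, 0.05, 0.01),
#                        method="Nelder-Mead")
```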
We run Algorithm 3.1 for Cases 1, 2, and 3, and collect the optimal values of 𝛼, 𝛿, 𝐷, the computational cost in terms of Nelder-Mead iterations and cost function evaluations, and the training and validation cost. We present those results in Table 3.2. We highlight the results of 𝛼 in Table 3.2 in comparison to the exponent of the power-law scaling in the velocity distributions from Fig. 3.4. We note that the faster velocity decay in Case 1, with 𝛽 = 3, is consistent with the kernel exponent 𝛼 = 2.99. Similarly, for Cases 2 and 3, the velocity distribution decay was found to be around 𝛽 = 2.4, while the kernel exponent from the ML was found to be 𝛼 = 2.40 for Case 2 and 𝛼 = 2.54 for Case 3 under the 80/20 split. We will comment further on this connection in the Discussion section.

The results obtained with the two train-test splits are equivalent in terms of the kernel parameters and the overall cost. Indeed, there is no evident difference in choosing one ratio over the other. The main difference comes in the overall training and testing cost. We observe that, for the 80/20 split, the training cost is larger than the test cost in all cases, similarly to the results obtained with the manufactured solution. This is due to the time-dependent dynamics of the PDF evolution and the higher availability of training points, as in the manufactured case. However, here we have another contributing factor. As discussed in Section 3.3, earlier data-points will be heavily influenced by the initial load application, while late points will be closer to a steady-state. The test set in the 60/40 split includes more of the earlier points, and therefore sees their influence reflected in higher testing costs, besides allowing for fewer training points to make the model more general.

For the remaining results, we choose the 80/20 split as our representative case. We plot the evolution of the training and test MLAE values over the Nelder-Mead iterations in Fig. 3.9. We see that the algorithm quickly gets near the solution, as the errors drop sharply in the initial iterations. Then, the errors remain nearly constant as the minimizer further explores the search space in the proximity of the minimum.

Table 3.2 Machine Learning results for two train-test split solutions.
                    Case 1               Case 2               Case 3
Train/Test Split    80/20     60/40      80/20     60/40      80/20     60/40
𝛼                   2.99      3.00       2.40      2.37       2.54      2.51
𝛿                   20.62     18.79      33.70     33.07      34.03     33.72
𝐷                   3.64e-4   3.55e-4    7.40e-4   7.66e-4    1.78e-3   1.83e-3
# Iterations        79        75         46        79         60        85
# Evaluations       154       149        89        157        108       159
Training Cost       6.30e-2   5.73e-2    6.39e-2   5.55e-2    6.75e-2   5.57e-2
Testing Cost        4.42e-2   6.33e-2    4.95e-2   8.70e-2    4.28e-2   6.21e-2

Figure 3.9 Evolution of training and testing MLAE values computed over the NM iterations, for (a) Case 1, (b) Case 2, and (c) Case 3.

We further explore the capabilities of the proposed ML algorithm and test the performance of the parameter learning under different initial conditions beyond the central point, choosing four extra points near the corners of our parametric search space:

1. Guess 1 (original guess at the center): 𝛼₀ = 2, 𝛿₀ = 𝐿/2.
2. Guess 2: 𝛼₀ = 1, 𝛿₀ = 𝐿/4.
3. Guess 3: 𝛼₀ = 3, 𝛿₀ = 3𝐿/4.
4. Guess 4: 𝛼₀ = 3, 𝛿₀ = 𝐿/4.
5. Guess 5: 𝛼₀ = 1, 𝛿₀ = 3𝐿/4.

We present the final results in Table 3.3.
In general, an initial guess close to the center of the parametric space leads to fewer iterations for Cases 2 and 3. The only different result we obtain is for Case 1, Guess 5, where the horizon 𝛿 is computed as the upper bound 𝐿, yet with 𝛼, 𝐷, and MLAE values sufficiently close to the results of the other initial guess combinations. Based on this observation, the horizon 𝛿 seems to have a lower bound, above which the results are less sensitive to increasing horizon. We show the different paths the algorithm takes under the proposed initial guess combinations in Fig. 3.10, where we can see the function evaluations made by the algorithm and the final solutions, illustrating their proximity. We can also distinguish the upper-bound solution of Case 1, Guess 5 in the same figure.

Table 3.3 Machine Learning results for different initial guess combinations of 𝛼 and 𝛿.
Case   Quantity         Guess 1    Guess 2    Guess 3    Guess 4    Guess 5
1      𝛼                2.99       2.99       2.99       2.99       2.99
       𝛿                20.62      20.63      20.63      20.56      30.00
       𝐷                3.63e-4    3.63e-4    3.63e-4    3.63e-4    3.63e-4
       # Iterations     79         122        57         76         30
       # Evaluations    154        219        114        152        59
2      𝛼                2.40       2.40       2.40       2.40       2.40
       𝛿                33.70      33.66      33.66      33.68      33.68
       𝐷                7.40e-4    7.40e-4    7.40e-4    7.40e-4    7.40e-4
       # Iterations     46         73         86         67         129
       # Evaluations    89         142        163        132        253
3      𝛼                2.54       2.54       2.54       2.54       2.54
       𝛿                34.03      34.02      34.03      34.03      34.01
       𝐷                1.78e-3    1.78e-3    1.78e-3    1.78e-3    1.78e-3
       # Iterations     60         87         120        68         211
       # Evaluations    108        169        233        126        408

Figure 3.10 Solution path from different combinations of initial guess, for (a) Case 1, (b) Case 2, and (c) Case 3.

Figure 3.11 Final kernel shapes from the optimized parameters obtained through the ML algorithm, scaled by 𝛿.

We choose the results from Guess 1, at the center of the parametric space, to be the representative parameters for kernel reconstruction and visualization. From the optimal values of the power-law decay exponent 𝛼, horizon 𝛿, and coefficient 𝐷, we compute the nonlocal kernel following the definition from Eq. (3.20). We plot the kernel shapes for Cases 1, 2, and 3 in Fig. 3.11.

Finally, to illustrate the nonlocal model’s potential to simulate the evolution of position PDFs from DDD simulations, we run the model using Eq. (3.26) with the optimal parameter values from the Guess 1 combination, starting from the initial time-step of training until the last time-step of testing, covering the whole range of available data. We measure the accuracy of the model using the 𝑙₂ relative error at the last time-step, defined as

$$\epsilon = \frac{\| \tilde{p} - p \|_2}{\| p \|_2}, \qquad (3.28)$$

where 𝑝̃ represents the model solution at the specified time-step, and 𝑝 is the true PDF obtained from DDD at the same time. We compute the relative 𝑙₂ error and obtain 𝜖₁ = 4.75e-2, 𝜖₂ = 4.22e-2, and 𝜖₃ = 6.30e-2 for Case 1, Case 2, and Case 3, respectively. Considering that this simulation runs over the 10000 time-steps of available data, the maximum relative error of 6.3% in the 𝑙₂ sense, for Case 3, shows that the model can successfully reproduce the overall dynamics of the fluid-limit motion of dislocation particles in one dimension.
We further illustrate the final shape of the PDF from the model, and compare it with the true shape at the final time-step, in Fig. 3.12.

Figure 3.12 Simulation of the nonlocal model for the whole time-interval of available data, highlighting the initial PDF and the final distributions of the true data and the nonlocal prediction, for (a) Case 1, (b) Case 2, and (c) Case 3.

We verify that the ML algorithm successfully captured the parameters that best describe the evolution of dislocation position PDFs according to the proposed nonlocal diffusion model.

3.6.3 Discussion

Given the broad scope of this work, we divide the main discussion among three dominant facets. We discuss the overall capabilities of the proposed bi-level ML algorithm, followed by a discussion on the nonlocal model itself, and how the nonlocal kernel is connected to the particle dynamics observed in DDD.

We start by examining the ML aspects of the data-driven approach. It is evident that the proposed framework for solving the inverse problem of finding the parameters of a nonlocal Laplace operator is successful in the present scenario. Starting with the manufactured solution, we see the training error quickly approaching the plateau in Fig. 3.8. The decoupling of the nonlocal diffusion coefficient 𝐷 from the other kernel parameters 𝛼 and 𝛿 (which is possible due to the linearity of the operator) facilitates the dimensionality reduction and the implementation of the bi-level algorithm. This shows one advantage of the proposed algorithm when compared to existing optimization-based kernel learning methods [51].

The robustness is guaranteed by the data-driven learning of a DDD-based kernel. The bi-dimensional optimization algorithm converged in as few as O(10²) function evaluations and O(10) Nelder-Mead iterations in most cases. We also observed the convergence to the same parameters in
It is clear that the nonlocal model is the appropriate choice for this particular problem, evidenced by the large value of horizon 𝛿, to the order of 400-600 times the grid size. The long- range interactions from DDD are therefore represented as a nonlocal kernel with large horizon. Moreover, multiplication mechanisms correspond to larger values of 𝛿, since the avalanches and the associated collective dynamics, represented by intermittency in velocity signals, lead to heavy-tail velocity distributions, which translates into heavy tails in the corresponding PDFs. This is in contrast with Case 1 without multiplication, where we have the PDFs closer to normal distributions than the ones from Cases 2 and 3. Therefore, the anomalous behavior in the discrete case leads to nonlocality also in the continuum case. The other immediate observation related to the nonlocal model is the meaning of 𝛼, which clearly distinguishes the dynamics of dislocations with and without multiplication. For Case 1 without multiplication, we obtain a value of 𝛼 closer of 2.99, while the multiplication mechanisms of Cases 2 and 3 are translated into 𝛼 = 2.4 and 𝛼 = 2.54, respectively. As anticipated in Section 3.4, the operator L obtained with the power-law nonlocal kernel with finite horizon is equivalent 70 to the truncated fractional Laplacian of fractional order 𝑠 via the relationship 𝛼 = 1 + 2𝑠. Under such view, Case 1 would correspond to a fractional Laplacian with 𝑠 = 0.99, while Cases 2 and 3 would take 𝑠 = 0.7 and 𝑠 = 0.77, respectively. In this perspective, it is straightforward to see the evolution of dislocation PDFs as being super-diffusive, with Case 1 being the closest to a classical diffusion process, yet with a pronounced nonlocality due to rearrangements in the dislocation structure due to annihilations. However, the multiplication mechanism is the main factor that turns a rather diffusive process into super-diffusive. Moreover, the super-diffusion is intensified under a lower load, Case 2, where the external stress state allows faster relaxation and stronger subsequent avalanches compared to Case 3, where the overall higher stress state makes all dislocations move faster, yielding less relaxation time to a critical, metastable configuration. The most striking observation related to the kernel discovery is the correspondence between the kernel fractional parameter 𝛼 and the scalings of velocity distribution tail from Fig. 3.4. The empirical scaling observed here and among other works in the literature matches the values of 𝛼 found by the data-driven kernel learning algorithm. This is not surprising, as we can take the velocity distribution to be jump size distributions that define a particular Lévy measure, essential when transforming a stochastic process into differential equations through the Semigroup theory [114]. The formal and complete definition of the stochastic process that governs the dislocation trajectories is out of the scope of this paper, yet we see jump size distributions following Fig. 3.4 lead to operators defined by power-law kernels taking the form of Eq. (3.20). From a wider perspective, the procedure and reasoning developed in this paper do not need to be restricted to dislocation dynamics. The problem of learning kernels and dynamics from high-fidelity simulations and real-life data is a relevant research topic of increasing popularity. 
While the methodology presented in this work highlights the inference of a nonlocal operator from physical mechanisms, more specifically dislocation dynamics, the same procedure may be applied to tie the use of other non-standard nonlocal operators to the anomalous physical processes that they describe. 71 CHAPTER 4 AN INTEGRATED SENSITIVITY-UNCERTAINTY QUANTIFICATION FRAMEWORK FOR STOCHASTIC PHASE-FIELD MODELING OF MATERIAL DAMAGE 4.1 Introduction Companies want to deliver safe products to clients with a solid knowledge of the component’s life cycle within admissible ranges of operation. With that objective, industry spends millions of dollars every year in manufacturing, instrumenting and conducting validation tests. In order to reduce costs, manufacturers invest heavily in early stages of product design to reduce final project cost and development time. Decisions made in design phase have an impact of 70% of final cost. In that sense, numerical simulations have contributed to reduce the cost and time spent in product development through CAE applications in solid and fluid mechanics, and failure analysis play an important role in saving costs and ensuring safety. The importance of reliable failure analysis motivates the development of trustworthy mathemat- ical models and robust numerical methods. Several researchers have developed models of damage initiation, propagation and fatigue life of materials, using continuum damage models and fracture mechanics [153–155]. The ad hoc characteristics of those classical models prevent them to apply to a wider range of problems, affecting their predictability. Some of the limitations appear when dealing with crack initiation or branching, for example. Recently, phase-fields have become a solid alternative to treat those difficulties in damage and fatigue modeling. Phase-field models were first developed to solve fluid separation problems [54]. Its capability of modeling sharp interfaces through a smooth continuous field extended its application to different multiphase problems with moving boundaries, including solidification [55], tumor growth [56], two-phase complex fluid flow [57] and fluid-structure interaction [58]. Over the last decade, phase-field models were used in simulations of brittle [31, 156, 157] and ductile fracture [32, 158]. Phase-field damage models are able to capture many effects such as 72 crack initiation, propagation, branching and coalescence. Crack branching is typically observed in dynamic fracture [33, 159]. Fatigue effects were modeled using thermodynamically consistent approaches [160] and fractional derivatives [161]. Boldrini et al (2016) [34] developed a non- isothermal and thermodynamically consistent framework for damage and fatigue using phase-fields. Spatial convergence and 2D results were presented by Chiarelli et al (2017) [162]. A comparison between semi and fully implicit time integration schemes was analyzed by Haveroth et al (2018) [163]. Despite the ability to describe crack geometry and incorporate naturally fatiguing mechanisms and different constitutive laws, most examples of phase-field solutions studied so far include geometric characteristics that drive the crack path to determined places. The presence of notches, indentations, regions of stress concentration, is recurrent and leads to controlled experiments with a predictable crack path (always assuming a perfect material). 
In those models, there is no consideration of stochastic effects to account for material and manufacturing imperfection, surface roughness, or even misalignment of loading conditions that, in practice can drive failure to other locations. The development of solution methods for stochastic differential equations have been under great focus over the last decades. Traditional sampling methods such as the Monte Carlo (MC) method [59, 164] are widely used to compute solution statistics. However, MC has slow convergence rate, to the order of 𝑁 −1/2 (𝑁 is the number of realizations), requiring an abundant number of simulations, which can be prohibitive for a computationally expensive forward solver. A remedy is to build surrogate models through established methods such as Polynomial Chaos [63, 165, 166], where the stochastic model is approximated by a set of polynomial basis, or its generalization via Galerkin projections [64, 65, 167]. Such surrogate models have the disadvantage of being intrusive, in which they require mod- ifications on the governing equations and become impractical in complex systems. Conversely, non-intrusive methods use the forward solver as a black-box, with general application. Proba- bilistic Collocation Methods (PCM), or Stochastic Collocation[66, 67], sample the random space 73 efficiently, leading to better convergence than MC, while still preserving a simple solution struc- ture, where realizations can be independently sampled, with easy parallelization. The curse of dimensionality carried by the tensorial products involved in the computation of solution statistics can be overcome with the use of Sparse Grids [68], or dimensionality reduction techniques, such as active subspace methods [168–170]. Yet, only a few works focused on studying the stochastic nature of phase-field models. Un- certainty quantification and sensitivity analysis in the context of phase-field models for polymeric composites was addressed by Hamdia et al (2015) [171]. Authors used different methods to address the parametric sensitivity from a toughness test geometry with two notches. In that scenario, matrix elasticity modulus, volume fraction of clay platelets and the fracture energy of the matrix were the most influential parameters. In another application, sensitivity analysis was performed in a tumor growth phase-field model [172]. Using stochastic collocation, authors identified two most sensitive parameters: the rate of cellular mitosis and nutrient mobility. In that case, nutrient transport was governed by a traditional diffusion equation. Besides the deterministic treatment of failure process, historically, phase-field models neglected nonlocal terms in the free energy potentials. The question of how those assumptions affect the model predictability of failure motivates a systematic way to evaluate model form uncertainty, which has become an active research topic recently [173, 174].Through determining the salient sources of uncertainty, educated modifications in the modeling process can be proposed, informed by the model itself. This self-assessment procedure could be further extended to incorporate stochasticity in space, and stochastic processes. In this chapter we develop a framework to assess model form uncertainty of a damage and fatigue phase-field model through uncertainty propagation and sensitivity. 
We define parameters related to damage and fatigue, as well as viscous damping effects, as random variables and solve a stochastic system of equations that models material damage using phase-fields. We compute the expected solution and standard deviation fields in univariate and multivariate setups. We define the local sensitivity expectation, where we use complex-step differentiation [175] to compute the local sensitivity at each collocation point in the random space. A variance-based method [176] is used to compute each parameter's Sobol indices in the global sensitivity analysis. These methods establish a framework to systematically investigate model form uncertainty. An incorrect operator can be detected by looking at the parameters that multiply it, which will be more sensitive and influential in the total output uncertainty. Assuming the model form is correct transfers the salient uncertainty associated with the operator to its parameters. We identify the parameters that contribute most to the total output variance so that new models can be derived, mitigating model form uncertainty.

The chapter is organized as follows. In Section 4.2 we present a stochastic version of the damage and fatigue phase-field model derived in Boldrini et al (2016) [34]. We discuss the system of PDEs, the finite element spatial discretization, the semi-implicit scheme for time integration and the stochastic discretization methods, namely Monte Carlo sampling and Probabilistic Collocation. Then, we present the methods used for local and global sensitivity analysis in Section 4.3. In Section 4.4 we show the results of uncertainty and sensitivity for two representative numerical examples.

4.2 A Stochastic Damage and Fatigue Phase-Field Framework

We show a stochastic version of the phase-field framework for structural damage and fatigue presented in Boldrini et al (2016) [34], which consists of a deterministic system of coupled differential equations for the evolution of displacement u, velocity v = u̇, damage 𝜑 and fatigue F. Damage 𝜑 is a phase-field variable describing the volumetric fraction of degraded material, taking the value 𝜑 = 0 for virgin material and 𝜑 = 1 for fractured material, and it can change between those states, 0 ≤ 𝜑 ≤ 1, as a damaged material. The evolution equation for the damage field is of Allen-Cahn type, and is obtained along with the equations of motion for u and v through the principle of virtual power and entropy inequalities. The fatigue field F is treated as an internal variable, whose evolution equation is obtained through constitutive relations that must satisfy the entropy inequality for all admissible processes. The geometry is defined over a spatial domain Ω𝑑 ⊂ R^𝑑, 𝑑 = 1, 2, 3, at time 𝑡 ∈ (0, 𝑇].

The overall construction of the model gives origin to a general set of equations that can take different forms based on suitable choices of free-energy potentials, boundary and initial conditions. The choice of free-energy potentials directly affects the final model form, the specific coupling between the fields of interest, and the material behavior. The free-energy potential J(𝜑, F), related to damage 𝜑 and fatigue F, has the traditional form [34]

\[
\mathcal{J}(\varphi, \mathcal{F}) = g_c \frac{\gamma}{2}\,|\nabla \varphi|^2 + \mathcal{K}(\varphi, \mathcal{F}),
\tag{4.1}
\]

where 𝑔𝑐 is the Griffith energy, 𝛾 > 0 is the phase-field layer width parameter and K(𝜑, F) is a function that relates the change of damage to material fatigue.
This general form for the free-energy has been used since the first phase-field models, where the first term corresponds to the interfacial energy and originates the Laplacian operator in Allen-Cahn type equations. The second term is called the mixing energy in other disciplines, and may take different forms based on the application, even within damage models.

From the governing equations of the deterministic phase-field model we can identify the material parameters that are easily obtained from experimental procedures, namely the elasticity constants (𝐸 and 𝜈) and the density 𝜌0. For that reason, and to reduce the complexity of the analysis, we consider them to be deterministic. The remaining parameters, on the other hand, are either not physically measurable, or their value is uncertain, and they are considered to be stochastic. First, we have those parameters which are proportionality constants due to mathematical modeling, such as the rates of change of damage and fatigue, 𝑐 and 𝑎, respectively. Similarly, the phase-field layer width 𝛾 is a parameter that controls the diffusivity of the damage field, and is therefore a mathematical artifact that should be as close to zero as possible to recover the sharp interface. Furthermore, the viscous damping coefficient 𝑏 is not promptly identifiable, and it would require a correlation with the damping ratio to be obtained experimentally through modal testing. Finally, the Griffith energy 𝑔𝑐, although related to the stress intensity factor, would require further experiments for each specific material, since its range varies broadly for materials with the same elastic properties.

We define the parameters mentioned above as random variables and derive the stochastic version of the damage and fatigue phase-field model. Let (Ω𝑠, G, P) be a complete probability space, where Ω𝑠 is the space of outcomes 𝜔, G is the 𝜎-algebra and P is a probability measure, P : G → [0, 1]. We define the five-dimensional set of random parameters 𝜉(𝜔) = {𝛾(𝜔), 𝑔𝑐(𝜔), 𝑎(𝜔), 𝑏(𝜔), 𝑐(𝜔)}. The random nature of 𝜉 makes the operators and output fields also random. However, we simplify the notation and only explicitly represent the random parameters as 𝜉 = 𝜉(𝜔), suppressing the random variable indication elsewhere. Choosing appropriate free-energy potentials, we obtain the stochastic equations for a linear elastic and isotropic material without temperature evolution, defined over Ω𝑑 × (0, 𝑇] × Ω𝑠:

\[
\begin{cases}
\dot{\mathbf{u}} = \mathbf{v}, \\[4pt]
\dot{\mathbf{v}} = \operatorname{div}\!\left[(1-\varphi)^2\, \dfrac{\mathbf{C}}{\rho_0}\,\mathbf{E}\right] + \dfrac{b(\omega)}{\rho_0}\operatorname{div}(\mathbf{D}) - \dfrac{\gamma(\omega)\, g_c(\omega)}{\rho_0}\operatorname{div}(\nabla\varphi \otimes \nabla\varphi) + \mathbf{f}, \\[4pt]
\dot{\varphi} = \dfrac{\gamma(\omega)\, g_c(\omega)}{\lambda}\,\Delta\varphi + \dfrac{1}{\lambda}(1-\varphi)\,\mathbf{E}^T\mathbf{C}\mathbf{E} - \dfrac{1}{\lambda\gamma(\omega)}\left[g_c(\omega)\,\mathcal{H}'(\varphi) + \mathcal{F}\,\mathcal{H}'_f(\varphi)\right], \\[4pt]
\dot{\mathcal{F}} = -\dfrac{\hat{F}}{\gamma(\omega)}\,\mathcal{H}_f(\varphi),
\end{cases}
\tag{4.2}
\]

subjected to appropriate initial and boundary conditions, which depend on the physical problem. For the equation of motion, usually displacement or stress is known at the boundaries. For the damage evolution, ∇𝜑 · n = 0 at the boundary 𝜕Ω𝑑. The infinitesimal strain and the rate of strain tensors are represented by E = ∇𝑆u and D = ∇𝑆v, respectively. C represents the elasticity tensor, as a function of Young's modulus 𝐸 and the Poisson coefficient 𝜈. Parameter 𝑏 is the viscous damping of the material, and 𝜌0 corresponds to the material's density. Following an argument by Lemaitre and Desmorat (2005) [153], 𝜆 is constructed such that the rate of change of damage increases as damage increases, following the relation

\[
\frac{1}{\lambda} = \frac{c(\omega)}{(1 + \delta - \varphi)^{\varsigma}},
\tag{4.3}
\]

where 𝑐 and 𝜍 are positive parameters that should depend on the material.
The relation includes 𝛿, a small positive constant, in order to avoid a numerical singularity. The terms H′(𝜑) and H′𝑓(𝜑) are the derivatives of H(𝜑) and H𝑓(𝜑) with respect to 𝜑 and play an important role in the evolution of damage. Different choices for those potentials change the form of the transition of the damage phase-field as fatigue changes from zero to 𝑔𝑐. Further details about the behavior of the fatigue potentials can be found in Boldrini et al (2016) [34]. If we choose the transition to be continuous and monotonically increasing, suitable choices for the potentials are:

\[
\mathcal{H}(\varphi) =
\begin{cases}
0.5\,\varphi^2 & \text{for } 0 \le \varphi \le 1,\\
0.5 + \delta(\varphi - 1) & \text{for } \varphi > 1,\\
-\delta\varphi & \text{for } \varphi < 0,
\end{cases}
\tag{4.4}
\]

\[
\mathcal{H}_f(\varphi) =
\begin{cases}
-\varphi & \text{for } 0 \le \varphi \le 1,\\
-1 & \text{for } \varphi > 1,\\
0 & \text{for } \varphi < 0.
\end{cases}
\tag{4.5}
\]

The growth of the fatigue field F is controlled by the parameter 𝐹̂, which is related to the formation of micro-cracks that occur under cyclic loadings. Even under monotonic loadings there is growth of the fatigue variable, because a monotonic loading can be considered as one portion of a complete cyclic load. The form of 𝐹̂ depends on the absolute value of the power related to stress in the virgin material:

\[
\hat{F} = a(\omega)\,(1 - \varphi)\,\left| (\mathbf{C}\mathbf{E} + b\mathbf{D}) : \mathbf{D} \right|,
\tag{4.6}
\]

where the parameter 𝑎 in this case is chosen to give a linear dependence on the power of stress.

We construct a general framework to compute the stochastic solutions using three levels of discretization: in space, using the finite element method; in time, using a semi-implicit integration scheme; and in the random space, where we choose our realizations randomly through Monte Carlo sampling, or by Probabilistic Collocation. Fundamentally, the finite element solver acts as a black box for any nonintrusive stochastic method. Details on the spatio-temporal discretization of the phase-field model can be found in Appendix A.

4.2.1 Stochastic Discretization

Displacement, velocity, damage and fatigue from the phase-field framework solution are functions of the random parameters; therefore, they are random fields. In order to obtain the statistical moments of those outputs we solve the deterministic system of equations over an ensemble of different realizations, each of them with distinct parameter values. The inputs for each realization depend on the sampling method. In this work we employ Monte Carlo (MC) sampling and the Probabilistic Collocation Method (PCM). Since those methods are non-intrusive, the finite element solver acts as a black box, where the choice of parameter values and the computation of statistical quantities are simply pre- and post-processing tasks, respectively.

4.2.1.1 Probabilistic collocation method

The Probabilistic Collocation Method (PCM) offers a great advantage over the MC method, since PCM uses polynomial interpolation to approximate the solution in the random space. The mapping between the random and physical space is made through the probability density function of the uncertain parameters. Using interpolating polynomials, such as Lagrange polynomials, the computation of expectation and variance reduces to running the simulation at the collocation points, reducing the computational cost significantly while improving convergence rates. The mathematical expectation of the solution, E[𝑈(𝑥, 𝑦, 𝑡; 𝜉)], in a one-dimensional random space can be written as

\[
\mathbb{E}\left[U(x, y, t; \xi)\right] = \int_a^b U(x, y, t; \xi)\, \rho(\xi)\, d\xi,
\tag{4.7}
\]

where 𝜌(𝜉) is the probability density function of 𝜉.
In order to use Gauss quadrature, we must map the domain of the distribution to the interval [−1, 1] of a standard variable 𝜂. The integral is then written as

\[
\mathbb{E}\left[U(x, y, t; \xi)\right] = \int_{-1}^{1} U(x, y, t; \xi(\eta))\, \rho(\xi(\eta))\, J\, d\eta,
\tag{4.8}
\]

where 𝐽 = 𝑑𝜉/𝑑𝜂 represents the Jacobian of the transformation. We approximate the expectation by introducing a polynomial interpolation of the exact solution in the random space, 𝑈̂(𝑥, 𝑦, 𝑡; 𝜉):

\[
\mathbb{E}\left[U(x, y, t; \xi)\right] \approx \int_{-1}^{1} \hat{U}(x, y, t; \xi(\eta))\, \rho(\xi(\eta))\, J\, d\eta.
\tag{4.9}
\]

We interpolate the solution in the random space using Lagrange polynomials 𝐿𝑖(𝜉):

\[
\hat{U}(x, y, t; \xi) = \sum_{i=1}^{I} U(x, y, t; \xi_i)\, L_i(\xi),
\tag{4.10}
\]

which satisfy the Kronecker delta property at the interpolation points:

\[
L_i(\xi_j) = \delta_{ij}.
\tag{4.11}
\]

We substitute Equation (4.10) into (4.9), and approximate the integral using a quadrature rule. The expectation is then written as

\[
\mathbb{E}\left[U(x, y, t; \xi)\right] \approx \sum_{q=1}^{Q} \sum_{i=1}^{I} w_q\, \rho(\xi(\eta_q))\, J\, U(x, y, t; \xi_i)\, L_i(\xi(\eta_q)),
\tag{4.12}
\]

where we compute the coordinates 𝜂𝑞 and weights 𝑤𝑞 for each integration point 𝑞 = 1, 2, ..., 𝑄. This computation can be evaluated efficiently by choosing the collocation points to be the same as the integration points in the parametric space. Through the Kronecker property of the Lagrange polynomials (4.11), the approximation (4.12) simplifies to a single summation:

\[
\mathbb{E}\left[U(x, y, t; \xi)\right] = \sum_{q=1}^{Q} w_q\, \rho(\xi_q(\eta_q))\, J\, U(x, y, t; \xi_q(\eta_q)).
\tag{4.13}
\]

We use a linear affine mapping from the standard to the real domain, \( \xi_q(\eta_q) = a + \frac{b-a}{2}(\eta_q + 1) \). This mapping gives the Jacobian (for a one-dimensional integration) simply as 𝐽 = (𝑏 − 𝑎)/2. In practice, after we find the quadrature points in the standard domain, we use the mapping to find the respective values of the random variable in our interval. We can now approximate the integral and rewrite it as a summation over the collocation points, again assuming a uniform distribution for the parameters over the interval [𝑎, 𝑏], which gives a constant value 𝜌(𝜉) = 1/(𝑏 − 𝑎). The expectation becomes \( \mathbb{E}[U(x,y,t;\xi)] = \tfrac{1}{2}\sum_{q=1}^{Q} w_q\, U(x,y,t;\xi_q) \). Similarly to the Monte Carlo method, the standard deviation is \( \sigma[U(x,y,t;\xi)] = \big( \tfrac{1}{2}\sum_{q=1}^{Q} w_q\, U(x,y,t;\xi_q)^2 - \mathbb{E}[U(x,y,t;\xi)]^2 \big)^{1/2} \).

If we want to generalize PCM to higher dimensions, it is just a matter of adding integrals to Eq. (4.7). In discrete form, this reduces to

\[
\mathbb{E}\left[U(x, y, t; \xi^1, \dots, \xi^k)\right] = \mathbb{E}_{PCM}\left[U(x, y, t; \xi^1, \dots, \xi^k)\right]
\approx \sum_{q=1}^{Q} \cdots \sum_{l=1}^{L} w_q \cdots w_l\; \rho(\xi_q) \cdots \rho(\xi_l)\; J_q \cdots J_l\; U(x, y, t; \xi_q^1, \dots, \xi_l^k),
\tag{4.14}
\]

where we have 𝑘 summations, one for each dimension of the random space. In 𝜉 with subscript 𝑙 and superscript 𝑘, the superscript indicates the dimension in the random space, and the subscript specifies the collocation point in that dimension. Simplifying the notation as E[𝑈(𝑥, 𝑦, 𝑡; 𝜉¹, ..., 𝜉ᵏ)] = E[𝑈], the standard deviation becomes

\[
\sigma\left[U(x, y, t; \xi^1, \dots, \xi^k)\right] = \sigma_{PCM}\left[U(x, y, t; \xi^1, \dots, \xi^k)\right]
\approx \sqrt{ \sum_{q=1}^{Q} \cdots \sum_{l=1}^{L} w_q \cdots w_l\; \rho(\xi_q) \cdots \rho(\xi_l)\; J_q \cdots J_l\; U(x, y, t; \xi_q^1, \dots, \xi_l^k)^2 \;-\; \mathbb{E}[U]^2 }.
\tag{4.15}
\]

Here, we are required to perform a tensor product to obtain the set of parameters for each realization. We have the flexibility to introduce anisotropy in the sample space by using a different number of integration points in each dimension, depending on the parameter. A minimal numerical sketch of this collocation-based estimation is given below.
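To make the collocation bookkeeping concrete, the following sketch estimates E[𝑈] and 𝜎[𝑈] for a hypothetical scalar response with a uniform parameter on [𝑎, 𝑏], following Eqs. (4.8)-(4.13). The function `forward` and the interval are placeholders standing in for one run of the finite element solver and the ±10% parameter range used later; this is an illustration of the quadrature algebra, not the dissertation's solver.

```python
import numpy as np

def forward(xi):
    """Hypothetical scalar quantity of interest; stands in for one FEM run."""
    return np.sin(3.0 * xi) + 0.5 * xi**2

a, b = 0.9, 1.1                                # +/-10% interval around a nominal value
Q = 4                                          # number of collocation points

# Gauss-Legendre points/weights on the standard domain eta in [-1, 1]
eta, w = np.polynomial.legendre.leggauss(Q)

# Affine map to the physical interval, Jacobian J = (b - a)/2, uniform density
xi_q = a + 0.5 * (b - a) * (eta + 1.0)
J = 0.5 * (b - a)
rho = 1.0 / (b - a)

U_q = np.array([forward(x) for x in xi_q])     # one "simulation" per collocation point

# Eq. (4.13): E[U] = sum_q w_q * rho * J * U_q  (= 0.5 * sum_q w_q * U_q here)
EU  = np.sum(w * rho * J * U_q)
EU2 = np.sum(w * rho * J * U_q**2)
std = np.sqrt(EU2 - EU**2)
print(f"PCM estimate: E[U] ~ {EU:.6f}, std[U] ~ {std:.6f}")

# Monte Carlo check with many samples (slower convergence, roughly N^(-1/2))
xi_mc = np.random.default_rng(1).uniform(a, b, 200_000)
print(f"MC reference: E[U] ~ {forward(xi_mc).mean():.6f}, std[U] ~ {forward(xi_mc).std():.6f}")
```

The multivariate estimators of Eqs. (4.14)-(4.15) follow by nesting the same one-dimensional rule over a tensor product of collocation points, possibly with a different number of points per dimension.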
Moreover, once we have run all the multidimensional tensor-product simulations, obtaining first and total-order sensitivity indices is just a matter of post-processing. In all examples shown in this work we assume the random variables to be mutually independent. That assumption may not always hold true for a specific scenario, in which case a proper treatment of the correlation between parameters should be given.

Remark 4.2.1. For the model studied in this work, a full tensorial product as performed in PCM is sufficient. However, in more general systems with higher parametric dimensions (more than the 5 dimensions considered here), such an approach is not efficient, since the tensor product increases the number of simulations exponentially. In medium-to-high dimensions, one alternative is to use Smolyak sparse grids [68] to reduce the number of realizations, while still achieving comparable accuracy. However, even such methods become expensive in high dimensions. Then, UQ is usually combined with dimensionality reduction techniques. Traditional methods include Principal Component Analysis (PCA) [177, 178] and low-rank approximations [179]. Recently, active subspace methods have been developed to find and exploit low-dimensional structures in the random space to obtain the solution statistics and perform sensitivity analysis [168–170].

4.3 Sensitivity Analysis

The stochastic methods presented give us the ability to propagate parametric uncertainty through the model and obtain statistical information from the solution. We can take a step further and use the stochastic information to systematically study the importance of each parameter to the solution uncertainty. Using a univariate PCM framework we can compute the local sensitivity in a probabilistic way. Through multivariate PCM uncertainty propagation we can analyze the influence of each parameter on the total variance by calculating sensitivity indices. In this Section, we derive the local and global sensitivity frameworks from the stochastic discretization of the phase-field model.

4.3.1 Stochastic Sensitivity Analysis

In local sensitivity analysis, we study the effect of only one parameter while keeping all others at their expected values. The sensitivity of the output 𝑈(𝜉𝑗) with respect to the input parameter 𝜉𝑗, with 𝑗 = 1, 2, 3, 4, 5 indexing one of the parameters in 𝜉(𝜔) = {𝛾(𝜔), 𝑔𝑐(𝜔), 𝑎(𝜔), 𝑏(𝜔), 𝑐(𝜔)}, is also a random variable, so we compute its expected value as

\[
S_{U,\xi_j} = N_f\, \mathbb{E}\!\left(S_{U,\xi_j}\right),
\tag{4.16}
\]

where 𝑁𝑓 is a normalization factor defined as

\[
N_f = \frac{\bar{\xi}_j}{\bar{U}}.
\tag{4.17}
\]

In Eq. (4.17), 𝜉̄𝑗 is the expected value of the input parameter under consideration and 𝑈̄ is the volume average of the expected solution 𝑈. In Eq. (4.16) we consider the expectation of the sensitivities over the whole parameter interval, computed using PCM as

\[
\mathbb{E}\!\left(S_{U,\xi_j}\right) \approx \frac{1}{2}\sum_{q=1}^{Q} w_q\, S_{U_q,\xi_j},
\tag{4.18}
\]

with \( S_{U_q,\xi_j} \) being the sensitivity at collocation point 𝑞 for parameter 𝑗. The sensitivity around each integration point is computed using complex-step differentiation [175]. Unlike traditional finite-difference approximations, in complex-step differentiation we perturb the imaginary part of the parameter. Let 𝜉𝑗 = 𝑝 + 𝑖ℎ, with ℎ ∈ R. Expanding 𝑈(𝜉) in a Taylor series about the real part 𝑝 we obtain

\[
U(\xi_j) = U(p + ih) = U(p) + ih\,U'(p) - \frac{h^2}{2} U''(p) + \mathcal{O}(h^3).
\tag{4.19}
\]

We then take the imaginary part on both sides to obtain the derivative of 𝑈 around the real part of 𝜉𝑗:

\[
U'(p) = S_{U,\xi_j} \approx \frac{\operatorname{Im}\left(U(p + ih)\right)}{h}.
\tag{4.20}
\]

We can also take the real part of 𝑈(𝑝 + 𝑖ℎ) to recover the unperturbed solution and compute the expectation used in the normalization factor, Eq. (4.17). Complex-step differentiation is second-order accurate and allows for small perturbations without incurring round-off and cancellation errors. Using complex-step differentiation has another enormous advantage over finite-difference schemes: we only need to evaluate one solution at each point, instead of two. We take the imaginary part of the solution to compute the derivative, and use the real part to calculate the volume average of the expected solution in Eq. (4.17).

Remark 4.3.1. Using MATLAB, this method can be applied almost immediately by running the simulation with an imaginary perturbation. However, in the derivation of complex-step differentiation we have to assume that 𝑈(𝜉) is analytic, implying that the Cauchy-Riemann equations must hold. We must also ensure that all operators in the implementation are analytic [175]. Most operators in MATLAB are compatible with complex numbers. Operations like maximum, minimum, and greater or less than should always compare only the real part. Moreover, when transposing vectors and matrices, we should use the dot-apostrophe transpose, which does not take the conjugate of the elements. The only operation that we must redefine to be analytic is the absolute value, implemented as

\[
\operatorname{abs}(x + iy) =
\begin{cases}
-x - iy, & \text{if } x < 0,\\
+x + iy, & \text{if } x \ge 0.
\end{cases}
\tag{4.21}
\]

4.3.2 Global Sensitivity Analysis

In a global analysis we are interested in how sensitive the output is when we perturb all input parameters. We use a variance-based method built on Sobol indices [180], where we compute the relative importance of the input parameters to the total output variance. For the derivation of the algorithms for variance-based sensitivity analysis we refer to Saltelli et al (2010) [176]. For simplicity, we now denote our solution vector simply as 𝑈(𝜉) = 𝑓(𝜉1, 𝜉2, ..., 𝜉𝑘), a random variable function of the random parameters 𝜉𝑗, 𝑗 = 1, 2, ..., 𝑘, with 𝑘 the total dimension of the random space (the number of random parameters). The effect of parameter 𝜉𝑗 on the variance 𝑉 is

\[
V_{\xi_j}\!\left( \mathbb{E}_{\xi_{\sim j}}(U \mid \xi_j) \right),
\tag{4.22}
\]

where 𝜉∼𝑗 denotes the combination of all possible values of the random parameters with the exception of 𝜉𝑗, which is fixed at some value. We interpret Eq. (4.22) as taking the expected value of 𝑈 having fixed a value for 𝜉𝑗, and then taking the variance over all possible values of 𝜉𝑗. From the Law of Total Variance, we have

\[
V_{\xi_j}\!\left( \mathbb{E}_{\xi_{\sim j}}(U \mid \xi_j) \right) + \mathbb{E}_{\xi_j}\!\left( V_{\xi_{\sim j}}(U \mid \xi_j) \right) = V(U),
\tag{4.23}
\]

where the second term on the left-hand side is called the residual and 𝑉(𝑈) is the total variance. Normalizing Eq. (4.23), we obtain the sensitivity index that measures the effect of the random variable 𝜉𝑗 on the total variance:

\[
S_j = \frac{ V_{\xi_j}\!\left( \mathbb{E}_{\xi_{\sim j}}(U \mid \xi_j) \right) }{ V(U) }.
\tag{4.24}
\]

The sensitivity indices 𝑆𝑗 only measure the first-order effect of 𝜉𝑗 on the variance, and disregard the interactions between 𝜉𝑗 and the other parameters. The total sum of the 𝑆𝑗 should be less than 1, the remainder being the high-order interactions between the parameters. If we would like to compute any second, third, or higher-order indices, we would have to repeat this procedure for all combinations of interactions.
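As a brief side note, before turning to the total effect indices, the complex-step rule of Eq. (4.20) and the analytic absolute value of Eq. (4.21) are easy to verify on a toy problem. The sketch below assumes a hypothetical smooth scalar response in place of the phase-field solver; it is an illustration of the technique, not of the actual sensitivity computation of Algorithm 4.1.

```python
import numpy as np

def response(xi):
    """Hypothetical analytic scalar response; stands in for the forward solver."""
    return np.exp(xi) / np.sqrt(xi**2 + 1.0)

def complex_step_derivative(f, p, h=1e-30):
    """dU/dxi at xi = p via Im(f(p + i h)) / h; no subtractive cancellation."""
    return np.imag(f(p + 1j * h)) / h

def analytic_abs(z):
    """Eq. (4.21): flip the sign when the real part is negative."""
    return np.where(np.real(z) < 0.0, -z, z)

p = 1.3
exact = np.exp(p) * (p**2 - p + 1.0) / (p**2 + 1.0)**1.5   # hand-derived derivative

print(f"complex-step : {complex_step_derivative(response, p):.12e}")
print(f"exact        : {exact:.12e}")

# A forward finite difference with a small step already loses several digits.
fd = (response(p + 1e-9) - response(p)) / 1e-9
print(f"forward FD   : {fd:.12e}")

# The redefined absolute value remains analytic under a complex perturbation.
print(analytic_abs(np.array([-2.0 + 0.5j, 1.5 - 0.2j])))   # [2.-0.5j  1.5-0.2j]
```

Note that the complex step tolerates extremely small perturbations (here h = 1e-30) precisely because no difference of nearly equal numbers is formed, which is the property exploited in the univariate sensitivity computations of this chapter.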
An alternative is to compute the sensitivity index related to a given parameter and all its possible interactions with the other parameters. In order to obtain the total effect index \( S_T^j \) we start from the total normalized variance and subtract the contribution of all first and higher-order effects that do not include 𝜉𝑗. We write

\[
S_T^j = 1 - \frac{ V_{\xi_{\sim j}}\!\left( \mathbb{E}_{\xi_j}(U \mid \xi_{\sim j}) \right) }{ V(U) }.
\tag{4.25}
\]

In practical terms, for every combination of parameters that does not include 𝜉𝑗 we compute the expectation with respect to 𝜉𝑗. That means performing different combinations of univariate expected solutions with respect to 𝜉𝑗. Then, all these different expectations compose a (𝑘 − 1)-dimensional random space, from which we compute the variance. From the 5-D PCM full tensor product we already possess all possible combinations between the parameters, so the computation of 𝑆𝑗 and \( S_T^j \) is just a matter of post-processing the realizations.

4.3.3 Integrated Sensitivity-Uncertainty Quantification Framework

Given the stochastic discretization methods from Section 4.2.1 as integrating building blocks, we formulate a framework to quantify model-form uncertainty through parametric uncertainty/sensitivity propagation. The stochastic and global sensitivity analyses are naturally integrated into the framework using PCM. The stochastic sensitivity procedure is summarized in Algorithm 4.1, where the solution sensitivity with respect to a given parameter is obtained through complex-step differentiation. The participation of each parameter in the total uncertainty, given by the sensitivity indices, is outlined in Algorithm 4.2. We further elaborate on the integrated framework's development and capabilities via two canonical examples in the next Section.

Algorithm 4.1 Stochastic Sensitivity Analysis
1: for random parameters 𝜉𝑗 = 1 → 𝑘 do
2:    Choose a perturbation ℎ, and define the random parameter as 𝑈(𝜉𝑗) = 𝑈(𝑝 + 𝑖ℎ).
3:    Solve the stochastic problem in one dimension with perturbed inputs 𝜉𝑗 = 𝑝 + 𝑖ℎ.
4:    Compute the sensitivity of the solution at every collocation point following Eq. (4.20).
5:    Compute the expectation of the sensitivities, E(𝑆𝑈,𝜉𝑗), following Eq. (4.18).
6:    Compute the normalization factor 𝑁𝑓 using the volume average of the expected solution, Eq. (4.17).
7:    Given E(𝑆𝑈,𝜉𝑗) and 𝑁𝑓, compute the local sensitivity 𝑆𝑈,𝜉𝑗 using Eq. (4.16).
8: end for

Algorithm 4.2 Global Sensitivity Analysis
1: For all 𝑘 random parameters, solve the 𝑘-dimensional stochastic problem employing PCM.
2: for random parameters 𝜉𝑗 = 1 → 𝑘 do
3:    Given the tensor-product results at all collocation points, compute the first-order sensitivity index 𝑆𝑗 using Eq. (4.24).
4:    Given the tensor-product results at all collocation points, compute the total-order sensitivity index \( S_T^j \) using Eq. (4.25).
5: end for

4.4 Numerical Results

We now present two representative numerical examples to show the capabilities of the proposed methodology to assess uncertainty and sensitivity of damage phase-field models. The first example is the single-edge notched tensile test case, a traditional benchmark test with mode I crack propagation. With the notched geometry, we first investigate the convergence of the MC and PCM methods in the univariate and multivariate uncertainty propagation. Then, we show the expectation and standard deviation of the damage evolution for each parameter and compare that with the 5-D parametric uncertainty propagation. Next, we show local sensitivity expectation results for each parameter in the univariate framework.
Last, we compute the Sobol indices in the global sensitivity analysis and comment on the different influence of each parameter. The second example is a standard tensile test specimen, symmetric and with no notches or existing cracks. We run the stochastic framework and show the expected crack path and its uncertainty. We also run local and global sensitivity analyses to understand how the lack of a pre-existing crack or notch affects the output uncertainty.

Figure 4.1 Left: Geometry and boundary conditions for the single-edge notched tensile test. Right: Finite element mesh.

Table 4.1 Expected value of the stochastic parameters for the single-edge notched tensile test.
Parameter                        Value
𝑎 (rate of change of fatigue)    5 × 10⁻⁷ m²
𝑏 (viscous damping)              1 × 10⁸ N s/m²
𝑐 (rate of change of damage)     1 × 10⁻⁵ m/(N s)
𝑔𝑐 (Griffith energy)             2700 N/m
𝛾 (phase-field layer width)      1 × 10⁻³ m

4.4.1 Single-Edge Notched Tensile Test

The notched geometry is a benchmark test, consisting of a square of material with a pre-existing crack at the middle of the specimen; see Fig. 4.1. We constrain the body at the top and apply a prescribed displacement at the bottom, at a rate of 3 × 10⁻⁴ m/s. The finite element mesh has 3395 nodes, composing 6498 linear triangle elements with the smallest element size being 0.404 mm. The final time is 𝑇 = 0.5 s and we integrate the solution over time-steps of Δ𝑡 = 1 × 10⁻³ s. We consider a material with 𝐸 = 160 GPa, 𝜈 = 0.3 and 𝜌0 = 7800 kg/m³ under plane stress conditions with a thickness of ℎ = 5 mm. As stated in Section 4.2.1, we assume a uniform distribution for the random variables, with a range of ±10% around their expected values, which are given in Table 4.1.

Figure 4.2 Univariate simulations (relative error norms 𝐿1, 𝐿2 and 𝐿∞ versus number of PCM realizations): expectation (a) and standard deviation (b) errors between MC results with 10⁴ samples and solutions from PCM with different numbers of collocation points. We observe that with 4 points we obtain comparable accuracy in PCM. Expectation (c) and standard deviation (d) PCM convergence, where the reference for the error computation is a solution with 100 collocation points in the univariate case. The convergence rate is close to linear.

4.4.1.1 Convergence

We study the convergence of the PCM estimates (expectation and standard deviation) of the damage field \( \hat{\varphi}(x, y, T) \) at the final simulation time 𝑇 = 0.5 s. The relative error, \( e_{PCM} \), between the PCM estimate \( \hat{\varphi}(x, y, T) \) and a reference solution \( \varphi_{ref}(x, y, T) \), over the spatial domain at time 𝑇, is defined as

\[
e_{PCM} = \frac{ \lVert \hat{\varphi} - \varphi_{ref} \rVert_p }{ \lVert \varphi_{ref} \rVert_p },
\tag{4.26}
\]

where the 𝑝-norm of a vector w = (𝑤1, 𝑤2, ..., 𝑤𝑛) is \( \lVert \mathbf{w} \rVert_p := \left( \sum_{i=1}^{n} |w_i|^p \right)^{1/p} \) for 𝑝 = 1, 2, and \( \lVert \mathbf{w} \rVert_\infty := \max_i |w_i| \). In Fig. 4.2 we plot the univariate convergence analysis for the random parameter 𝛾. We first choose the reference solution to be an MC estimate with 10⁴ realizations ((a) and (b)), showing that, with only a few realizations, PCM reaches an accuracy equivalent to that of MC. Since our method is shown to be convergent, although biased, we establish a PCM benchmark solution with 100 collocation points. A minimal sketch of this error and rate computation is given below.
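For completeness, the relative error of Eq. (4.26) and an empirical convergence rate can be obtained as follows. The nodal fields below are synthetic stand-ins rather than solver outputs, and the assumed error decay only mimics the near-linear rates reported in Fig. 4.2(c, d).

```python
import numpy as np

def relative_error(estimate, reference, p):
    """Relative p-norm error of Eq. (4.26); p may be 1, 2, or np.inf."""
    if np.isinf(p):
        return np.max(np.abs(estimate - reference)) / np.max(np.abs(reference))
    num = np.sum(np.abs(estimate - reference) ** p) ** (1.0 / p)
    den = np.sum(np.abs(reference) ** p) ** (1.0 / p)
    return num / den

rng = np.random.default_rng(0)
phi_ref = rng.uniform(0.0, 1.0, size=3395)      # synthetic reference nodal damage field

# Hypothetical sequence of PCM estimates approaching the reference like 1/Q
Q_values = np.array([2, 4, 8, 16, 32])
errors = [relative_error(phi_ref + (1.0 / Q) * rng.normal(0.0, 0.05, phi_ref.size),
                         phi_ref, p=2)
          for Q in Q_values]

# Empirical rate = slope of log(error) versus log(Q)
rate, _ = np.polyfit(np.log(Q_values), np.log(errors), 1)
print("L2 relative errors:", [f"{e:.2e}" for e in errors])
print(f"empirical convergence rate ~ Q^({rate:.2f})")
```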
The goal is to estimate the practical convergence rate of PCM, which sheds light on the smoothness of the solution in the parametric space. We see that the convergence rate of PCM with respect to the PCM benchmark solution is close to linear ((c) and (d)), instead of, e.g., exponential, revealing that the true solution of our problem is not smooth and does not belong to higher-class Sobolev spaces in the parametric space. Fig. 4.3 (a) and (b) show the relative error of PCM, taking a 10⁴-sample MC estimate as reference, in the multivariate case (considering the parameters from Table 4.1, a 5-D random space). Again, we observe that PCM quickly achieves an accuracy in expectation and standard deviation comparable to 10⁴ MC realizations. In Fig. 4.3 (c) and (d), we see the convergence of PCM in 5 dimensions, where the PCM benchmark is a solution with 6 points in each dimension (a total of 6⁵ = 7776 realizations). The convergence rates within the PCM algorithm are again close to linear. Nonetheless, we consistently show that PCM exhibits faster convergence than MC (a half-order increase), while requiring far fewer realizations for a desired accuracy.

Figure 4.3 (a) and (b): Comparison between MC results with 10⁴ samples and solutions from PCM with different numbers of realizations in the multivariate case (relative error norms 𝐿1, 𝐿2 and 𝐿∞ versus number of PCM realizations). With higher dimensions the advantage of PCM over MC becomes more evident, with only 3 points needed in each dimension to stabilize the error in both expectation (a) and standard deviation (b). (c) and (d): Convergence of the damage field in multivariate PCM simulations of the notched geometry, where the reference for the error computation is a solution with 6 collocation points in each dimension. We obtain linear convergence for both expectation (c) and standard deviation (d).

4.4.1.2 Uncertainty and sensitivity analyses

We first investigate the univariate uncertainty propagation, where we assume that each random parameter has a uniform distribution centered at the values in Table 4.1 with 10% variation to the left and right. In this 1-D parametric setting, we assume that the other parameters are deterministically known at their expected values. Fig. 4.4 shows the damage field expectation and standard deviation when we consider the phase-field layer width 𝛾 as uncertain, while keeping the other parameters fixed, in PCM simulations with 4 integration points. The expected crack propagates to the right, as we could expect from a tensile load, and the uncertainty follows the crack tip.

Figure 4.4 Damage phase-field expectation (top, panels (a)-(c) at times 𝑡 = 0.3, 0.4, 0.5 s) and standard deviation (bottom, panels (d)-(f) at the same times) after crack propagation, taking 𝛾 as the random input. Under the tensile load the crack propagates in Mode I as expected. 𝛾 has influence around the crack path, because it controls the diffusion of damage. Once the crack propagates and the expected value is 1 along the crack path, the uncertainty vanishes in the cracked region. However, the deviation around the crack tip grows with time.

In Fig. 4.5 we plot the damage field from random 𝛾 over the crack propagation line (at 𝑦 = 50 mm, for 50 mm ≤ 𝑥 ≤ 100 mm). Looking at the damage field profiles from the expected solution we see the crack tip as a moving interface. The standard deviation's peak follows the interface and grows in time. Since 𝛾 controls the diffusivity, we see that with increasing uncertainty the interface profile of the expected solution becomes less sharp with time. For the other parameters, the damage field expectation is similar to the one shown in Fig. 4.4. The major difference lies in the deviation field, as shown in Fig. 4.6, where we see that for the other variables the uncertainty is more concentrated around the crack tip. This example has a known crack path, so most of the uncertainty is related to the speed of crack growth through the Griffith fracture energy 𝑔𝑐, which controls how fast damage grows at intermediate values of fatigue, and the rate of change of damage 𝑐, which directly affects the damage time evolution. Still in the univariate framework, we use complex-step differentiation to evaluate the stochastic sensitivity of every random parameter, where the perturbation is chosen as ℎ = 0.001(𝜉𝑗,𝑚𝑎𝑥 − 𝜉𝑗,𝑚𝑖𝑛).

Figure 4.5 Time evolution of the damage phase-field expectation (a) and standard deviation (b) profiles along the crack path line, taking 𝛾 as the random input (curves at 𝑡 = 0.2, 0.3, 0.4, 0.5 s). From the damage expectation we observe the crack tip as a moving interface. The standard deviation peak follows the advecting boundary and grows in time, which makes the expected interface less sharp.

Figure 4.6 Damage phase-field standard deviation after crack propagation in the univariate uncertainty quantification: (a) rate of change of fatigue 𝑎; (b) viscous damping 𝑏; (c) rate of change of damage 𝑐; (d) Griffith fracture energy 𝑔𝑐. Fatigue parameter 𝑎 and viscous damping 𝑏 do not propagate uncertainty as much as the Griffith energy 𝑔𝑐 and the rate of change of damage parameter 𝑐. Since the crack path is defined by the geometry, the majority of the uncertainty is related to the speed of crack propagation, controlled mostly by 𝑔𝑐 and 𝑐.

Fig. 4.7 shows the expectation of the damage sensitivity fields at the final time 𝑇 = 0.5 s. We make two observations here. First, the sensitivity fields show that damage increases at the crack tip when we increase 𝛾, 𝑎 and 𝑐. By increasing 𝑏 we damp vibrations, thus reducing damage. Again, the effect of 𝑔𝑐 is visible: increasing its value requires higher levels of fatigue to drive damage towards 1, and thus the sensitivity around the crack tip is negative. Second, if we order sensitivity and standard deviation by their maximum absolute values, we obtain the same decreasing order for both: 𝑐, 𝑔𝑐, 𝑏, 𝛾, 𝑎, with similar proportionality between them in the two cases. Both results evidence that, with a clear crack location and path, the speed of propagation drives uncertainty and sensitivity. From the univariate uncertainty propagation and the stochastic sensitivity analysis, we identify the most influential parameters when we assume all others to be known.

Figure 4.7 Expected sensitivity fields with respect to each input parameter: (a) phase-field layer width 𝛾; (b) Griffith fracture energy 𝑔𝑐; (c) rate of change of fatigue 𝑎; (d) viscous damping 𝑏; (e) rate of change of damage 𝑐. Similarly to the standard deviation fields, the local sensitivity results also point to 𝑐 and 𝑔𝑐, related to propagation speed, as the parameters with the most sensitive output, since we have a specific crack initiation location and path.

Now we show the results of a 5-D PCM simulation, where we consider all parameter combinations through the tensor product, using 6 points in each dimension. Fig. 4.8 shows the evolution of the damage phase-field profiles at the crack path over time. The maximum standard deviation at the final time (0.257) is comparable to the peak uncertainty of 𝑔𝑐 (0.201) and 𝑐 (0.214).

Figure 4.8 Time evolution of the damage phase-field expectation (a) and standard deviation (b) profiles at the crack path line when propagating the uncertainty of all 5 random parameters (curves at 𝑡 = 0.2, 0.3, 0.4, 0.5 s). We observe that the combined effect of all parameters results in a larger standard deviation around the crack tip at the final time, comparable to the peak values of the 𝑐 and 𝑔𝑐 uncertainties.

In order to quantify the relative participation of each parameter in the 5-D uncertainty, we compute the Sobol indices 𝑆𝑗 and \( S_T^j \) from Eqs. (4.24) and (4.25), respectively. Fig. 4.9 shows the total deviation field, as a reference, and the sensitivity index fields 𝑆𝑗 at the final time for all parameters. Despite 𝛾 having a high maximum value of its sensitivity index, we point out that most of its influence is around the crack, where the uncertainty is small. Ahead of the crack, 𝛾 has no influence on the uncertainty in this case where the path is defined. We see again that in the region of high uncertainty 𝑔𝑐 and 𝑐 play an important role when we consider only their sole effects. We plot the total sensitivity index fields \( S_T^j \) in Fig. 4.10. We see that high-order interactions between the parameters have a great impact on the sensitivity indices. In the majority of the geometry, the fields are uniform with a base value around 0.4, except for 𝛾. The influence of 𝑔𝑐 and 𝑐 is carried over to the other parameters in the region ahead of the crack, with 𝑔𝑐 still being the parameter with the most impact ahead of the crack.

From the single-edge notched tensile test uncertainty and sensitivity analyses it is clear that the load conditions and the geometric singularity define the position of crack initiation and its path. Parameters that control the rate of change of damage, such as 𝑔𝑐 or 𝑐, are more sensitive and contribute more to the solution uncertainty. For general geometries and load situations, we may expect a shift in the relative contributions to the total uncertainty. Take the sensitivity indices from Fig. 4.9, for example. Uncertainty that is not around the crack tip in the direction of crack propagation is dominated almost entirely by 𝛾 and 𝑔𝑐. In cases where the uncertainty is not concentrated around a specific region of the geometry, we can expect 𝛾 and 𝑔𝑐 to have a stronger influence. Not surprisingly, both parameters multiply the Laplacian in the damage equation, which arises from the squared gradient, a local interaction term in the free-energy potential.

Figure 4.9 Notched tensile test: total damage deviation field and sensitivity index (𝑆𝑗) fields for all parameters, using 6 points in each dimension, at the final time 𝑇 = 0.5 s ((a) total deviation field; (b) phase-field layer width 𝛾; (c) Griffith fracture energy 𝑔𝑐; (d) rate of change of fatigue 𝑎; (e) viscous damping 𝑏; (f) rate of change of damage 𝑐). Ahead of the crack, 𝑔𝑐 and 𝑐 are the most influential parameters for the total damage field variance. The remaining parameters have little participation in the most uncertain region of the geometry.

Figure 4.10 Notched tensile test: total damage deviation field and total effect sensitivity index (\( S_T^j \)) fields for all parameters, using 6 points in each dimension, at the final time 𝑇 = 0.5 s (panels as in Fig. 4.9). When we combine the parameter effects and include their interactions, the dominant sensitivity at the crack tip gets carried over to all parameter indices. In the remaining regions, the sensitivity index is uniform except for 𝛾: the diffusion coefficient is more influential throughout the specimen.

Table 4.2 Expected value of the stochastic parameters for the tensile test specimen.
Parameter                        Value
𝑎 (rate of change of fatigue)    5 × 10⁻⁷ m²
𝑏 (viscous damping)              1 × 10⁸ N s/m²
𝑐 (rate of change of damage)     2 × 10⁻⁶ m/(N s)
𝑔𝑐 (Griffith energy)             2700 N/m
𝛾 (phase-field layer width)      3 × 10⁻⁴ m

4.4.2 Tensile Test Specimen

The tensile test geometry without notch from Fig. 4.11 is the standard design, with a mesh of 3912 nodes and 7236 linear triangle elements, where we also constrain one of its ends and apply a prescribed displacement of 4.5 × 10⁻⁴ m/s until 𝑇 = 0.5 s, with time increments of Δ𝑡 = 5 × 10⁻⁴ s. The smallest element size in the mesh is 0.614 mm. We consider the same material properties and thickness as in the notched case, with the expected values for the stochastic parameters shown in Table 4.2.

Figure 4.11 Top: Geometry and boundary conditions for the tensile test specimen (characteristic in-plane dimensions of 6, 10 and 30 mm; prescribed displacement u). Bottom: finite element mesh.

Figure 4.12 Convergence of the damage field in multivariate PCM simulations for the tensile test specimen ((a) expectation, (b) standard deviation; relative error norms 𝐿1, 𝐿2 and 𝐿∞ versus number of PCM realizations), where the reference for the error computation is a solution with 6 collocation points in each dimension. We have lower convergence rates when compared to the notched geometry due to the larger uncertainty in the crack location.

We investigate the multivariate uncertainty propagation in the tensile test case. Fig. 4.12 shows the convergence of the 5-D PCM using the solution with 6 points in each dimension as a reference. Due to the absence of an initial crack, the convergence of the expected solution and deviation has a lower rate when compared to the single-edge notched test. Figs. 4.13 and 4.14 show the expectation and standard deviation of the damage field when we consider 5-dimensional PCM simulations with 6 integration points in each dimension. Unlike the notched case, here we have different expected locations for crack initiation that propagate from the surface to the interior of the body following the stress concentration field. Moreover, the final uncertainty in this case is more than 30% of the maximum damage, higher than any value from the notched case. Fig. 4.15 shows the stochastic sensitivity fields for all parameters in the tensile test case, using 8 PCM points to compute the expected sensitivity.
We can observe that 3 parameters are more sensitive, namely 𝛾, 𝑔𝑐 and 𝑎, and their absolute ranges are equivalent, going approximately from 5 to 30. We can argue that, since we do not have a preferential crack path nor a specific crack initiation position, all the uncertain parameters have the same sensitivity. The 𝑐 parameter in this case is not as sensitive as in the notched geometry, since here the crack location and path are not defined, so the damage increase rate becomes less important compared to the crack position. Last, we can see that 𝑏 has little sensitivity on the damage field in this case.

Figure 4.13 Damage phase-field expectation taking all parameters in 𝜉 as random inputs. From the tensile load we see the appearance of 4 possible crack initiation points, based on the stress concentration profile of the geometry. The expected solution at the final time gives a curved crack path at both sides of the geometry.

Figure 4.14 Damage phase-field deviation taking all parameters in 𝜉 as random inputs. We have regions of uncertainty around all 4 points of possible crack initiation. At the final time, the uncertainty vanished where the crack propagated, and the maximum deviation around the crack is more than 30% of the maximum damage.

Figure 4.15 Local sensitivity expectation fields with respect to each input parameter: (a) phase-field layer width 𝛾; (b) Griffith fracture energy 𝑔𝑐; (c) rate of change of fatigue 𝑎; (d) viscous damping 𝑏; (e) rate of change of damage 𝑐. 𝛾, 𝑔𝑐 and 𝑎 are the most sensitive parameters, with the same absolute range. 𝑏 is not sensitive in the range considered and 𝑐 is less sensitive than in the notched case.

Another observation is that, contrary to the notched case, here the sensitivity of 𝛾 around the crack is negative, being positive elsewhere. Since we do not have a defined crack path, an increase of the diffusion coefficient should smoothen the field. In other words, this case makes the 𝛾 associated with the Laplacian in the damage equation more sensitive than the 𝛾 appearing in the remaining terms. We present the total uncertainty field and the sensitivity indices 𝑆𝑗 for the tensile test specimen in Fig. 4.16. Similarly to the notched case, here we see that the order of the most influential parameters in the uncertainty regions is equivalent to that of the most sensitive parameters from Fig. 4.15. In the regions of high uncertainty around the possible crack paths, the deviation field is influenced most by 𝛾 and 𝑔𝑐. Fatigue parameter 𝑎 comes next, with relatively less importance than what was found in the local sensitivity analysis. Viscous damping 𝑏 has little participation in the total variance, while the damage rate 𝑐 has less importance when compared to the notched case. In summary, the participation in regions of high uncertainty is dominated by 𝛾 and 𝑔𝑐.

Figure 4.16 Tensile test specimen: total damage deviation field and sensitivity index (𝑆𝑗) fields for all parameters, using 6 points in each dimension, at the final time 𝑇 = 0.5 s ((a) total deviation field; (b) phase-field layer width 𝛾; (c) Griffith fracture energy 𝑔𝑐; (d) rate of change of fatigue 𝑎; (e) viscous damping 𝑏; (f) rate of change of damage 𝑐). Differently from the notched case, here 𝛾 and 𝑔𝑐 are the most influential parameters in the region of higher uncertainty.

The total sensitivity indices \( S_T^j \) for the tensile specimen are presented in Fig. 4.17. We can observe that, when we include the interactions between the parameters, 𝑎, 𝑏 and 𝑐 present sensitivity fields that are almost uniform between 0.44 and 0.5. The other 2 parameters, 𝛾 and 𝑔𝑐, also present uniform fields, except for the regions around the crack, where the total sensitivity indices are higher than 0.6.

Figure 4.17 Tensile test specimen: total damage deviation field and total effect sensitivity index (\( S_T^j \)) fields for all parameters, using 6 points in each dimension, at the final time 𝑇 = 0.5 s (panels as in Fig. 4.16). With the combined effect of all parameters, we still have 𝛾 and 𝑔𝑐 as having the most influence in the uncertainty regions.

From these examples it is clear that the geometry and the existence of initial cracks, notches or singularities play an important role in sensitivity and uncertainty. In the notched case, the existence of a crack and the application of tensile stress make the crack propagate straight to the right. The only uncertainty remains with the speed of damage increase ahead of the crack path, hence the sensitivities of 𝑔𝑐 and 𝑐. In the tensile specimen without notch or singularity, the uncertainty is driven by 𝛾, 𝑔𝑐 and 𝑎. From Eq. (4.1), 𝛾 and 𝑔𝑐 multiply the local interaction term, namely \( g_c \tfrac{\gamma}{2}|\nabla\varphi|^2 \), which later becomes the Laplacian term in the damage evolution, \( \tfrac{\gamma(\omega) g_c(\omega)}{\lambda}\Delta\varphi \). Moreover, they are also associated with the tensor product term in the equation of motion, also originated from the local interaction, and they are associated with the fatigue potentials H and H𝑓, which are arbitrarily chosen. The complete symmetry of the tensile specimen geometry, with no surface roughness or material imperfections, associated with the local operators and ad hoc modeling, leads to potential crack appearances in the 4 stress concentration regions around the fillets. Any numerical effect such as artificial damping or residuals may lead to a perturbation in the solution and crack initiation at any corner.

Furthermore, the traditional disregard of nonlocal interactions in phase-field models excludes the possibility of modeling many experimentally observed phenomena, such as intermittent dislocation avalanches [11–13, 181] and fractal characteristics of fracture [4, 118], which show scale-free distributions and power-law scaling that can be successfully modeled through fractional calculus [182]. Recent works have addressed the fractional-order Cahn-Hilliard equation [183, 184]. Phase-field models derived from free-energy potentials with nonlocal effects were first discussed by Giacomin and Lebowitz [185, 186], with recent contributions from Abels et al [187], and Ainsworth and Mao [184]. Fractional-order models for structural analysis have been developed [46], for which the corresponding fractional uncertainty/sensitivity analyses can be formulated via operator-based uncertainty quantification [188] and the Fractional Sensitivity Equation Method (FSEM) [189].

CHAPTER 5
DATA-DRIVEN FAILURE PREDICTION IN BRITTLE MATERIALS: A PHASE-FIELD BASED MACHINE LEARNING FRAMEWORK

5.1 Introduction

Predictability is essential to any mathematical model for failure and fracture. From the early linear elastic fracture mechanics models of [2] to failure analysis through damage mechanics by [153], numerical models have improved in scope and complexity to provide realistic simulations of material failure to meet industry goals of safety, and to reduce component weight and production costs.
The accurate simulation of the failure process, from crack initiation to propagation until final failure, in a consistent way, while respecting the physics and developing robust numerical methods, is still a challenging task. During the last decade, phase-field models have been successfully established as a powerful tool in the study of damage and fatigue. By modeling sharp interfaces through smooth continuous fields, the dynamics of moving boundaries using phase-fields emerged in diverse physical applications, including fluid separation [54], solidification [55], tumor growth [56], two-phase complex fluid flow [57], and fluid-structure interaction [58]. In failure analysis, crack sharpness is modeled through a smooth phase-field indicating the state of the material among fractured, virgin, and intermediate damaged zones, evolving through Allen-Cahn type equations. Examples ranging from brittle [31, 156, 157], ductile [32, 158], and dynamic fracture [33, 159] successfully described phenomenological effects such as crack initiation, branching and coalescence. The inclusion of fatigue effects was initially attempted with Ginsburg-Landau free-energy potentials [160] and fractional derivatives [161]. A more general framework for damage and fatigue was later developed in a non-isothermal and thermodynamically consistent approach [34, 162, 163], followed by the emergence of further phase-field models for fatigue [190, 191]. Within this myriad of different models, solution uncertainty and parametric sensitivity are still influential, and the predictability 101 of phase-field models for arbitrary conditions is yet a withstanding effort [104]. One promising approach to address the predictability of numerical models is to use artificial intelligence (AI), which has been consistently expanding its applicability over the years. AI and machine learning (ML) have been widely used in different engineering applications such as structural health monitoring [192–202] and fatigue crack detection [203–205]. ML algorithms, in the context of failure analysis, have been used for numerous applications, including phase-field models of polymer-based dielectrics [206], phase-field models of solidification [207], and crystal plasticity [208]. Another interesting application of ML is to obtain a data-driven representation of free-energy potentials in the atomic scale and upscale it to a phase-field model, using Integrable Deep Neural Networks [209]. Specifically for brittle failure, ML has been recently used to build surrogate models based on explicit crack representation [210], and in failure prediction using a discrete crack representation model for high-fidelity simulations that feed an artificial neural networks (ANN) algorithm [211, 212]. Nonetheless, the noted studies have only shown the applicability of ML in failure analysis. Therefore, ML methods have not yet been explored in the context of phase-field models for damage. It is noted that the use of ML leads to a new paradigm of phase-field modeling, where we establish the basis for novel data-driven frameworks, allowing systematic infusion of statistical information and corresponding uncertainty propagation from micro-scale models and experiments into continuum macroscopic failure models. In this work, we develop an ML algorithmic framework for failure detection and classification merging a pattern recognition (PR) scheme and ML algorithms applied to a damage and fatigue phase-field model. 
We consider an isothermal, linear elastic and isotropic material under the hypothesis of small deformations and brittle fracture. We simulate the phase-field model using Finite Element Method (FEM) and a semi-implicit time-integration scheme to generate time-series data of damage phase-field 𝜑 and degradation function 𝑔(𝜑) = (1 − 𝜑) 2 from virtual sensing nodes positioned at different locations across a test specimen. We introduce a PR scheme as part of the ML framework, in which time-series data from FEM node responses are considered as a pattern with a corresponding label. We define multiple labels for “no failure”, “onset of failure” and 102 “failure” of the test specimen based on tensile test load-displacement curve and damage threshold concept. Once the patterns representing different states of the material are identified, the proposed ML framework employs 𝑘-nearest neighbor (𝑘-NN) and ANN algorithms to detect the presence and location of failure using such patterns. The use of ML algorithms makes the failure prediction framework practical even in cases of complex loading conditions, beyond the canonical example of monotonic loads presented here. In this study, we consider different failure types to further assess the performance of the framework. In addition, by introducing noise to the time-series data, we ascertain the robustness of the proposed framework with noise-polluted data, leading to the effective use in failure analysis under high sensitive/uncertain parameters and operators. The idea of propagating uncertainty through the solution, combined with the inherent nonlinearity of the damage phase-field model make the choice of ANN particularly attractive in the construction of the framework. The findings from this study will pave a way for the development of novel data-driven failure prediction frameworks, which are able to efficiently establish a link among the classification results (i.e., accuracy) and different phase-field model parameters, thus enabling the computational framework to identify those parameters affecting model’s accuracy and updating them to achieve the best performance. This chapter is organized as follows: in 5.2 we present the damage and fatigue phase-field model, which is used to generate time-series data for the ML framework. We introduce the data generation procedure and corresponding label definitions in 5.3. We present the ML framework in 5.4, where we describe the integration of a pattern recognition scheme with the applied classification algorithms, 𝑘-NN and ANN. We present and discuss the numerical results in 5.5. 5.2 Damage and Fatigue Phase-Field Model 5.2.1 Governing Equations We consider a isothermal phase-field framework for structural damage and fatigue, modeled by a system of coupled differential equations for the evolution of displacement 𝒖, velocity 𝒗 = 𝒖, ¤ damage 𝜑 and fatigue F . The damage phase-field 𝜑 describes the volumetric fraction of degraded 103 material, and takes 𝜑 = 0 for virgin material, 𝜑 = 1 for fractured material, varying between those states, 0 ≤ 𝜑 ≤ 1, as a damaged material. The evolution equation for the damage field is of Allen-Cahn type since the damage and aging effects are non-conservative and non-decreasing, and is derived along with the equations of motion for 𝒖 and 𝒗 through the principle of virtual power and entropy inequalities with thermodynamic consistency [34]. 
The fatigue field F is associated with the presence of micro-cracks, and is treated as an internal variable, whose evolution equation is obtained through constitutive relations that must satisfy the entropy inequality for all admissible processes. The geometry is defined over a spatial domain Ω ⊂ ℝᵈ, 𝑑 = 1, 2, 3, at time 𝑡 ∈ (0, 𝑇]. The final form of the governing equations is defined by the choice of free-energy potentials related to elasticity, damage and fatigue. We consider a linear elastic isotropic material, where the phase-field free-energy takes the usual gradient form:

\[
\Psi(\boldsymbol{E}, \varphi, \mathcal{F}) = \frac{1}{2}(1-\varphi)^2 \boldsymbol{E}^T \mathbf{C} \boldsymbol{E} + \frac{\gamma}{2} g_c |\nabla\varphi|^2 + \mathcal{K}(\varphi, \mathcal{F}),
\tag{5.1}
\]

where 𝑬 = ∇ₛ𝒖 is the strain tensor, with ∇ₛ𝒒 = sym(∇𝒒) representing the symmetric part of the gradient of a given vector field 𝒒. Also, C is the elasticity tensor written in terms of the Young modulus 𝐸 and Poisson coefficient 𝜈, 𝑔𝑐 is the Griffith energy, 𝛾 > 0 is the phase-field layer width parameter, and K(𝜑, F) is a function that models the damage evolution due to fatigue effects. The first term in (5.1) represents the degraded elastic response, modeled by the choice of degradation function 𝑔(𝜑) = (1 − 𝜑)². The final set of governing equations, defined over Ω × (0, 𝑇], becomes:

\[
\begin{cases}
\dot{\boldsymbol{u}} = \boldsymbol{v}, \\[4pt]
\dot{\boldsymbol{v}} = \operatorname{div}\!\left[(1-\varphi)^2 \dfrac{\mathbf{C}}{\rho}\boldsymbol{E}\right] + \dfrac{b}{\rho}\operatorname{div}(\boldsymbol{D}) - \dfrac{\gamma g_c}{\rho}\operatorname{div}(\nabla\varphi \otimes \nabla\varphi) + \boldsymbol{f}, \\[4pt]
\dot{\varphi} = \dfrac{\gamma g_c}{\lambda}\Delta\varphi + \dfrac{1}{\lambda}(1-\varphi)\boldsymbol{E}^T\mathbf{C}\boldsymbol{E} - \dfrac{1}{\lambda\gamma}\left[g_c \mathcal{H}'(\varphi) + \mathcal{F}\,\mathcal{H}_f'(\varphi)\right], \\[4pt]
\dot{\mathcal{F}} = -\dfrac{\hat{F}}{\gamma}\mathcal{H}_f(\varphi),
\end{cases}
\tag{5.2}
\]

subjected to appropriate initial and boundary conditions, which depend on the physical problem. Either displacement or stress is known at the boundaries, in addition to considering ∇𝜑 · 𝒏 = 0 on 𝜕Ω. Moreover, the ⊗ operator denotes the outer product, the infinitesimal strain rate tensor is represented by 𝑫 = ∇ₛ𝒗, and the parameters 𝑏 and 𝜌 are the material's viscous damping and density, respectively. We construct 𝜆 such that the rate of change of damage increases with damage (see e.g., [153]):

\[
\frac{1}{\lambda} = \frac{c}{(1+\delta-\varphi)^{\varsigma}},
\tag{5.3}
\]

where 𝑐, 𝜍 > 0 are material dependent, and 𝛿 > 0 is a small constant to avoid numerical singularity. The potentials H(𝜑) and H_f(𝜑) model the damage transition from 0 to 1 as fatigue changes from zero to 𝑔𝑐. We take their (ordinary) derivatives with respect to 𝜑 to obtain the potentials H′(𝜑) and H′_f(𝜑). Further details on the fatigue potentials can be found in [34]. Choosing the transition to be continuous and monotonically increasing, suitable choices for the potentials are:

\[
\mathcal{H}(\varphi) =
\begin{cases}
0.5\,\varphi^2 & \text{for } 0 \le \varphi \le 1, \\
0.5 + \delta(\varphi - 1) & \text{for } \varphi > 1, \\
-\delta\varphi & \text{for } \varphi < 0,
\end{cases}
\tag{5.4}
\]

\[
\mathcal{H}_f(\varphi) =
\begin{cases}
-\varphi & \text{for } 0 \le \varphi \le 1, \\
-1 & \text{for } \varphi > 1, \\
0 & \text{for } \varphi < 0.
\end{cases}
\tag{5.5}
\]

The evolution of fatigue F is controlled by 𝐹̂, related to the formation and growth of micro-cracks that occur in cyclic loadings. We note that, being a measure of energy accumulated in the microstructure, the fatigue variable F grows even under monotonic loading. The form of 𝐹̂ depends on the absolute value of the power related to stress in the virgin material:

\[
\hat{F} = a(1-\varphi)\left|\left(\mathbf{C}\boldsymbol{E} + b\boldsymbol{D}\right) : \boldsymbol{D}\right|,
\tag{5.6}
\]

where the parameter 𝑎 is chosen in this case to give a linear dependence on the power of stress.

5.2.2 Discretization

We discretize (5.2) in space using the linear finite element method (FEM), where the semi-discrete form is obtained through the Galerkin method.
For a detailed derivation of the spatial discretization in 2D, please refer to Appendix A.

[Figure 5.1 Description of geometry and boundary conditions for the tensile test specimen, along with the finite element mesh and sensor layout for time-series generation. We highlight two sensor nodes that show different time-series behaviors.]

We consider the tensile test specimen without notch depicted in Fig. 5.1. We discretize it with a finite element mesh consisting of 3912 nodes and 7236 linear triangle elements, with smallest element size of 0.614 mm. We constrain one end, and apply a prescribed displacement rate of 4.5 × 10⁻⁴ m/s at the other end, with time increments of Δ𝑡 = 5 × 10⁻⁴ s. We study a material with Young modulus 𝐸 = 160 GPa, Poisson coefficient 𝜈 = 0.3, and density 𝜌 = 7800 kg/m³, under plane stress conditions with thickness ℎ = 5 mm. The rate of change of fatigue 𝑎 is 5 × 10⁻⁷ m², and the viscous damping 𝑏 is 1 × 10⁸ N·s/m². The remaining parameters 𝛾 (phase-field layer width), 𝑔𝑐 (Griffith energy), and 𝑐 (rate of change of damage) are chosen so as to construct a set of different representative cases. We focus on those parameters to build the cases because they are the most sensitive and introduce the most uncertainty in the damage evolution [104].

5.3 Data Processing

In this section, we highlight how to obtain time-series data from phase-field simulations to train and test the learning algorithms. Further, we explore different possibilities of label definitions in the context of failure prediction based on the simulation results.

5.3.1 Time-Series Data Generation

To generate time-series data, virtual sensing nodes are considered at different locations of the specimen, as shown in Fig. 5.1. The sensor layout in our tests is simply chosen to provide a coarse-to-fine (variable) resolution for the ML framework to calibrate/train and classify time-series data. Table 5.1 presents the parameters used to construct each representative failure case or type, for which we plot the damage phase-field at failure time in Fig. 5.2.

Table 5.1 Parameters used in the representative cases.

Case   𝛾 (m)          𝑔𝑐 (N/m)   𝑐 (m²/(N·s))
1      3.00 × 10⁻⁴    2700       2.00 × 10⁻⁶
2      2.00 × 10⁻³    2700       2.00 × 10⁻⁶
3      5.00 × 10⁻⁴    5400       2.00 × 10⁻⁶
4      5.00 × 10⁻⁴    10800      2.00 × 10⁻⁶
5      2.00 × 10⁻³    5400       1.00 × 10⁻⁶
6      2.50 × 10⁻⁴    5400       1.00 × 10⁻⁶

[Figure 5.2 Damage phase-field for each representative failure case at the reported failure time: (a) Case 1, 𝑡 = 0.48 s; (b) Case 2, 𝑡 = 0.48 s; (c) Case 3, 𝑡 = 0.62 s; (d) Case 4, 𝑡 = 0.78 s; (e) Case 5, 𝑡 = 0.62 s; (f) Case 6, 𝑡 = 0.64 s. By changing the parameters 𝛾, 𝑔𝑐, and 𝑐, we observe different failure types (distinct crack positions and paths), as well as varying dynamics.]

We also observe three different failure types (i.e., around the fillets, at the middle of the specimen, and in an intermediate region between those), where the effect of changing parameters is clearly noticeable. We observe the different damage evolution at the highlighted sensor nodes shown in Fig. 5.1 from their time-series data. We plot time-series data from sensor nodes 91 (at the middle of the specimen) and 143 (at one of the fillets), for cases 1, 2 and 3, where we observe the different evolution profiles based on each failure type (see Fig. 5.3). From the damage phase-field time-series data 𝜑 at the sensing nodes, we then form the feature vector of a pattern as the degradation function 𝑔(𝜑) = (1 − 𝜑)². Thus, patterns are generated using 𝑔(𝜑), extracted at the sensing nodes for a given time step.
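As an illustration of this pattern-generation step, the following is a minimal Python sketch that assembles degradation-function patterns from damage snapshots. The array names (`phi_history`, `sensor_ids`), the number of sensing nodes, and the synthetic data are hypothetical stand-ins for the actual FEM output format used in this work.

```python
import numpy as np

def build_pattern_matrix(phi_history, sensor_ids):
    """Form the pattern matrix from damage snapshots.

    phi_history : (n_steps, n_mesh_nodes) array with the damage field phi at
                  every FEM node and time step (hypothetical layout).
    sensor_ids  : indices of the virtual sensing nodes.

    Returns an (n_steps, n_sensors) matrix whose rows are patterns of the
    degradation function g(phi) = (1 - phi)**2 evaluated at the sensors.
    """
    phi_sensors = phi_history[:, sensor_ids]   # restrict to sensing nodes
    return (1.0 - phi_sensors) ** 2            # material softening g(phi)

# Synthetic example: 2000 time steps, 3912 mesh nodes (as in the mesh above),
# and 150 sensing nodes (the sensor count here is hypothetical).
rng = np.random.default_rng(0)
phi_history = np.clip(rng.random((2000, 3912)).cumsum(axis=0) * 1e-4, 0.0, 1.0)
sensor_ids = rng.choice(3912, size=150, replace=False)
patterns = build_pattern_matrix(phi_history, sensor_ids)
print(patterns.shape)   # (2000, 150): one pattern (row) per time step
```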
Accordingly, a label is assigned to each identified pattern at each time step.

[Figure 5.3 Damage phase-field time-series data at sensing nodes 91 and 143 for (a) Case 1, (b) Case 2, and (c) Case 3, showing the different evolution of 𝜑 depending on the virtual sensor node position.]

Remark 5.3.1. Damage 𝜑 is a proper measure of material failure. However, by defining the feature vector based on the degradation function 𝑔(𝜑) instead, we directly measure the material softening, since 𝑔(𝜑) is the field variable that degrades the constitutive model, thus reducing the component's load-bearing capability.

5.3.2 Label Definitions

In the ML domain, a label is defined as the output of the classification algorithm. We outline different criteria used to generate the labels for the supervised ML algorithms. In the context of failure analysis, labels should reflect the material's capacity to withstand loads, so a first rational choice is to define labels based on the load-displacement curve. In addition, we generate labels based on a damage threshold concept with the degradation function 𝑔(𝜑). Based on the noted criteria, each pattern is given a label corresponding to one of multiple classes, namely, no failure (class 1), onset of failure (class 2), and failure (class 3).

5.3.2.1 Label definition according to load-displacement curve

We start by defining the labels in a binary fashion, in order to observe the damage phase-field corresponding to a specific label transition. At each time step, a pattern is assigned 0 if there is no failure, and 1 if the specimen has fractured, such that we have a label vector 𝐿 = {0 0 . . . 0 1 . . . 1 1}ᵀ. We assign the labels based on the load-displacement curve of the tensile test:

• Label Type 1: labels are generated based on the maximum force of the load-displacement curve, which may trigger the failure criterion too soon.

• Label Type 2: labels are generated according to the minimum derivative 𝑑𝑓/𝑑𝑢, which could detect damage too late.

• Label Type 3: labels are generated based on 85%, 90%, and 95% of the maximum force of the load-displacement curve, which yields an intermediate behavior compared to Label Types 1 and 2.

Fig. 5.4 illustrates the different points where failure is defined according to the label types, with the corresponding damage phase-fields. Label Types 1 and 2 are too extreme and lead to early and late prediction of failure, respectively. To address this issue, we further improve binary Label Type 3 by including an intermediate state, the onset of failure, based on percentages of the maximum load. Label Type 3 then becomes a multiple label definition, stated here as:

• Multiple Label Type 3: given the labels created based on 90% and 95% of the maximum force of the load-displacement curve (𝑙₉₀ and 𝑙₉₅), a pattern 𝑥ᵢ is assigned to class 1 (no failure) if the labels of the pattern based on both 𝑙₉₀ and 𝑙₉₅ are zero, to class 2 (onset of failure) when the label of the pattern is zero based on 𝑙₉₀ and one based on 𝑙₉₅, and to class 3 (failure) if the labels of the pattern based on both 𝑙₉₀ and 𝑙₉₅ are one (a minimal sketch of this assignment is given below).
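The following minimal Python sketch illustrates one possible implementation of Multiple Label Type 3. The interpretation that a binary label switches once the load drops below the given fraction of the peak force on the post-peak (softening) branch, as well as the synthetic load history, are assumptions made only for this example and not the exact criterion implementation of this work.

```python
import numpy as np

def binary_labels(force, fraction):
    """Binary label per time step: 0 before, 1 after the load drops below
    `fraction` of the peak force on the post-peak branch (assumed
    interpretation of the percentage-of-maximum-force criterion)."""
    peak = np.argmax(force)
    labels = np.zeros(force.shape[0], dtype=int)
    below = np.where(force[peak:] < fraction * force[peak])[0]
    if below.size:
        labels[peak + below[0]:] = 1
    return labels

def multiple_label_type3(force):
    """Class 1 (no failure), 2 (onset of failure), 3 (failure) from the
    90% / 95% load thresholds (Multiple Label Type 3)."""
    l90, l95 = binary_labels(force, 0.90), binary_labels(force, 0.95)
    classes = np.ones(force.shape[0], dtype=int)    # class 1: both labels zero
    classes[(l90 == 0) & (l95 == 1)] = 2            # class 2: onset of failure
    classes[(l90 == 1) & (l95 == 1)] = 3            # class 3: failure
    return classes

# Synthetic load history (hypothetical values): rise to a peak, then soften.
t = np.linspace(0.0, 1.0, 2000)
force = 12000 * np.sin(np.pi * np.clip(t / 0.9, 0.0, 1.0))
print(np.bincount(multiple_label_type3(force)))
```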
[Figure 5.4 (Left) Load-displacement curve for case 1, where we identify the three points at which the labels change from 0 to 1 according to the different criteria: (a) load-displacement curve; (b) Label 1, 𝑡 = 0.39 s; (c) Label 2, 𝑡 = 0.51 s; (d) Label 3, 𝑡 = 0.47 s. (Right) Respective damage phase-fields corresponding to the positions indicated in the curve. We note that Label Type 3, based on a threshold of 90% of the maximum force, lies between the first two criteria. In Label 1, the damage field is still too smooth, while in Label 2, failure is far too advanced.]

5.3.2.2 Label definition according to damage threshold concept

We also propose a label definition based on a damage threshold of the degradation function 𝑔(𝜑), where three different thresholds (i.e., 𝑅₁ = 1, 𝑅₂ = 0.92, and 𝑅₃ = 0.85) are empirically selected based on the simulations. Accordingly, we generate labels by tracking 𝑔(𝜑) on all sensing nodes, and following the rule:

• Multiple Label Type 4: a given sensor node 𝑆ᵢ is assigned index 𝑎 when 𝑅₁ ≥ 𝑔(𝜑) > 𝑅₂, index 𝑏 if 𝑅₂ ≥ 𝑔(𝜑) > 𝑅₃, and index 𝑐 if 𝑅₃ ≥ 𝑔(𝜑). Once the noted indices for all sensor nodes at a given time step (i.e., the features of pattern 𝑥ᵢ) are determined, the number of occurrences of each index 𝑎 to 𝑐 is computed. Pattern 𝑥ᵢ is then classified as class 1 if the count of index 𝑎 is larger than those of indices 𝑏 and 𝑐, as class 2 when the count of index 𝑏 is greater than those of indices 𝑎 and 𝑐, and so on. This label definition is motivated by the neighboring-effect concept (i.e., a group of sensor nodes), which helps eliminate the effect of uncertainty and faulty sensors.

5.4 ML Algorithmic Framework

We develop a supervised ML algorithmic framework for the interpretation of time-series data generated from the phase-field model. The proposed ML framework, presented in Fig. 5.5, is based on the integration of a PR scheme and ML algorithms. According to the PR scheme, sensor node responses (i.e., time-series data of the degradation function 𝑔(𝜑) = (1 − 𝜑)²) at each time step are represented as a pattern, along with the corresponding label. The input to the learning framework is thus a matrix 𝑀 of dimension 𝑚 × 𝑛, where 𝑚 denotes the number of time steps and 𝑛 represents the number of sensor nodes. Consequently, each row of the matrix denotes a pattern, and the dimension of the PR problem is 𝑛 (i.e., a pattern with 𝑛 features).

[Figure 5.5 Schematic illustration of the proposed ML framework. A pattern recognition scheme is introduced to represent time-series data of the damage degradation function 𝑔(𝜑) = (1 − 𝜑)², extracted at sensing nodes, as a pattern. The 𝑘-NN and ANN algorithms are employed for failure classification using the recognized patterns. In the 𝑘-NN analysis, classification is performed by determining the 𝑘-nearest vote vector. An ANN provides a map between the inputs and outputs through determination of the weights using input and output patterns.]

After generating the noted matrix 𝑀, the ML algorithms, 𝑘-NN and ANN, are used for failure/damage classification with the 𝑚 identified patterns. By introducing ML algorithms for pattern recognition, as opposed to relying exclusively on the load-displacement curve traditionally employed in failure mechanics, we generalize the framework, allowing failure detection under complex loading conditions, not restricted to the canonical examples presented, and in cases where load-displacement data is not even available.
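As an illustration of how the damage-threshold rule of Multiple Label Type 4 acts on one row of the pattern matrix 𝑀, consider the following minimal Python sketch; the tie-breaking rule and the synthetic matrix are assumptions made only for this example.

```python
import numpy as np

R1, R2, R3 = 1.0, 0.92, 0.85   # thresholds on g(phi) from Section 5.3.2.2

def multiple_label_type4(pattern):
    """Assign class 1/2/3 to one pattern (a row of the m-by-n matrix M)
    by counting the sensor indices a, b, c and taking the majority.
    Ties default to the lower class via np.argmax; the tie-breaking rule
    is not specified in the text and is an assumption here."""
    g = np.asarray(pattern)
    count_a = np.sum((g <= R1) & (g > R2))   # index a: R1 >= g > R2
    count_b = np.sum((g <= R2) & (g > R3))   # index b: R2 >= g > R3
    count_c = np.sum(g <= R3)                # index c: g <= R3
    return 1 + int(np.argmax([count_a, count_b, count_c]))

# Label every pattern (time step) of a synthetic pattern matrix M.
rng = np.random.default_rng(2)
M = np.clip(1.0 - np.linspace(0, 1, 200)[:, None] * rng.random((1, 150)), 0, 1)
labels = np.array([multiple_label_type4(row) for row in M])
print(np.bincount(labels))
```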
For the same reason of generality, we underline the importance of Multiple Label Type 4, which depends exclusively on damage sensor data. It should be noted that 𝑘-NN and ANN are used in this research due to their effective and reliable performance, while remaining computationally efficient. ANN is especially advantageous in this application, where we have a phase-field damage model that is nonlinear and associated with salient sources of parametric and model-form uncertainties that could propagate to the solution. Therefore, we use ANN over other comparable methods to permit a systematic study of uncertainty propagation and sensitivity of failure detection under noisy data in future works. The theoretical and mathematical details of 𝑘-NN and ANN can be found in the published literature [213–216].

The dataset for the 𝑘-NN and ANN analysis is divided into three subsets, namely training, validation, and test. The training set is used to fit the ML classifiers, while the validation set is used to compute the optimal learning parameters. The performance of the ML classifiers with optimal parameters is then assessed on the test set. The performance of the 𝑘-NN and ANN algorithms is measured using the detection performance rate defined in the following equation:

\[
\text{Classification accuracy} = \frac{\text{Number of patterns correctly classified}}{\text{Total number of identified patterns}}.
\tag{5.7}
\]

Different sizes of data subsets are considered herein to evaluate the effect of this factor on the performance of the ML algorithms. Accordingly, the five different combinations listed below are defined, and the accuracy of 𝑘-NN and ANN is determined based on each combination.

• Comb 1: training & validation 65% and test 35%.
• Comb 2: training & validation 70% and test 30%.
• Comb 3: training & validation 75% and test 25%.
• Comb 4: training & validation 80% and test 20%.
• Comb 5: training & validation 85% and test 15%.

5.5 Results and Discussion

The performance of the developed ML framework in terms of predicting the presence and location/pattern of failure is evaluated with time-series data of the degradation function 𝑔(𝜑) = (1 − 𝜑)² generated from the phase-field model. To this aim, the framework is initially trained and tested using each one of the six representative failure cases (see 5.3.1), where the presence of failure is detected for each case, along with the corresponding accuracy. In the next analysis phase, to detect the location of failure, the ML algorithms (i.e., 𝑘-NN and ANN) are trained using data from all six failure cases, and the classification accuracy is determined on test data, leading to identification of the pattern of failure. The following subsections present the classification results of the ML framework employing multiple labels generated according to Multiple Label Types 3 and 4 (see 5.3.2).

5.5.1 Results with 𝑘-NN

5.5.1.1 Detection of the Presence of Failure

As noted in 5.4, the proposed ML framework is trained and tested using different sizes of data subsets. For the 𝑘-NN analysis, 10-fold cross-validation is used. Patterns identified with the PR scheme, along with the corresponding labels, are used as input to the algorithmic framework in order to predict the presence of failure. To further explore the performance of the 𝑘-NN algorithm, the optimal number of neighbors 𝑘 needs to be determined.
In this context, cases 1 to 3, representing different failure types/locations (see Fig. 5.2), are considered, based on which the performance of the algorithm is evaluated with varying 𝑘. The 𝑘-NN classification results based on Multiple Label Types 3 and 4 are shown in Fig. 5.6. As can be seen, the accuracy decreases as the number of neighbors 𝑘 increases. We choose 𝑘 = 2 for the subsequent analyses, and we will later check that this is the optimal number of neighbors for the detection of failure location. Furthermore, the optimal distance is found to be the cosine distance, which results in better accuracy compared to other distance functions.

[Figure 5.6 𝑘-NN classification accuracy for cases 1 to 3 with different numbers of neighbors 𝑘 (1, 2, 4, 8, 16): (a) accuracy based on Multiple Label Type 3, (b) accuracy based on Multiple Label Type 4.]

Classification results with different sizes of data subsets are presented in Fig. 5.7 for failure case 3, from which it can be observed that the highest accuracy is achieved based on combinations 2, 3 and 6, so we choose Comb 2, i.e., training & validation 70% and test 30%, for all further results.

[Figure 5.7 𝑘-NN classification results for failure case 3 with different sizes of data subsets (Comb 1 to Comb 5) and Multiple Label Types 3 & 4.]

The performance of the proposed ML framework employing 𝑘-NN with multiple labels is assessed for all failure cases (i.e., cases 1 to 6), where the classification accuracy on test data is reported for each case (see Fig. 5.8). Clearly, the performance of the framework is acceptable, such that the highest accuracy based on Multiple Label Types 3 and 4 is 100% for failure cases 5 and 3, respectively.

[Figure 5.8 𝑘-NN classification results for the different failure cases based on Label Types 3 & 4.]

To better visualize the classification results, a confusion matrix containing information about actual and predicted classification is determined. Each column of the confusion matrix represents the patterns in a predicted class, whereas each row denotes the patterns in an actual class. The confusion matrices containing detailed classification results for cases 5 and 1 are depicted in Fig. 5.9. As can be seen, the 𝑘-NN method performs well on all classes, including class 2, denoting onset of failure, which is of primary interest for early detection of failure in real-world applications.

[Figure 5.9 Confusion matrix on test data with 𝑘-NN: (a) Case 5 and Multiple Label Type 3, (b) Case 1 and Multiple Label Type 4.]

[Figure 5.10 𝑘-NN classification accuracy with different numbers of neighbors 𝑘 (1, 2, 4, 8, 16), based on Multiple Label Types 3 and 4.]
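The 𝑘-NN configuration described above can be sketched with scikit-learn as follows. The pattern matrix and labels are synthetic stand-ins, and the split corresponds to Comb 2, so this is an illustrative setup rather than the exact pipeline used to produce the reported accuracies.

```python
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Hypothetical pattern matrix M (m time steps x n sensors) and class labels,
# e.g., from the pattern-generation and labeling sketches given earlier.
rng = np.random.default_rng(1)
M = rng.random((2000, 150))
labels = rng.integers(1, 4, size=2000)

# Comb 2: 70% training & validation, 30% test.
X_train, X_test, y_train, y_test = train_test_split(
    M, labels, test_size=0.30, random_state=0)

# Cosine distance and k = 2, the configuration reported as optimal;
# 10-fold cross-validation on the training set guides the choice of k.
knn = KNeighborsClassifier(n_neighbors=2, metric="cosine")
cv_scores = cross_val_score(knn, X_train, y_train, cv=10)

knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
# Eq. (5.7): correctly classified patterns / total identified patterns.
print("CV accuracy: %.3f, test accuracy: %.3f"
      % (cv_scores.mean(), accuracy_score(y_test, y_pred)))
```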
5.5.1.2 Detection of the Location/Pattern of Failure

An attempt is made to detect the location of failure, enabling the framework to predict the pattern of failure. In this regard, the three failure cases 1 to 3, representing different failure types, are considered for the analysis. Accordingly, nine classes/labels are defined, as shown in Table 5.2, using Multiple Label Type 4 (multiple labels based on the damage threshold concept). We also study the effect of the number of neighbors 𝑘 in this context. We present the accuracy results in Fig. 5.10, in which we observe that 𝑘 = 2 is indeed the optimal number of neighbors, corroborating the choice made in the previous section. The confusion matrix showing the classification results is presented in Fig. 5.11. As can be observed, the total accuracy is 98.3% using the multiple labels/classes shown in Table 5.2 (𝑘-NN detects the location of failure with 98.3% accuracy). The results indicate the overall efficient performance of 𝑘-NN in detecting the onset of failure (classes 2, 5 and 8) and failure (classes 3, 6 and 9). The incorrect classifications are more concentrated in the classes denoting no failure (classes 1, 4 and 7), where the method incorrectly identifies the location of failure in a few data points, because in the early stages of the simulation the damage field is similar among the different cases. This is not an issue, since the critical part is the onset of failure. Moreover, the lowest classification accuracy is 86.5% for class 7 (see Fig. 5.11), which is an acceptable performance for a classification algorithm.

Table 5.2 Illustration of the label/class definition for detection of the location/pattern of failure. Each row corresponds to the label of a pattern at a given time step.

Failure Case   Label based on Label Type 4   New Label/Class
1              0                             1
1              0.5                           2
1              1                             3
2              0                             4
2              0.5                           5
2              1                             6
3              0                             7
3              0.5                           8
3              1                             9

[Figure 5.11 Confusion matrix with 𝑘-NN classification results for detection of the location/pattern of failure based on Multiple Label Type 4; the total accuracy is 98.3%, with a minimum class-wise accuracy of 86.5% (class 7).]

5.5.2 Results with ANN

5.5.2.1 Detection of the Presence of Failure

The performance of the proposed ML framework employing the ANN algorithm is evaluated in terms of detecting the presence of failure.

[Figure 5.12 ANN classification results for the different failure cases based on Multiple Label Types 3 & 4.]

[Figure 5.13 Confusion matrix on test data with ANN: (a) Case 5 and Multiple Label Type 3, (b) Case 1 and Multiple Label Type 4.]

On this basis, a two-layer (i.e., one hidden layer with 5 neurons
and an output layer) feed-forward neural network with sigmoid activation function is used. Classification results based on Multiple Label Types 3 and 4 are presented in Fig. 5.12, from which it can be seen that ANN leads to accuracy comparable to 𝑘-NN (see Fig. 5.8). Detailed classification results with ANN are presented in the confusion matrix shown in Fig. 5.13. The results indicate that ANN effectively detects the presence of failure with total accuracy of 99.90% and 100% using Multiple Label Types 3 and 4, respectively.

[Figure 5.14 Confusion matrix with ANN classification results for detection of the location/pattern of failure based on Multiple Label Type 4; the overall accuracy is 94.3%.]

5.5.2.2 Detection of the Location/Pattern of Failure

Once the presence of failure is detected, the ANN algorithm is employed to identify the location/pattern of failure. On this basis, the multiple labels defined in Table 5.2 are used for supervised classification with the ANN algorithm. A two-layer (i.e., one hidden layer with 5 neurons and an output layer) feed-forward neural network with sigmoid activation function is used as the ANN architecture to detect the location of failure. The confusion matrix with detailed classification results is presented in Fig. 5.14, from which it can be observed that the location/pattern of failure can be successfully detected using ANN with high accuracy. Similarly to 𝑘-NN, the majority of misclassifications with ANN belong to classes representing no failure (classes 1 and 4 in Fig. 5.14), due to the similarity of the damage field prior to damage localization and crack initiation. In the other classes, ANN still performs successfully, with a minimum accuracy of 79.3% when detecting the onset of failure (class 5).

5.5.3 Discussion of Deterministic Results

From the observation of the classification results employing 𝑘-NN and ANN to predict failure in damage phase-field models, we see consistent results with high accuracy. This is especially the case for 𝑘-NN, which shows less uncertainty in classifying the patterns. Yet, we must highlight that the nature of the data used in this work plays a major role in such satisfactory results. Several nuances and complications of classification algorithms, such as the bias-variance trade-off, sensitivity, and the distribution/number of data points, are either justified, or less critical, due to the characteristics of phase-field models. In essence, phase-field models are smooth representations of sharp, discontinuous interfaces. In the case of damage modeling, cracks are smoothed in the phase field as an intrinsic approximation of the model.
Therefore, as opposed to real data measurements, phase-field solutions evaluated at virtual sensing nodes are naturally smooth, introduce low noise, and consequently lead to less variance in the predictions. In this regard, the choice of the number and position of the virtual sensing nodes was guided by an a priori study of the failure patterns in Fig. 5.2. Different configurations of training data led to little change in the classification results. The distribution of sensors considered expert opinion, so as to provide the framework with data points associated with the different modes of failure. Such assessment considers that placing sensors in the regions around the bases of the specimen will not contribute to accuracy, since there is no damage accumulation in those areas. We remark that for a specific failure case, with model parameters appropriately inferred from experiments, fewer sensor nodes would be required, placed preferably in the critical areas (e.g., around the corners and in the central region).

With respect to the specific algorithms used in this work, the nature of phase-field data aids in the trade-off between bias and variance. Both 𝑘-NN and ANN have been extensively studied in the literature, and their bias-variance behavior is well understood. For 𝑘-NN, it has been demonstrated analytically that bias increases, and variance decreases, with an increasing number of neighbors [217, 218]. Since the smoothness of phase-field solutions does not introduce substantial variance, the choice of 𝑘 = 2 makes the trade-off by reducing the bias. In the case of ANN, we reduce bias by increasing the number of hidden units [217], so the number of neurons used is also justified.

Another source of potential errors is related to the number of time steps and their distribution among the classes. This issue shows up specifically when predicting the onset of failure, the most critical part of the failure process. Classification errors in this case may be due to the fewer data points available, since the onset of failure occurs rapidly. A possible remedy would be to refine the time-integration steps around the onset of failure. Additionally, we have shown that the algorithm incorrectly classifies cases of no failure among the different fracture patterns due to the enhanced smoothness early in the simulation. However, this issue is less catastrophic when compared to erroneously detecting the onset of failure, regardless of the fracture pattern.

Overall, the framework worked effectively with deterministic data, due to the nature of phase-field models and a reasonable choice of 𝑘-NN and ANN parameters to balance bias and variance. Since phase-field models present smooth features with low variance, we further assess the robustness of the framework by adding noise to the damage field solutions.

5.5.4 Uncertainty Quantification

The results presented in the previous sections consisted of smooth, deterministic input data in a single run of the ML algorithms. In this section we propagate the uncertainty associated with data sampling and the randomness related to the algorithms, and we assess the robustness and accuracy of both methods to variability in the data through the addition of Gaussian noise. This approach aims to verify the effectiveness of the framework in handling real-world data.

5.5.4.1 Algorithmic randomness

We first study the propagation of uncertainties related to the algorithms, still using deterministic data.
Such randomness appears in both 𝑘-NN and ANN due to the random division of the time-series data into training, validation and test sets. We need to choose a random division to avoid bias, especially in damage data, which shows pronounced temporal evolution trends. Furthermore, ANN also presents another source of uncertainty, related to the random initialization of the weights and biases in each neuron. We use the Monte Carlo (MC) method to run multiple classification problems for each algorithm, and compute the expected total classification accuracy and its standard deviation. We use Multiple Label Type 3 and run 1000 simulations for the detection of failure presence in each case (cases 1 to 6), and an additional 1000 classifications for the detection of failure location, using the classes from Table 5.2. We show the results in Table 5.3. We observe that 𝑘-NN performs better than ANN in this setting, since ANN incorporates another level of uncertainty from the random initial guesses of the neuron parameters. The randomness of data division alone does not affect the performance of either method in the failure location problem. For failure location, ANN is less accurate, but still within an acceptable range.

Table 5.3 Total classification accuracy mean and standard deviation from algorithmic randomness (%).

                                  𝑘-NN                 ANN
                                  Mean    Std. Dev     Mean    Std. Dev
Failure presence - Case 1         99.86   0.16         99.82   1.53
Failure presence - Case 2         99.86   0.16         99.88   0.14
Failure presence - Case 3         99.87   0.14         99.84   0.27
Failure presence - Case 4         99.86   0.16         99.82   0.31
Failure presence - Case 5         99.87   0.14         99.89   0.15
Failure presence - Case 6         99.86   0.15         99.88   0.16
Failure location - Cases 1/2/3    98.28   0.31         84.58   6.00

5.5.4.2 Noisy data

To assess the performance of the proposed ML framework with noisy data, the time-series data are corrupted by adding Gaussian-distributed noise with different standard deviations to the damage data 𝜑. We run 1000 MC simulations for the 𝑘-NN and ANN algorithms using Multiple Label Type 3, and compute the expectation and standard deviation of the total accuracy. We propagate uncertainty for failure presence detection in case 3, and for detection of failure location, and show the results in Fig. 5.15.

[Figure 5.15 Mean total classification accuracy and standard deviation as a function of the noise standard deviation for (a) detection of failure presence, case 3, and (b) detection of failure location.]

We observe that increasing noise levels decrease the mean accuracy while enlarging the uncertainty range, for both methods. We see that even small-intensity noise drops the accuracy to lower levels when compared to the mean values based only on algorithmic randomness. In other words, input-data uncertainty dominates over the algorithms' uncertainty. For failure presence detection in case 3, we see that ANN performs better than 𝑘-NN for low noise levels, while 𝑘-NN is more robust under higher noise magnitudes, resulting in higher mean accuracy and lower standard deviation. For case 3, the lowest accuracy was still above 75%, showing robustness with 3 classes. Similarly, when we look at the detection of failure location, ANN is superior to 𝑘-NN for low noise levels, and shows higher mean accuracy even for high noise intensity, yet with more uncertainty. However, we cannot claim good performance of the algorithms with 9 classes under high noise levels, under this specific choice of algorithmic setup.
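A minimal Python sketch of this Gaussian-noise Monte Carlo study is given below, using the 𝑘-NN and ANN configurations described earlier. The reduced number of runs, the synthetic data, and the helper name `noisy_accuracy` are assumptions made only to keep the example self-contained; the dissertation results use 1000 MC samples of the actual phase-field data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def noisy_accuracy(phi_sensors, labels, sigma, n_runs=50, seed=0):
    """Monte Carlo estimate of the mean/std classification accuracy when
    Gaussian noise of standard deviation `sigma` is added to the damage
    data before forming the degradation patterns g(phi)."""
    rng = np.random.default_rng(seed)
    models = {
        "k-NN": lambda: KNeighborsClassifier(n_neighbors=2, metric="cosine"),
        "ANN": lambda: MLPClassifier(hidden_layer_sizes=(5,),
                                     activation="logistic", max_iter=2000),
    }
    acc = {name: [] for name in models}
    for _ in range(n_runs):
        noisy = np.clip(phi_sensors + rng.normal(0.0, sigma, phi_sensors.shape), 0, 1)
        X = (1.0 - noisy) ** 2                    # degradation patterns
        X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.30)
        for name, make in models.items():
            clf = make().fit(X_tr, y_tr)
            acc[name].append(accuracy_score(y_te, clf.predict(X_te)))
    return {name: (np.mean(v), np.std(v)) for name, v in acc.items()}

# Synthetic damage histories and 3-class labels (hypothetical stand-ins).
rng = np.random.default_rng(3)
phi_sensors = np.clip(np.linspace(0, 1, 600)[:, None] * rng.random((1, 150)), 0, 1)
labels = np.digitize(np.linspace(0, 1, 600), [0.6, 0.85]) + 1   # classes 1/2/3
print(noisy_accuracy(phi_sensors, labels, sigma=0.1, n_runs=5))
```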
This motivates a more systematic approach to the uncertainty and sensitivity of ML algorithms under noisy phase-field data, which shall be the focus of future studies.

CHAPTER 6

SUMMARY AND FUTURE WORKS

The main objective of this work was to develop a multi-scale, data-driven framework for the modeling, analysis, and simulation of material failure. The framework considers the stochastic processes occurring at the small scales as sources of uncertainty that must be propagated and quantified at the component level. We developed a methodology to obtain fast estimates of the uncertainty in dislocation mobility at the nano-scale through a surrogate model that learns a stochastic process from high-fidelity molecular dynamics simulations. At the meso-scale, we studied the collective dislocation dynamics as a probabilistic model governed by nonlocal equations. We developed a machine learning framework to learn the parameters of the nonlocal kernel using discrete dislocation dynamics simulation data, confirming through the resulting operator that failure processes have an anomalous character. We confirmed the need for alternative operators through the inspection of uncertainties in stochastic damage and fatigue phase-field models at the continuum level, where the parameters associated with the free-energy density show more sensitivity in the solution. Then, we developed a Machine Learning framework at the macro-scale that predicts failure even under the presence of noise.

6.1 Concluding Remarks

In Chapter 2, we developed a data-driven framework for constructing a surrogate model of dislocation glide. Atomistic simulations of dislocation motion provide the statistics that inform the underlying stochastic process of the surrogate. This is achieved firstly through the coarse-graining of the MD domain using a graph-theoretical representation. Over this network, the dislocation is idealized as a random walker jumping between the nodes, where the waiting time distribution is parameterized directly from time-series data obtained in the MD simulations. The random walk over the network is simulated through a KMC algorithm based on the waiting times obtained empirically. By tracking the dislocation position over time, we computed the dislocation velocity for each applied shear stress, which in turn leads to the estimation of the dislocation mobility. We highlight the following observations from the model and its numerical results:

• The construction followed the assumption of a memoryless, Markovian process governing the dislocation motion, which was a sufficient description based on an estimate of average waiting times from empirical data.

• The estimation of rate constants, often a major difficulty in the application of KMC, was performed directly through MD data. We compared three different methods that yielded nearly identical results.

• From the computed rates, the actual simulation of the stochastic process resulted in dislocation motion in agreement with the trajectories simulated by MD. Next, the computation of mobility through the surrogate also had excellent agreement with the original atomistic estimates.

• Simulation through the surrogate achieved remarkable speedup compared with MD computation times.

• Uncertainty levels depend on the number of data points used to construct the surrogate. We provided uncertainty estimates for the mobility through the surrogate, taking into account the variance of the underlying stochastic process.
The proposed framework establishes a meaningful bridge for coupling scales, where not only the value of mobility is provided, but also its associated uncertainty. Through the description of dislocation motion as a stochastic process informed by high-fidelity data, we can propagate the uncertainty associated with mobility estimations, or any other quantity of interest, even with a limited number of MD samples. As a consequence, this framework acts as a tool for more predictive multi-scale material characterization.

In Chapter 3, we proposed a data-driven nonlocal model for the simulation of dislocation position probability densities. We generated shifted dislocation position data in the form of particle trajectories from high-fidelity two-dimensional DDD simulations under creep with different load levels, with and without multiplication mechanisms. From the Lagrangian particle trajectories we estimated the evolution of the PDFs through an Adaptive Kernel Density Estimation method. Last, we developed a bi-level ML algorithm to obtain the kernel parameterization for the proposed nonlocal operator that describes the PDF's evolution. The developed approach integrates the high-fidelity dynamics of dislocations at the meso-scale with a continuum probabilistic frame in a fluid-limit sense. We make the following observations from the integrated framework:

• We recovered the dislocation velocity statistics available in the literature from our two-dimensional DDD simulations. We identified the same exponent of around 2.4 for the power-law decay of the velocity distribution tail when dislocation multiplication is present, and a sharper exponent, close to 3, without the effect of multiplication mechanisms. The statistics from DDD show similarities among the studied cases.

• The PDF estimation from dislocation trajectories makes evident that the presence of multiplication sources greatly impacts the probability distributions, increasing the heaviness of the tails and implying greater nonlocality.

• Our bi-level algorithm, based on a Least-Squares approach for the computation of the optimal nonlocal diffusion coefficient for every pair of 𝛼 and 𝛿, performed well with a manufactured solution, and proved to be robust for data-driven PDFs, considering different train-test splits and initial guess combinations.

• The large horizon parameter 𝛿 found among the cases confirms the nonlocal nature of dislocation dynamics, even from a probabilistic perspective. Furthermore, the nonlocal kernel power-law exponent obtained matches the tail decay of the dislocation velocity distributions computed in the DDD simulations. This establishes a well-defined path between the anomalous behavior observed in particle meso-scale dynamics and the upscaling of anomalous effects to a continuum, macro-scale frame of reference.

Although we used single-glide mechanisms, we note that shifted particle positions may be obtained in more general, multi-slip systems. Since the goal of this framework is to obtain the probability densities in the fluid limit, the same procedure could be applied to each slip system in a complex crystal. We further point to the fact that the bulk dynamics adopted here can also be extended to dislocation motion near free surfaces, crack tips, or grain boundaries. In such cases, one could expect the PDFs to show non-zero skewness, which could be naturally accommodated by a different choice of (nonsymmetric) kernel in the nonlocal operator, suitable for skewed, possibly one-sided distributions.
The nonlocal model of dislocation motion at the meso-scale proposed in this work opens up the opportunity for fast computations of quantities of interest compared to the high-fidelity simulations. The implications of nonlocal dislocation models are readily applicable to the study of visco-elasticity and visco-plasticity, where fractional-order models have been successfully applied to model power-law relaxation including damage effects [47]. One of the main connections to be established is the ultimate effect of the different regimes of dislocation dynamics on the evolution of macro-scale free-energy potentials during failure, for instance in phase-field models [104]. Around crack tips and other dislocation generation objects, such as holes, pores, or other micro-cracks, we expect the macro-scale behavior to also be anomalous. More broadly, the methods proposed here can be essential tools to connect other physical processes, from a wider range of applications, to the generation of corresponding nonlocal operators. Finally, the proposed bi-level optimization approach is an effective way of reducing the computational burden of optimizing in a high-dimensional parameter space, and proves to be robust with both manufactured and simulated datasets.

In Chapter 4, we developed an uncertainty quantification and sensitivity analysis framework for stochastic damage and fatigue phase-field equations. We used Monte Carlo sampling and Probabilistic Collocation to compute the expectation and standard deviation of the damage field, and the expected local sensitivity. To compute the local sensitivity at each collocation point, complex-step differentiation was used. The Probabilistic Collocation method poses a great advantage over random sampling methods such as MC, significantly reducing computational costs with a simple implementation. We presented two representative examples to study the uncertainty propagation in the model. We detected two different behaviors of the model based on geometry:

• In the single-edge notched tensile test case, where we already know the crack location and direction, the uncertainty is reduced to the speed of crack propagation. Interestingly, that is not only controlled by the rate-of-change-of-damage parameter 𝑐, but also indirectly by the Griffith energy 𝑔𝑐. Uncertainty can be inferred by local sensitivity analysis, which shows the same order of parameter influence. When we compute the global sensitivity indices, uncertainty around the crack tip is also controlled by 𝑔𝑐 and 𝑐;

• In a geometry with no unique crack initiation location nor a determined crack path, such as the tensile test specimen, the fatigue coefficient 𝑎 and, most importantly, 𝑔𝑐 and 𝛾 are the most sensitive parameters. High uncertainty is dominated by the influence of 𝑔𝑐 and 𝛾, which help determine the speed and, mostly, the direction of damage transport, due to the lack of a unique and well-known crack path.
In Chapter 5, we presented a phase-field based machine learning (ML) framework developed to predict failure of brittle materials. Time-series data are generated according to nodal damage results from finite element simulations of a tensile test specimen. We assessed the performance of the proposed ML framework employing PR scheme and ML algorithms (𝑘-NN and ANN) for 128 different failure types, and with multiple labels generated based on load-displacement curve and damage threshold concept. We draw the following conclusions from the carried out study: • Results indicate the acceptable performance of the proposed framework with multiple labels, in which a PR scheme is effectively used to represent time-series data of degradation function 𝑔(𝜑) = (1 − 𝜑) 2 as a pattern. This choice of time-series data is effective since it directly complies with the material softening behavior. • Both 𝑘-NN and ANN were efficient to predict the presence and location of failure. The majority of errors in detection of failure location were concentrated in classes representing no failure, due to smoothness and similarity of damage field early in the simulations. • Uncertainty related to input data noise dominates over algorithmic randomness uncertainty. The framework showed robustness to noise when detecting failure presence, and showed acceptable accuracy with low noise levels when predicting failure location. In general, with noisy data ANN outperforms 𝑘-NN. Results of this study demonstrate the satisfactory performance of the developed algorithmic framework and the applicability of ML for failure prediction with damage phase-field time-series data. 6.2 Future Works Following the research directions initiated in this work, we believe that the proposed framework has potential applications in multi-scale failure analysis in different disciplines under the perspective of uncertainty propagation and stochastic modeling. A potential direction is to use the probabilistic model of meso-scale dislocation dynamics to learn macro-scale free-energy potentials associated to failure processes. A natural consequence of learning free-energy functions from the nonlocal dynamics is the development of variable-order nonlocal methods, where the nonlocality only manifests in critical regions with presence of failure precursors, such as dislocation multiplication. This approach has the potential of mitigating the model-form uncertainty observed in the global 129 sensitivity of damage models. Finally, the propagation of uncertainty across the scales has the potential of solving practical engineering problems where the knowledge of lower-scale statistics provide more predictable failure models. The combination of multi-scale uncertainty propagation, with machine learning algorithms for failure detection and reliability analysis using real-time data could enhance the accuracy of failure predictions in diverse applications. 130 APPENDIX 131 APPENDIX DISCRETIZATION OF THE 2-D PHASE-FIELD MODEL OF DAMAGE AND FATIGUE A.1 Spatial Discretization We approximate a deterministic solution of the damage and fatigue phase-field model over its spatial domain Ω𝑑 with finite element method, where the semi-discrete form of Equations (4.2) is obtained from the weak Galerkin form after multiplication by test functions and integration over the domain. A more detailed derivation of the spatial discretization in 2D can be found in Chiarelli et al (2017) [162]. 
Denoting \(\ddot{\hat{\mathbf{u}}} = \dot{\hat{\mathbf{v}}}\), we write the semi-discrete form for an element 𝑘 as

\[
\begin{cases}
\mathbf{M}^k \ddot{\hat{\mathbf{u}}}^k = \mathbf{K}_u^k \hat{\mathbf{u}}^k + \mathbf{K}_v^k \hat{\mathbf{v}}^k + \mathbf{w}_a^k + \mathbf{M}^k \hat{\mathbf{f}}^k, \\[4pt]
\mathbf{M}_\varphi^k \dot{\hat{\varphi}}^k = \mathbf{P}_\varphi^k + \mathbf{K}_c^k \hat{\varphi}^k + \mathbf{w}_b^k + \mathbf{w}_c^k, \\[4pt]
\mathbf{M}_{\mathcal{F}}^k \dot{\hat{\mathcal{F}}}^k = \mathbf{w}_d^k,
\end{cases}
\tag{A.1}
\]

where M, M_𝜑 and M_F are the mass matrices related to displacement, damage and fatigue. In the equation of motion, K_u is the elasticity stiffness matrix degraded by damage, K_v is associated with viscous damping, and w_a is a term related to the gradient of damage that affects the displacement field. The term P_𝜑 in the damage evolution equation includes the Laplacian and the potential H′(𝜑). The influence of displacement on damage is represented by K_c and w_b. The effect of the potential H′_f(𝜑) is considered in the term w_c. Finally, w_d is the operator on the right-hand side of the fatigue evolution equation. We apply the standard assembly operation to obtain the global form of the operator matrices, and we drop the superscript 𝑘 in the global sense.

From the semi-discrete system of equations (A.1), the solution at each element is written as a linear combination of local nodal basis functions, such that

\[
\mathbf{u}^k = \mathbf{N}^k \hat{\mathbf{u}}^k, \qquad
\mathbf{v}^k = \mathbf{N}^k \hat{\mathbf{v}}^k, \qquad
\varphi^k = \mathbf{N}_\varphi^k \hat{\varphi}^k, \qquad
\mathcal{F}^k = \mathbf{N}_{\mathcal{F}}^k \hat{\mathcal{F}}^k.
\tag{A.2--A.5}
\]

Constructing a mesh of linear triangles, the nodal solutions are defined as

\[
\hat{\mathbf{u}}^k = \left[ u^k_{1x} \;\; u^k_{1y} \;\; u^k_{2x} \;\; u^k_{2y} \;\; u^k_{3x} \;\; u^k_{3y} \right]^T, \qquad
\hat{\mathbf{v}}^k = \left[ v^k_{1x} \;\; v^k_{1y} \;\; v^k_{2x} \;\; v^k_{2y} \;\; v^k_{3x} \;\; v^k_{3y} \right]^T,
\]
\[
\hat{\varphi}^k = \left[ \varphi^k_1 \;\; \varphi^k_2 \;\; \varphi^k_3 \right]^T, \qquad
\hat{\mathcal{F}}^k = \left[ \mathcal{F}^k_1 \;\; \mathcal{F}^k_2 \;\; \mathcal{F}^k_3 \right]^T,
\tag{A.6--A.10}
\]

with interpolation matrices

\[
\mathbf{N}^k =
\begin{bmatrix}
N_1 & 0 & N_2 & 0 & N_3 & 0 \\
0 & N_1 & 0 & N_2 & 0 & N_3
\end{bmatrix}, \qquad
\mathbf{N}_\varphi^k = \mathbf{N}_{\mathcal{F}}^k =
\begin{bmatrix}
N_1 & N_2 & N_3
\end{bmatrix}.
\tag{A.11--A.13}
\]

Gradients of displacement, velocity and damage are approximated by linear combinations of the shape function derivatives,

\[
\mathbf{E}^k = \mathbf{B}_u^k \hat{\mathbf{u}}^k, \qquad
\mathbf{D}^k = \mathbf{B}_v^k \hat{\mathbf{v}}^k, \qquad
\nabla\varphi^k = \mathbf{B}_\varphi^k \hat{\varphi}^k,
\tag{A.14--A.16}
\]

with derivative matrices defined as

\[
\mathbf{B}_u^k =
\begin{bmatrix}
N_{1,x} & 0 & N_{2,x} & 0 & N_{3,x} & 0 \\
0 & N_{1,y} & 0 & N_{2,y} & 0 & N_{3,y} \\
N_{1,y} & N_{1,x} & N_{2,y} & N_{2,x} & N_{3,y} & N_{3,x}
\end{bmatrix},
\tag{A.17}
\]
\[
\mathbf{B}_v^k =
\begin{bmatrix}
N_{1,x} & 0 & N_{2,x} & 0 & N_{3,x} & 0 \\
0 & N_{1,y} & 0 & N_{2,y} & 0 & N_{3,y} \\
\tfrac{1}{\sqrt{2}}N_{1,y} & \tfrac{1}{\sqrt{2}}N_{1,x} & \tfrac{1}{\sqrt{2}}N_{2,y} & \tfrac{1}{\sqrt{2}}N_{2,x} & \tfrac{1}{\sqrt{2}}N_{3,y} & \tfrac{1}{\sqrt{2}}N_{3,x}
\end{bmatrix},
\tag{A.18}
\]
\[
\mathbf{B}_\varphi^k =
\begin{bmatrix}
N_{1,x} & N_{2,x} & N_{3,x} \\
N_{1,y} & N_{2,y} & N_{3,y}
\end{bmatrix}.
\tag{A.19}
\]

From those definitions, we can express the mass, stiffness and remaining operator matrices from Equation (A.1) as

\[
\mathbf{M}^k = \int_{\Omega^k} \mathbf{N}^T \mathbf{N} \, d\Omega^k; \qquad
\mathbf{M}_\varphi^k = \int_{\Omega^k} \mathbf{N}_\varphi^T \mathbf{N}_\varphi \, d\Omega^k; \qquad
\mathbf{M}_{\mathcal{F}}^k = \int_{\Omega^k} \mathbf{N}_{\mathcal{F}}^T \mathbf{N}_{\mathcal{F}} \, d\Omega^k;
\tag{A.20--A.22}
\]
\[
\mathbf{K}_u^k = -\frac{1}{\rho_0}\int_{\Omega^k} \left(1 - \mathbf{N}_\varphi^k \hat{\varphi}^k\right)^2 \left(\mathbf{B}_u^k\right)^T \mathbf{C}\, \mathbf{B}_u^k \, d\Omega^k;
\tag{A.23}
\]
\[
\mathbf{K}_v^k = -\frac{b}{\rho_0}\int_{\Omega^k} \left(\mathbf{B}_v^k\right)^T \mathbf{B}_v^k \, d\Omega^k;
\tag{A.24}
\]
\[
\mathbf{P}_\varphi^k = -\frac{\gamma g_c}{\lambda}\int_{\Omega^k} \left(\mathbf{B}_\varphi^k\right)^T \mathbf{B}_\varphi^k \, d\Omega^k
- \frac{g_c}{\lambda\gamma}\int_{\Omega^k} \left(\mathbf{N}_\varphi^k\right)^T \mathbf{N}_\varphi^k \, d\Omega^k;
\tag{A.25}
\]
\[
\mathbf{K}_c^k = -\frac{1}{\lambda}\int_{\Omega^k} \left(\mathbf{B}_u^k \hat{\mathbf{u}}^k\right)^T \mathbf{C} \left(\mathbf{B}_u^k \hat{\mathbf{u}}^k\right) \left(\mathbf{N}_\varphi^k\right)^T \mathbf{N}_\varphi^k \, d\Omega^k;
\tag{A.26}
\]
\[
\mathbf{w}_a^k = \frac{\gamma g_c}{\rho_0}\int_{\Omega^k} \left(\mathbf{B}_\varphi^k \hat{\varphi}^k \otimes \mathbf{B}_\varphi^k \hat{\varphi}^k\right) : \mathbf{B}_u^k \, d\Omega^k;
\tag{A.27}
\]
\[
\mathbf{w}_b^k = \frac{1}{\lambda}\int_{\Omega^k} \left(\mathbf{B}_u^k \hat{\mathbf{u}}^k\right)^T \mathbf{C} \left(\mathbf{B}_u^k \hat{\mathbf{u}}^k\right) \left(\mathbf{N}_\varphi^k\right)^T d\Omega^k;
\tag{A.28}
\]
\[
\mathbf{w}_c^k = -\frac{1}{\lambda\gamma}\int_{\Omega^k} \left(\mathbf{N}_{\mathcal{F}}^k\right)^T \mathbf{N}_{\mathcal{F}}^k \hat{\mathcal{F}}^k \, d\Omega^k;
\tag{A.29}
\]
\[
\mathbf{w}_d^k = \frac{a}{\gamma}\int_{\Omega^k} \left(1 - \mathbf{N}_\varphi^k \hat{\varphi}^k\right) \mathbf{N}_\varphi^k \hat{\varphi}^k
\left[\left(\mathbf{C}\,\mathbf{B}_u^k \hat{\mathbf{u}}^k + b\,\mathbf{B}_v^k \hat{\mathbf{v}}^k\right) : \mathbf{B}_v^k \hat{\mathbf{v}}^k\right] d\Omega^k.
\tag{A.30}
\]

We consider the plane stress elasticity matrix C given by

\[
\mathbf{C} = \frac{E}{1-\nu^2}
\begin{bmatrix}
1 & \nu & 0 \\
\nu & 1 & 0 \\
0 & 0 & \frac{(1-\nu)}{2}
\end{bmatrix}.
\tag{A.31}
\]

We express the second-order tensor in Equation (A.27) as a vector using Voigt notation.
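For illustration, the following Python sketch assembles the basic element-level operators for one linear triangle (the shape-function gradient matrices and the consistent mass matrices of (A.17)–(A.22)). It follows standard linear-triangle constructions under that assumption and is not the implementation used in this work; the degraded stiffness and coupling operators (A.23)–(A.30) would be built by combining these blocks with the elasticity matrix (A.31) and the nodal fields.

```python
import numpy as np

def linear_triangle_operators(xy):
    """Element operators for one linear (3-node) triangle with nodal
    coordinates xy of shape (3, 2): shape-function gradients and the
    consistent mass matrices, following standard FEM formulas."""
    x, y = xy[:, 0], xy[:, 1]
    # Signed area and constant shape-function derivatives of the element.
    area = 0.5 * ((x[1]-x[0])*(y[2]-y[0]) - (x[2]-x[0])*(y[1]-y[0]))
    b = np.array([y[1]-y[2], y[2]-y[0], y[0]-y[1]]) / (2.0*area)   # dN_i/dx
    c = np.array([x[2]-x[1], x[0]-x[2], x[1]-x[0]]) / (2.0*area)   # dN_i/dy

    # B_phi (A.19): gradients of the scalar fields (damage, fatigue).
    B_phi = np.vstack([b, c])

    # B_u (A.17): strain-displacement matrix in Voigt form.
    B_u = np.zeros((3, 6))
    B_u[0, 0::2] = b
    B_u[1, 1::2] = c
    B_u[2, 0::2] = c
    B_u[2, 1::2] = b

    # Consistent mass matrix of the scalar fields, M_phi = M_F in (A.21)-(A.22).
    M_phi = area / 12.0 * (np.ones((3, 3)) + np.eye(3))

    # Vector mass matrix M of (A.20), expanded to the 6 displacement DOFs.
    M = np.zeros((6, 6))
    M[0::2, 0::2] = M_phi
    M[1::2, 1::2] = M_phi
    return B_u, B_phi, M_phi, M

# Example: unit right triangle; the mass-matrix entries sum to the area (0.5).
B_u, B_phi, M_phi, M = linear_triangle_operators(np.array([[0, 0], [1, 0], [0, 1]]))
print(M_phi.sum())
```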
A.2 Time Discretization

We adopt a semi-implicit time integration scheme, where we solve each equation separately using a suitable implicit method, treating nonlinear terms and other variable fields explicitly. The methodology is based on the work of Haveroth et al. (2018) [163], where a detailed derivation can be found. We split the solution time interval [0, 𝑇] into discrete time steps 𝑡ₙ with time increments given by Δ𝑡 = 𝑡ₙ₊₁ − 𝑡ₙ > 0, 𝑛 = 0, 1, . . . . We denote the global approximations for the variables at 𝑡ₙ₊₁ as

\[
\mathbf{u}_{n+1} = \hat{\mathbf{u}}(t_{n+1}), \qquad
\mathbf{v}_{n+1} = \dot{\hat{\mathbf{u}}}(t_{n+1}), \qquad
\varphi_{n+1} = \hat{\varphi}(t_{n+1}), \qquad
\mathcal{F}_{n+1} = \hat{\mathcal{F}}(t_{n+1}).
\tag{A.32--A.35}
\]

We first discuss the damage time integration. We use a backward Euler scheme to compute 𝜑ₙ₊₁ from Equation (A.1). The parameter 𝜆, the displacement and the fatigue are treated explicitly using values from time step 𝑡ₙ. This simplifies the solution and avoids the use of iterative methods to treat the nonlinearity. The evolution of damage is then obtained by solving the linear system

\[
\left[\mathbf{M}_\varphi - \Delta t \left(\mathbf{P}_\varphi + \mathbf{K}_c\right)\right] \varphi_{n+1}
= \mathbf{M}_\varphi \varphi_n + \Delta t \left(\mathbf{w}_b + \mathbf{w}_c\right).
\tag{A.36}
\]

With the updated damage field, we use the Newmark method to solve for displacement and velocity in the equation of motion. In the Newmark scheme, acceleration and velocity at time 𝑡ₙ₊₁ are approximated by

\[
\ddot{\mathbf{u}}_{n+1} = \alpha_1 (\mathbf{u}_{n+1} - \mathbf{u}_n) - \alpha_2 \dot{\mathbf{u}}_n - \alpha_3 \ddot{\mathbf{u}}_n,
\tag{A.37}
\]
\[
\dot{\mathbf{u}}_{n+1} = \alpha_4 (\mathbf{u}_{n+1} - \mathbf{u}_n) + \alpha_5 \dot{\mathbf{u}}_n + \alpha_6 \ddot{\mathbf{u}}_n,
\tag{A.38}
\]

with 𝛼ᵢ, 𝑖 = 1, 2, . . . , 6, written in terms of the standard Newmark coefficients 𝛾̃ and 𝛽̃:

\[
\alpha_1 = \frac{1}{\tilde{\beta}\Delta t^2}, \quad
\alpha_2 = \frac{1}{\tilde{\beta}\Delta t}, \quad
\alpha_3 = \frac{1-2\tilde{\beta}}{2\tilde{\beta}}, \quad
\alpha_4 = \frac{\tilde{\gamma}}{\tilde{\beta}\Delta t}, \quad
\alpha_5 = 1 - \frac{\tilde{\gamma}}{\tilde{\beta}}, \quad
\alpha_6 = \left(1 - \frac{\tilde{\gamma}}{2\tilde{\beta}}\right)\Delta t.
\tag{A.39--A.44}
\]

The discrete form is then

\[
\left[\alpha_1 \mathbf{M} - \mathbf{K}_u - \alpha_4 \mathbf{K}_v\right] \mathbf{u}_{n+1}
= \mathbf{M}\left[\alpha_3 \ddot{\mathbf{u}}_n + \alpha_2 \dot{\mathbf{u}}_n + \alpha_1 \mathbf{u}_n\right]
+ \mathbf{K}_v\left[\alpha_6 \ddot{\mathbf{u}}_n + \alpha_5 \dot{\mathbf{u}}_n - \alpha_4 \mathbf{u}_n\right]
+ \mathbf{w}_a + \mathbf{M}\mathbf{f}_{n+1}.
\tag{A.45}
\]

After the solution of Equation (A.45), we update the current acceleration and velocity fields using Equations (A.37) and (A.38), respectively. When imposing a prescribed displacement ū(𝑡ₙ₊₁), we should also prescribe the appropriate velocity and acceleration at the boundaries using

\[
\ddot{\bar{\mathbf{u}}}_{n+1} = \frac{d^2}{dt^2}\bar{\mathbf{u}}(t_{n+1})
\quad \text{and} \quad
\dot{\bar{\mathbf{u}}}_{n+1} = \frac{d}{dt}\bar{\mathbf{u}}(t_{n+1}),
\tag{A.46--A.47}
\]

where the bar symbol represents the prescribed degrees of freedom. Finally, we update the fatigue variable using a Trapezoidal method given by

\[
\mathcal{F}_{n+1} = \mathcal{F}_n + \frac{\Delta t}{2}\,\mathbf{M}_{\mathcal{F}}^{-1}
\left[\mathbf{w}_d(\mathbf{u}_{n+1}, \mathbf{v}_{n+1}, \varphi_{n+1}) + \mathbf{w}_d(\mathbf{u}_n, \mathbf{v}_n, \varphi_n)\right].
\tag{A.48}
\]

Algorithm A.1 presents the final semi-implicit time integration scheme.

Algorithm A.1 Semi-implicit time integration scheme
1: for 𝑛 = 0 → 𝑁 − 1 do
2:   Given uₙ, vₙ and 𝜆ₙ, solve Eq. (A.36) for 𝜑ₙ₊₁.
3:   Solve Eq. (A.45) for uₙ₊₁.
4:   Update acceleration aₙ₊₁ and velocity vₙ₊₁ using Eq. (A.37) and (A.38).
5:   Update the fatigue Fₙ₊₁.
6:   Update the time step by adding the time increment Δ𝑡.
7: end for

BIBLIOGRAPHY

[1] Derek Hull and David J. Bacon. Introduction to dislocations. Butterworth-Heinemann, 2001.

[2] Alan Arnold Griffith. VI. The phenomena of rupture and flow in solids. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 221(582-593):163–198, 1921.

[3] Daniel Bonamy. Intermittency and roughening in the failure of brittle heterogeneous materials. Journal of Physics D: Applied Physics, 42(21):214014, November 2009.

[4] Benoit B. Mandelbrot, Dann E. Passoja, and Alvin J. Paullay. Fractal character of fracture surfaces of metals. Nature, 308:721, April 1984.

[5] J. J. Mecholsky, D. E. Passoja, and K. S. Feinberg-Ringel. Quantitative Analysis of Brittle Fracture Surfaces Using Fractal Geometry. Journal of the American Ceramic Society, 72(1):60–65, January 1989.

[6] E. Bouchaud, G. Lapasset, and J. Planès. Fractal Dimension of Fractured Surfaces: A Universal Value? Europhysics Letters (EPL), 13(1):73–79, September 1990.

[7] Q. Y. Long, Li Suqin, and C. W. Lung.
Studies on the fractal dimension of a fracture surface formed by slow stable crack propagation. Journal of Physics D: Applied Physics, 24(4):602–607, April 1991.

[8] R.E. Williford. Fractal fatigue. Scripta Metallurgica et Materialia, 25(2):455–460, February 1991.

[9] A. Carpinteri and B. Chiaia. Multifractal scaling law for the fracture energy variation of concrete structures. Fracture Mechanics of Concrete Structures, 1995.

[10] A. Carpinteri and G.P. Yang. Fractal dimension evolution of microcrack net in disordered materials. Theoretical and Applied Fracture Mechanics, 25(1):73–81, 1996.

[11] M.-Carmen Miguel, Alessandro Vespignani, Stefano Zapperi, Jérôme Weiss, and Jean-Robert Grasso. Intermittent dislocation flow in viscoplastic deformation. Nature, 410(6829):667–671, April 2001.

[12] Marisol Koslowski, Richard LeSar, and Robb Thomson. Avalanches and Scaling in Plastic Deformation. Physical Review Letters, 93(12), September 2004.

[13] Thiebaud Richeton, Jérôme Weiss, and François Louchet. Breakdown of avalanche critical behaviour in polycrystalline plasticity. Nature Materials, 4(6):465–469, June 2005.

[14] D. M. Dimiduk. Scale-Free Intermittent Flow in Crystal Plasticity. Science, 312(5777):1188–1190, May 2006.

[15] Jérôme Weiss, Thiebaud Richeton, François Louchet, Frantisek Chmelik, Patrick Dobron, Denis Entemeyer, Mikhail Lebyodkin, Tatiana Lebedkina, Claude Fressengeas, and Russell J. McDonald. Evidence for universal intermittent crystal plasticity from acoustic emission and high-resolution extensometry experiments. Physical Review B, 76(22), December 2007.

[16] A. Petri, G. Paparo, A. Vespignani, A. Alippi, and M. Costantini. Experimental Evidence for Critical Dynamics in Microfracturing Processes. Physical Review Letters, 73(25):3423–3426, December 1994.

[17] A. Garcimartín, A. Guarino, L. Bellon, and S. Ciliberto. Statistical Properties of Fracture Precursors. Physical Review Letters, 79(17):3202–3205, October 1997.

[18] D. Bonamy, S. Santucci, and L. Ponson. Crackling Dynamics in Material Failure as the Signature of a Self-Organized Dynamic Phase Transition. Physical Review Letters, 101(4):045501, July 2008.

[19] Ashivni Shekhawat, Stefano Zapperi, and James P. Sethna. From Damage Percolation to Crack Nucleation Through Finite Size Criticality. Physical Review Letters, 110(18), April 2013.

[20] Purusattam Ray. Statistical physics perspective of fracture in brittle and quasi-brittle materials. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 377(2136):20170396, January 2019.

[21] S.J. Zhou, D.M. Beazley, P.S. Lomdahl, and B.L. Holian. Large-scale molecular dynamics simulations of three-dimensional ductile failure. Physical Review Letters, 78(3):479, 1997.

[22] Vasily Bulatov, Farid F. Abraham, Ladislas Kubin, Benoit Devincre, and Sidney Yip. Connecting atomistic and mesoscale simulations of crystal plasticity. Nature, 391(6668):669–672, 1998.

[23] Arttu Lehtinen, Fredric Granberg, Lasse Laurson, Kai Nordlund, and Mikko J. Alava. Multiscale modeling of dislocation-precipitate interactions in Fe: from molecular dynamics to discrete dislocations. Physical Review E, 93(1):013309, 2016.

[24] Saikumar R. Yeratapally, Michael G. Glavicic, Christos Argyrakis, and Michael D. Sangid. Bayesian uncertainty quantification and propagation for validation of a microstructure sensitive model for prediction of fatigue crack initiation. Reliability Engineering & System Safety, 164:110–123, 2017.

[25] Anh V. Tran and Yan Wang.
Reliable molecular dynamics: Uncertainty quantification using interval analysis in molecular dynamics simulation. Computational Materials Science, 139 127:141–160, 2017. [26] Aleksandr Chernatynskiy, Simon R Phillpot, and Richard LeSar. Uncertainty quantification in multiscale simulation of materials: A prospective. Annual Review of Materials Research, 43:157–182, 2013. [27] A. Arsenlis, W. Cai, M. Tang, M. Rhee, T. Oppelstrup, G. Hommes, T.G. Pierce, and V.V. Bulatov. Enabling strain hardening simulations with dislocation dynamics. Modeling and Simulation in Materials Science and Engineering, 15(6), 2007. [28] Hythem Sidky and Jonathan K. Whitmer. Learning free energy landscapes using artificial neural networks. The Journal of Chemical Physics, 148(10):104111, March 2018. [29] Xiaoxuan Zhang and Krishna Garikipati. Machine learning materials physics: Multi- resolution neural networks learn the free energy and nonlinear elastic response of evolving microstructures. Computer Methods in Applied Mechanics and Engineering, 372:113362, December 2020. [30] Hengxu Song, Nina Gunkelmann, Giacomo Po, and Stefan Sandfeld. Data-mining of dis- location microstructures: concepts for coarse-graining of internal energies. Modelling and Simulation in Materials Science and Engineering, 29(3):035005, 2021. [31] Christian Miehe, Martina Hofacker, and Fabian Welschinger. A phase field model for rate-independent crack propagation: Robust algorithmic implementation based on operator splits. Computer Methods in Applied Mechanics and Engineering, 199(45-48):2765–2778, November 2010. [32] M. Ambati, T. Gerasimov, and L. De Lorenzis. Phase-field modeling of ductile fracture. Computational Mechanics, 55(5):1017–1040, May 2015. [33] Michael J. Borden, Clemens V. Verhoosel, Michael A. Scott, Thomas J.R. Hughes, and Chad M. Landis. A phase-field description of dynamic brittle fracture. Computer Methods in Applied Mechanics and Engineering, 217-220:77–95, April 2012. [34] J.L. Boldrini, E.A. Barros de Moraes, L.R. Chiarelli, F.G. Fumes, and M.L. Bittencourt. A non-isothermal thermodynamically consistent phase field framework for structural damage and fatigue. Computer Methods in Applied Mechanics and Engineering, 312:395–427, December 2016. [35] Mikko J Alava, Phani KVV Nukala, and Stefano Zapperi. Statistical models of fracture. Advances in Physics, 55(3-4):349–476, 2006. [36] VV Mourzenko, J-F Thovert, and PM Adler. Percolation of three-dimensional fracture networks with power-law size distribution. Physical Review E, 72(3):036103, 2005. 140 [37] Qiang Du, Max Gunzburger, Richard B Lehoucq, and Kun Zhou. Analysis and approximation of nonlocal diffusion problems with volume constraints. SIAM review, 54(4):667–696, 2012. [38] Qiang Du, Max Gunzburger, Richard B Lehoucq, and Kun Zhou. A nonlocal vector calculus, nonlocal volume-constrained problems, and nonlocal balance laws. Mathematical Models and Methods in Applied Sciences, 23(03):493–540, 2013. [39] Marta D’Elia and Max Gunzburger. The fractional laplacian operator on bounded do- mains as a special case of the nonlocal diffusion operator. Computers & Mathematics with Applications, 66(7):1245–1260, 2013. [40] S.A. Silling. Reformulation of elasticity theory for discontinuities and long-range forces. Journal of the Mechanics and Physics of Solids, 48(1):175–209, January 2000. [41] John T Foster, Stewart Andrew Silling, and Wayne W Chen. Viscoplasticity using peri- dynamics. International journal for numerical methods in engineering, 81(10):1242–1258, 2010. 
[42] Guanfeng Zhang, Quang Le, Adrian Loghin, Arun Subramaniyan, and Florin Bobaru. Validation of a peridynamic model for fatigue cracking. Engineering Fracture Mechanics, 162:76–94, 2016. [43] Wenke Hu, Youn Doh Ha, and Florin Bobaru. Peridynamic model for dynamic fracture in unidirectional fiber-reinforced composites. Computer Methods in Applied Mechanics and Engineering, 217:247–261, 2012. [44] Pranesh Roy, Deepak Behera, and Erdogan Madenci. Peridynamic simulation of finite elastic deformation and rupture in polymers. Engineering Fracture Mechanics, 236:107226, 2020. [45] Yan Gao and Selda Oterkus. Non-local modeling for fluid flow coupled with heat transfer by using peridynamic differential operator. Engineering Analysis with Boundary Elements, 105:104–121, 2019. [46] JL Suzuki, M Zayernouri, ML Bittencourt, and GE Karniadakis. Fractional-order uniaxial visco-elasto-plastic models for structural analysis. Computer Methods in Applied Mechanics and Engineering, 308:443–467, 2016. [47] Jorge Suzuki, Yongtao Zhou, Marta D’Elia, and Mohsen Zayernouri. A thermodynamically consistent fractional visco-elasto-plastic model with memory-dependent damage for anoma- lous materials. Computer Methods in Applied Mechanics and Engineering, 373:113494, 2021. [48] Guofei Pang, Marta D’Elia, Michael Parks, and George E Karniadakis. npinns: nonlocal physics-informed neural networks for a parametrized nonlocal universal laplacian operator. algorithms and applications. Journal of Computational Physics, 422:109760, 2020. 141 [49] Maziar Raissi, Paris Perdikaris, and George E Karniadakis. Physics-informed neural net- works: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019. [50] Xiao Xu, Marta D’Elia, and John T Foster. A machine-learning framework for peridynamic material models with physical constraints. arXiv preprint arXiv:2101.01095, 2021. [51] Huaiqian You, Yue Yu, Nathaniel Trask, Mamikon Gulian, and Marta D’Elia. Data-driven learning of nonlocal physics from high-fidelity synthetic data. Computer Methods in Applied Mechanics and Engineering, 374:113553, 2021. [52] Huaiqian You, Yue Yu, Stewart Silling, and Marta D’Elia. Data-driven learning of nonlocal models: from high-fidelity simulations to constitutive laws. arXiv preprint arXiv:2012.04157, 2020. [53] Huaiqian You, Yue Yu, Stewart Silling, and Marta D’Elia. A data-driven peridynamic continuum model for upscaling molecular dynamics. arXiv preprint arXiv:2108.04883, 2021. [54] John W. Cahn and John E. Hilliard. Free Energy of a Nonuniform System. I. Interfacial Free Energy. The Journal of Chemical Physics, 28(2):258–267, February 1958. [55] Samuel M. Allen and John W. Cahn. A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening. Acta Metallurgica, 27(6):1085–1095, June 1979. [56] E. A. B. F. Lima, J. T. Oden, and R. C. Almeida. A hybrid ten-species phase-field model of tumor growth. Mathematical Models and Methods in Applied Sciences, 24(13):2569–2599, December 2014. [57] Pengtao Yue, James J. Feng, Chun Liu, and Jie Shen. A diffuse-interface method for simulating two-phase flows of complex fluids. Journal of Fluid Mechanics, 515:293–317, 2004. [58] Pengtao Sun, Jinchao Xu, and Lixiang Zhang. Full Eulerian finite element method of a phase field model for fluid–structure interaction problem. Computers & Fluids, 90:1 – 8, 2014. [59] Ralph C Smith. 
Uncertainty quantification: theory, implementation, and applications, vol- ume 12. Siam, 2013. [60] Zeev Schuss. Singular perturbation methods in stochastic differential equations of mathe- matical physics. Siam Review, 22(2):119–155, 1980. [61] Ivo Babuška and Panagiotis Chatzipantelidis. On solving elliptic stochastic partial differential 142 equations. Computer Methods in Applied Mechanics and Engineering, 191(37-38):4093– 4122, 2002. [62] Roger G Ghanem and Pol D Spanos. Stochastic finite elements: a spectral approach. Courier Corporation, 2003. [63] Dongbin Xiu and George Em Karniadakis. Modeling uncertainty in steady state diffusion problems via generalized polynomial chaos. Computer methods in applied mechanics and engineering, 191(43):4927–4948, 2002. [64] Ivo Babuska, Raúl Tempone, and Georgios E Zouraris. Galerkin finite element approx- imations of stochastic elliptic partial differential equations. SIAM Journal on Numerical Analysis, 42(2):800–825, 2004. [65] Ivo Babuška, Raúl Tempone, and Georgios E Zouraris. Solving elliptic boundary value problems with uncertain coefficients by the finite element method: the stochastic formulation. Computer methods in applied mechanics and engineering, 194(12-16):1251–1294, 2005. [66] Dongbin Xiu and Jan S Hesthaven. High-order collocation methods for differential equations with random inputs. SIAM Journal on Scientific Computing, 27(3):1118–1139, 2005. [67] Ivo Babuška, Fabio Nobile, and Raul Tempone. A stochastic collocation method for elliptic partial differential equations with random input data. SIAM Journal on Numerical Analysis, 45(3):1005–1034, 2007. [68] Sergei Abramovich Smolyak. Quadrature and interpolation formulas for tensor products of certain classes of functions. In Doklady Akademii Nauk, volume 148, pages 1042–1045. Russian Academy of Sciences, 1963. [69] Anindya Ghoshal, Muthuvel Murugan, Michael J. Walock, Luis Bravo, Jeffrey J. Swab, Clara Hofmeister-Mock, Samuel G. Hirsch, Robert J. Dowding, M. Pepi, Andy Nieto, Larry Fehrenbacher, Kelvin Wong, Acree Technologies, Victor Grubsky, Matthew T. Webster, Nishan Jain, Alison B. Flatau, Andrew Wright, and Jian Luo. Advanced high temperature propulsion materials research project: An update. 2019. [70] Ed Habtour, Daniel P Cole, Jaret C Riddick, Volker Weiss, Mark Robeson, Raman Sridharan, and Abhijit Dasgupta. Detection of fatigue damage precursor using a nonlinear vibration approach. Structural Control and Health Monitoring, 23(12):1442–1463, 2016. [71] J. Chang, W. Cai, V. Bulatov, and S. Yip. Dislocation motion in bcc metals by molecular dynamics. Materials Science and Engineering: A, 309-310:160–163, 2001. [72] F. Maresca, D. Dragoni, G. Csányi, N. Marzari, and W.A. Curtin. Screw dislocation structure and mobility in body centered cubic fe predicted by a gaussian approximation potential. npj Computational Materials, 4, 2018. 143 [73] S.J. Zhou, D.L. Preston, P.S. Lomdahl, and D.M. Beazley. Large-scale molecular dynamics simulations of dislocation intersection in copper. Science, 279(5356):1525–1527, 1998. [74] B. Chen, Suzhi. Li, H. Zong, X. Ding, J. Sun, and E. Ma. Unusual activated processes controlling dislocation motion in body-centered-cubic high-entropy alloys. Proceedings of the National Academy of Sciences of the United States of America, 117(28):16199–16206, 2020. [75] S. Queyreau, J. Marian, M.R. Gilbert, and B.D. Wirth. Edge dislocation mobilities in bcc fe obtained by molecular dynamics. Physical Review B, 84:064106, 2011. [76] A. Lehtinen, L. Laurson, F. 
Granberg, K. Nordlund, and M.J. Alava. Effects of precipitates and dislocation loops on the yield stress of irradiated iron. Scientific Reports, 8(1):6914, 2018. [77] V. Bulatov and W. Cai. Computer Simulations of Dislocations. Oxford University Press, 2006. [78] V. Bulatov, L. Hsiung, M. Tang, A. Arsenlis, M. Bartelt, W. Cai, J. Florando, M. Hiratani, M. Rhee, G. Hommes, T. Pierce, and T. Diaz de la Rubia. Dislocation multi-junctions and strain hardening. Nature, 440:1174–1178, 2006. [79] Arthur F Voter. Introduction to the kinetic monte carlo method. In Radiation effects in solids, pages 1–23. Springer, 2007. [80] Tim P Schulze. Efficient kinetic monte carlo simulation. Journal of Computational Physics, 227(4):2455–2462, 2008. [81] WM Young and EW Elcock. Monte carlo studies of vacancy migration in binary ordered alloys: I. Proceedings of the Physical Society, 89(3):735, 1966. [82] Alfred B Bortz, Malvin H Kalos, and Joel L Lebowitz. A new algorithm for monte carlo simulation of ising spin systems. Journal of Computational Physics, 17(1):10–18, 1975. [83] B Meng and WH Weinberg. Dynamical monte carlo studies of molecular beam epitaxial growth models: interfacial scaling and morphology. Surface Science, 364(2):151–163, 1996. [84] Stephan A Baeurle, Takao Usami, and Andrei A Gusev. A new multiscale modeling ap- proach for the prediction of mechanical properties of polymer-based nanomaterials. Polymer, 47(26):8604–8617, 2006. [85] Mie Andersen, Chiara Panosetti, and Karsten Reuter. A practical guide to surface kinetic monte carlo simulations. Frontiers in chemistry, 7:202, 2019. 144 [86] Wei Cai, Vasily V Bulatov, Sidney Yip, and Ali S Argon. Kinetic monte carlo modeling of dislocation motion in bcc metals. Materials Science and Engineering: A, 309:270–273, 2001. [87] Wei Cai, Vasily V Bulatov, and Sidney Yip. Kinetic monte carlo method for dislocation glide in silicon. Journal of computer-aided materials design, 6(2-3):175–183, 1999. [88] Wei Cai, Vasily V Bulatov, João F Justo, Ali S Argon, and Sidney Yip. Intrinsic mobility of a dissociated dislocation in silicon. Physical review letters, 84(15):3346, 2000. [89] S Scarle, CP Ewels, MI Heggie, and N Martsinovich. Linewise kinetic monte carlo study of silicon dislocation dynamics. Physical Review B, 69(7):075209, 2004. [90] Yue Zhao and Jaime Marian. Direct prediction of the solute softening-to-hardening transition in w–re alloys using stochastic simulations of screw dislocation motion. Modelling and Simulation in Materials Science and Engineering, 26(4):045002, 2018. [91] Shuhei Shinzato, Masato Wakeda, and Shigenobu Ogata. An atomistically informed kinetic monte carlo model for predicting solid solution strengthening of body-centered cubic alloys. International Journal of Plasticity, 122:319–337, 2019. [92] Alexander Stukowski, David Cereceda, Thomas D Swinburne, and Jaime Marian. Thermally- activated non-schmid glide of screw dislocations in w using atomistically-informed kinetic monte carlo simulations. International Journal of Plasticity, 65:108–130, 2015. [93] Wei Cai, Vasily V Bulatov, João F Justo, Ali S Argon, and Sidney Yip. Kinetic monte carlo approach to modeling dislocation mobility. Computational materials science, 23(1-4):124– 130, 2002. [94] Pak Yuen Chan, Georgios Tsekenis, Jonathan Dantzig, Karin A Dahmen, and Nigel Gold- enfeld. Plasticity and dislocation dynamics in a phase field crystal model. Physical review letters, 105(1):015502, 2010. [95] Ebrahim Asadi and Mohsen Asle Zaeem. 
A review of quantitative phase-field crystal modeling of solid–liquid structures. Jom, 67(1):186–201, 2015. [96] Mohsen Asle Zaeem and Ebrahim Asadi. Phase-field crystal modeling: Integrating density functional theory, molecular dynamics, and phase-field modeling. Integrated Computational Materials Engineering (ICME) for Metals: Concepts and Case Studies, page 49, 2018. [97] Mark Ainsworth and Zhiping Mao. Fractional phase-field crystal modelling: analysis, approximation and pattern formation. IMA Journal of Applied Mathematics, 85(2):231– 262, 2020. [98] Douglas Brent West et al. Introduction to graph theory, volume 2. Prentice hall Upper 145 Saddle River, NJ, 1996. [99] Michael A Webb, Jean-Yves Delannoy, and Juan J De Pablo. Graph-based approach to sys- tematic molecular coarse-graining. Journal of chemical theory and computation, 15(2):1199– 1208, 2018. [100] Michail Stamatakis and Dionisios G Vlachos. A graph-theoretical kinetic monte carlo frame- work for on-lattice chemical kinetics. The Journal of chemical physics, 134(21):214115, 2011. [101] Naoki Masuda, Mason A Porter, and Renaud Lambiotte. Random walks and diffusion on networks. Physics reports, 716:1–58, 2017. [102] Liming Xiong, Garritt Tucker, David L McDowell, and Youping Chen. Coarse-grained atomistic simulation of dislocations. Journal of the Mechanics and Physics of Solids, 59(2):160–177, 2011. [103] Jorge L Suzuki, Ehsan Kharazmi, Pegah Varghaei, Maryam Naghibolhosseini, and Mohsen Zayernouri. Anomalous nonlinear dynamics behavior of fractional viscoelastic beams. Journal of Computational and Nonlinear Dynamics, 16(11):111005, 2021. [104] Eduardo A Barros de Moraes, Mohsen Zayernouri, and Mark M Meerschaert. An integrated sensitivity-uncertainty quantification framework for stochastic phase-field modeling of ma- terial damage. International Journal for Numerical Methods in Engineering, 122(5):1352– 1377, 2021. [105] Eduardo A Barros de Moraes, Hadi Salehi, and Mohsen Zayernouri. Data-driven failure prediction in brittle materials: A phase field-based machine learning framework. Journal of Machine Learning for Modeling and Computing, 2(1), 2021. [106] S. Plimpton. Fast parallel algorithms for short-range molecular dynamics. J. Comp. Phys., 117:1–19, 1995. [107] KOE Henriksson, C Björkas, and Kai Nordlund. Atomistic simulations of stainless steels: a many-body potential for the fe–cr–c system. Journal of Physics: Condensed Matter, 25(44):445401, 2013. [108] P.M. Larsen, S. Schmidt, and J. Schiøtz. Robust structural identification via polyhedral template matching. Modelling Simul. Mater. Sci. Eng., 24(5), 2016. [109] A. Stukowski. Visualization and analysis of atomistic simulation data with ovito-the open visualization tool. Modelling Simul. Mater. Sci. Eng., 18:015012, 2007. [110] A Pérez Riascos and José L Mateos. Fractional dynamics on networks: Emergence of anomalous diffusion and lévy flights. Physical Review E, 90(3):032809, 2014. 146 [111] Aric Hagberg, Pieter Swart, and Daniel S Chult. Exploring network structure, dynamics, and function using networkx. Technical report, Los Alamos National Lab.(LANL), Los Alamos, NM (United States), 2008. [112] Geoffrey Grimmett and Dominic Welsh. Probability: an introduction. Oxford University Press, 2014. [113] Michael J Evans and Jeffrey S Rosenthal. Probability and statistics: The science of uncertainty. Macmillan, 2004. [114] Mark M Meerschaert and Alla Sikorskii. Stochastic models for fractional calculus, vol- ume 43. Walter de Gruyter, 2011. 
[115] Ting Zhu, Ju Li, and Sidney Yip. Atomistic study of dislocation loop emission from a crack tip. Physical review letters, 93(2):025503, 2004. [116] K. Tanaka and T. Mura. A Dislocation Model for Fatigue Crack Initiation. Journal of Applied Mechanics, 48(1):97, 1981. [117] Michael Zaiser. Scale invariance in plastic flow of crystalline solids. Advances in Physics, 55(1-2):185–245, January 2006. [118] Peter Hähner, Karlheinz Bay, and Michael Zaiser. Fractal Dislocation Patterning During Plastic Deformation. Physical Review Letters, 81(12):2470–2473, September 1998. [119] David L. Holt. Dislocation Cell Formation in Metals. Journal of Applied Physics, 41(8):3197–3201, July 1970. [120] Daniel Walgraef and Elias C Aifantis. Dislocation patterning in fatigued metals as a result of dynamical instabilities. Journal of applied physics, 58(2):688–691, 1985. [121] P Hähner. A theory of dislocation cell formation based on stochastic dislocation dynamics. Acta materialia, 44(6):2345–2352, 1996. [122] Olga Kapetanou, Vasileios Koutsos, Efstathios Theotokoglou, Daniel Weygand, and Michael Zaiser. Statistical analysis and stochastic dislocation-based modeling of microplasticity. Journal of the Mechanical Behavior of Materials, 24(3-4):105–113, 2015. [123] Thomas Hochrainer, Stefan Sandfeld, Michael Zaiser, and Peter Gumbsch. Continuum dis- location dynamics: towards a physical theory of crystal plasticity. Journal of the Mechanics and Physics of Solids, 63:167–178, 2014. [124] Thomas Hochrainer. Multipole expansion of continuum dislocations dynamics in terms of alignment tensors. Philosophical Magazine, 95(12):1321–1367, 2015. 147 [125] Qiang Du, Zhan Huang, and Richard B Lehoucq. Nonlocal convection-diffusion volume- constrained problems and jump processes. Discrete & Continuous Dynamical Systems-B, 19(2):373, 2014. [126] Marta D’Elia, Qiang Du, Max Gunzburger, and Richard Lehoucq. Nonlocal convection- diffusion problems on bounded domains and finite-range jump processes. Computational Methods in Applied Mathematics, 17(4):707–722, 2017. [127] Qiang Du, Robert Lipton, and Tadele Mengesha. Multiscale analysis of linear evolution equa- tions with applications to nonlocal models for heterogeneous media. ESAIM: Mathematical Modelling and Numerical Analysis, 50(5):1425–1455, 2016. [128] Ali Akhavan-Safaei, Mehdi Samiee, and Mohsen Zayernouri. Data-driven fractional subgrid- scale modeling for scalar turbulence: A nonlocal les approach. Journal of Computational Physics, 446:110571, 2021. [129] S Hadi Seyedi and Mohsen Zayernouri. A data-driven dynamic nonlocal subgrid-scale model for turbulent flows. Physics of Fluids, 34(3):035104, 2022. [130] Mehdi Samiee, Ali Akhavan-Safaei, and Mohsen Zayernouri. A fractional subgrid-scale model for turbulent flows: Theoretical formulation and a priori study. Physics of Fluids, 32(5):055102, 2020. [131] Ali Akhavan-Safaei and Mohsen Zayernouri. A nonlocal spectral transfer model and new scaling law for scalar turbulence. arXiv preprint arXiv:2111.06540, 2021. [132] Mehdi Samiee, Ali Akhavan-Safaei, and Mohsen Zayernouri. Tempered fractional les modeling. Journal of Fluid Mechanics, 932, 2022. [133] Rina Schumer, David A Benson, Mark M Meerschaert, and Stephen W Wheatcraft. Eu- lerian derivation of the fractional advection–dispersion equation. Journal of contaminant hydrology, 48(1-2):69–88, 2001. [134] Xiao Xu, Marta D’Elia, Christian Glusa, and John T Foster. Machine-learning of nonlo- cal kernels for anomalous subsurface transport from breakthrough curves. 
arXiv preprint arXiv:2201.11146, 2022. [135] Jorge Suzuki, Mamikon Gulian, Mohsen Zayernouri, and Marta D’Elia. Fractional model- ing in action: A survey of nonlocal models for subsurface transport, turbulent flows, and anomalous materials. arXiv preprint arXiv:2110.11531, 2021. [136] SA Silling. Dynamic fracture modeling with a meshfree peridynamic code. In Computational Fluid and Solid Mechanics 2003, pages 641–644. Elsevier, 2003. [137] Stewart Andrew Silling and Abe Askari. Peridynamic model for fatigue cracking. Technical 148 Report SAND2014-18590, 1160289, October 2014. [138] Marta D’Elia, Mamikon Gulian, George Karniadakis, and Hayley Olson. A unified theory of fractional nonlocal and weighted nonlocal vector calculus. Technical report, Sandia National Lab.(SNL-NM), Albuquerque, NM (United States), 2020. [139] Henri Salmenjoki, Mikko J Alava, and Lasse Laurson. Machine learning plastic deformation of crystals. Nature communications, 9(1):1–7, 2018. [140] Mika Sarvilahti, Audun Skaugen, and Lasse Laurson. Machine learning depinning of dislocation pileups. APL Materials, 8(10):101109, 2020. [141] Henri Salmenjoki et al. Predicting the behaviour of dislocation systems with machine learning methods. 2017. [142] Dominik Steinberger, Hengxu Song, and Stefan Sandfeld. Machine learning-based classifi- cation of dislocation microstructures. Frontiers in Materials, 6:141, 2019. [143] Eduardo A Barros de Moraes, Jorge L Suzuki, and Mohsen Zayernouri. Atomistic-to- meso multi-scale data-driven graph surrogate modeling of dislocation glide. Computational Materials Science, 197:110569, 2021. [144] Joseph Bakarji and Daniel M Tartakovsky. Data-driven discovery of coarse-grained equa- tions. Journal of Computational Physics, 434:110219, 2021. [145] Seungjoon Lee, Mahdi Kooshkbaghi, Konstantinos Spiliotis, Constantinos I Siettos, and Ioannis G Kevrekidis. Coarse-scale pdes from fine-scale observations via machine learning. Chaos: An Interdisciplinary Journal of Nonlinear Science, 30(1):013141, 2020. [146] Samuel H Rudy, Steven L Brunton, Joshua L Proctor, and J Nathan Kutz. Data-driven discovery of partial differential equations. Science Advances, 3(4):e1602614, 2017. [147] Rohit Supekar, Boya Song, Alasdair Hastewell, Alexander Mietke, and Jörn Dunkel. Learn- ing hydrodynamic equations for active matter from particle simulations and experiments. arXiv preprint arXiv:2101.06568, 2021. [148] Peter M Anderson, John P Hirth, and Jens Lothe. Theory of dislocations. Cambridge University Press, 2017. [149] Erik Van der Giessen and Alan Needleman. Discrete dislocation plasticity: a simple planar model. Modelling and Simulation in Materials Science and Engineering, 3(5):689, 1995. [150] Bernard W Silverman. Density estimation for statistics and data analysis. Routledge, 2018. [151] Daniele Pedretti and Daniel Fernàndez-Garcia. An automatic locally-adaptive method to 149 estimate heavily-tailed breakthrough curves from particle distributions. Advances in water Resources, 59:52–65, 2013. [152] Ian S Abramson. On bandwidth variation in kernel estimates-a square root law. The annals of Statistics, pages 1217–1223, 1982. [153] Jean Lemaitre and Rodrigue Desmorat. Engineering damage mechanics: ductile, creep, fatigue and brittle failures. Springer Science & Business Media, 2005. [154] Ted L Anderson. Fracture mechanics: fundamentals and applications. CRC press, 2017. [155] Melvin F Kanninen and Carl L Popelar. Advanced fracture mechanics. 1985. [156] C. Miehe, F. Welschinger, and M. Hofacker. 
Thermodynamically consistent phase-field models of fracture: Variational principles and multi-field FE implementations. International Journal for Numerical Methods in Engineering, 83(10):1273–1311, September 2010. [157] Michael J. Borden, Thomas J.R. Hughes, Chad M. Landis, and Clemens V. Verhoosel. A higher-order phase-field model for brittle fracture: Formulation and analysis within the iso- geometric analysis framework. Computer Methods in Applied Mechanics and Engineering, 273:100–118, May 2014. [158] Marreddy Ambati, Roland Kruse, and Laura De Lorenzis. A phase-field model for duc- tile fracture at finite strains and its experimental verification. Computational Mechanics, 57(1):149–167, January 2016. [159] M. Hofacker and C. Miehe. A phase field model of dynamic fracture: Robust field updates for the analysis of complex crack patterns:. International Journal for Numerical Methods in Engineering, 93(3):276–301, January 2013. [160] G. Amendola, M. Fabrizio, and J. M. Golden. Thermomechanics of damage and fatigue by a phase field model. Journal of Thermal Stresses, 39(5):487–499, May 2016. [161] Michele Caputo and Mauro Fabrizio. Damage and fatigue described by a fractional derivative model. Journal of Computational Physics, 293:400–408, July 2015. [162] L.R. Chiarelli, F.G. Fumes, E.A. Barros de Moraes, G.A. Haveroth, J.L. Boldrini, and M.L. Bittencourt. Comparison of high order finite element and discontinuous Galerkin methods for phase field equations: Application to structural damage. Computers & Mathematics with Applications, 74(7):1542–1564, October 2017. [163] G.A. Haveroth, E.A. Barros de Moraes, J.L. Boldrini, and M.L. Bittencourt. Comparison of semi and fully-implicit time integration schemes applied to a damage and fatigue phase field model. Latin American Journal of Solids and Structures, 15(5):1–16, May 2018. 150 [164] George Fishman. Monte Carlo: concepts, algorithms, and applications. Springer Science & Business Media, 2013. [165] Dongbin Xiu and George Em Karniadakis. The wiener–askey polynomial chaos for stochastic differential equations. SIAM journal on scientific computing, 24(2):619–644, 2002. [166] OM Knio and OP Le Maitre. Uncertainty propagation in cfd using polynomial chaos decomposition. Fluid dynamics research, 38(9):616, 2006. [167] George Stefanou. The stochastic finite element method: past, present and future. Computer methods in applied mechanics and engineering, 198(9-12):1031–1051, 2009. [168] Paul G Constantine, Eric Dow, and Qiqi Wang. Active subspace methods in theory and prac- tice: applications to kriging surfaces. SIAM Journal on Scientific Computing, 36(4):A1500– A1524, 2014. [169] Paul G Constantine, Michael Emory, Johan Larsson, and Gianluca Iaccarino. Exploiting active subspaces to quantify uncertainty in the numerical simulation of the hyshot ii scramjet. Journal of Computational Physics, 302:1–20, 2015. [170] Paul G Constantine and Paul Diaz. Global sensitivity metrics from active subspaces. Reliability Engineering & System Safety, 162:1–13, 2017. [171] Khader M Hamdia, Mohammed A Msekh, Mohammad Silani, Nam Vu-Bac, Xiaoying Zhuang, Trung Nguyen-Thoi, and Timon Rabczuk. Uncertainty quantification of the frac- ture properties of polymeric nanocomposites based on phase field modeling. Composite Structures, 133:1177–1190, 2015. [172] Ernesto A. B. F. Lima, Regina C. Almeida, and J. Tinsley Oden. Analysis and numeri- cal solution of stochastic phase-field models of tumor growth: Stochastic Tumor Growth. 
Numerical Methods for Partial Differential Equations, 31(2):552–574, March 2015. [173] Charbel Farhat, Adrien Bos, Philip Avery, and Christian Soize. Modeling and quantification of model-form uncertainties in eigenvalue computations using a stochastic reduced model. AIAA Journal, 56(3):1198–1210, 2018. [174] Christian Soize and Charbel Farhat. Probabilistic learning for modeling and quantifying model-form uncertainties in nonlinear computational mechanics. International Journal for Numerical Methods in Engineering, 117(7):819–843, 2019. [175] Joaquim RRA Martins, Peter Sturdza, and Juan J Alonso. The complex-step derivative approximation. ACM Transactions on Mathematical Software (TOMS), 29(3):245–262, 2003. [176] Andrea Saltelli, Paola Annoni, Ivano Azzini, Francesca Campolongo, Marco Ratto, and 151 Stefano Tarantola. Variance based sensitivity analysis of model output. design and estimator for the total sensitivity index. Computer Physics Communications, 181(2):259–270, 2010. [177] Svante Wold, Kim Esbensen, and Paul Geladi. Principal component analysis. Chemometrics and intelligent laboratory systems, 2(1-3):37–52, 1987. [178] Hervé Abdi and Lynne J Williams. Principal component analysis. Wiley interdisciplinary reviews: computational statistics, 2(4):433–459, 2010. [179] Mathilde Chevreuil, Régis Lebrun, Anthony Nouy, and Prashant Rai. A least-squares method for sparse low rank approximation of multivariate functions. SIAM/ASA Journal on Uncertainty Quantification, 3(1):897–921, 2015. [180] Ilya M Sobol. Sensitivity estimates for nonlinear mathematical models. Mathematical modelling and computational experiments, 1(4):407–414, 1993. [181] J. Weiss. Three-Dimensional Mapping of Dislocation Avalanches: Clustering and Space/Time Coupling. Science, 299(5603):89–92, January 2003. [182] Alberto Carpinteri and Francesco Mainardi. Fractals and fractional calculus in continuum mechanics, volume 378. Springer, 2014. [183] Mark Ainsworth and Zhiping Mao. Analysis and Approximation of a Fractional Cahn– Hilliard Equation. SIAM Journal on Numerical Analysis, 55(4):1689–1718, January 2017. [184] Mark Ainsworth and Zhiping Mao. Well-posedness of the Cahn–Hilliard equation with fractional free energy and its Fourier Galerkin approximation. Chaos, Solitons & Fractals, 102:264–273, September 2017. [185] Giambattista Giacomin and Joel L. Lebowitz. Phase segregation dynamics in particle systems with long range interactions. I. Macroscopic limits. Journal of Statistical Physics, 87(1-2):37– 61, April 1997. [186] Giambattista Giacomin and Joel L. Lebowitz. Phase Segregation Dynamics in Particle Systems with Long Range Interactions II: Interface Motion. SIAM Journal on Applied Mathematics, 58(6):1707–1729, December 1998. [187] Helmut Abels, Stefano Bosia, and Maurizio Grasselli. Cahn–Hilliard equation with nonlocal singular free energies. Annali di Matematica Pura ed Applicata (1923 -), 194(4):1071–1106, August 2015. [188] Ehsan Kharazmi and Mohsen Zayernouri. Operator-based uncertainty quantification of stochastic fractional partial differential equations. Journal of Verification, Validation and Uncertainty Quantification, 4(4), 2019. 152 [189] Ehsan Kharazmi and Mohsen Zayernouri. Fractional sensitivity equation method: Appli- cation to fractional model construction. Journal of Scientific Computing, 80(1):110–140, 2019. [190] P Carrara, M Ambati, R Alessi, and L De Lorenzis. A framework to model the fatigue behavior of brittle materials based on a variational phase-field approach. 
Computer Methods in Applied Mechanics and Engineering, page 112731, 2019. [191] Martha Seiler, Thomas Linse, Peter Hantschke, and Markus Kästner. An efficient phase-field model for fatigue fracture in ductile materials. arXiv preprint arXiv:1903.06465, 2019. [192] Yi-Yan Liu, Yong-Feng Ju, Chen-Dong Duan, and Xue-Feng Zhao. Structure damage diagnosis using neural network and feature fusion. Engineering applications of artificial intelligence, 24(1):87–92, 2011. [193] Adam Santos, Eloi Figueiredo, MFM Silva, CS Sales, and JCWA Costa. Machine learning algorithms for damage detection: Kernel-based approaches. Journal of Sound and Vibration, 363:584–599, 2016. [194] Dia Al Azzawi, Hever Moncayo, Mario G Perhinschi, Andres Perez, and Adil Togayev. Comparison of immunity-based schemes for aircraft failure detection and identification. Engineering Applications of Artificial Intelligence, 52:181–193, 2016. [195] Moisés Silva, Adam Santos, Eloi Figueiredo, Reginaldo Santos, Claudomiro Sales, and João CWA Costa. A novel unsupervised approach based on a genetic algorithm for structural damage detection in bridges. Engineering Applications of Artificial Intelligence, 52:168– 180, 2016. [196] Adam J Wootton, John B Butcher, Theocharis Kyriacou, Charles R Day, and Peter W Haycock. Structural health monitoring of a footbridge using echo state networks and narmax. Engineering Applications of Artificial Intelligence, 64:152–163, 2017. [197] Arash Saeidpour, Mi G Chorzepa, Jason Christian, and Stephan Durham. Parameterized fragility assessment of bridges subjected to hurricane events using metamodels and multiple environmental parameters. Journal of Infrastructure Systems, 24(4):04018031, 2018. [198] Hadi Salehi and Rigoberto Burgueno. Emerging artificial intelligence methods in structural engineering. Engineering structures, 171:170–189, 2018. [199] Hadi Salehi, Saptarshi Das, Shantanu Chakrabartty, Subir Biswas, and Rigoberto Burgueño. Structural damage identification using image-based pattern recognition on event-based bi- nary data generated from self-powered sensor networks. Structural Control and Health Monitoring, 25(4):e2135, 2018. [200] Hadi Salehi, Saptarshi Das, Shantanu Chakrabartty, Subir Biswas, and Rigoberto Burgueño. 153 Damage identification in aircraft structures with self-powered sensing technology: A ma- chine learning approach. Structural Control and Health Monitoring, 25(12):e2262, 2018. [201] Hadi Salehi, Saptarshi Das, Shantanu Chakrabartty, Subir Biswas, and Rigoberto Burgueño. An algorithmic framework for reconstruction of time-delayed and incomplete binary signals from an energy-lean structural health monitoring system. Engineering Structures, 180:603– 620, 2019. [202] Hadi Salehi, Subir Biswas, and Rigoberto Burgueño. Data interpretation framework integrat- ing machine learning and pattern recognition for self-powered data-driven damage identifi- cation with harvested energy variations. Engineering Applications of Artificial Intelligence, 86:136–153, 2019. [203] Andrea Rovinelli, Michael D Sangid, Henry Proudhon, and Wolfgang Ludwig. Using machine learning and a data-driven approach to identify the small fatigue crack driving force in polycrystalline materials. npj Computational Materials, 4(1):35, 2018. [204] Hyung Jin Lim, Hoon Sohn, and Yongtak Kim. Data-driven fatigue crack quantifica- tion and prognosis using nonlinear ultrasonic modulation. Mechanical Systems and Signal Processing, 109:185–195, 2018. [205] Hyung Jin Lim and Hoon Sohn. 
Online fatigue crack quantification and prognosis using non- linear ultrasonic modulation and artificial neural network. In Sensors and Smart Structures Technologies for Civil, Mechanical, and Aerospace Systems 2018, volume 10598, page 105981L. International Society for Optics and Photonics, 2018. [206] Zhong-Hui Shen, Jian-Jun Wang, Jian-Yong Jiang, Sharon X Huang, Yuan-Hua Lin, Ce-Wen Nan, Long-Qing Chen, and Yang Shen. Phase-field modeling and machine learning of electric-thermal-mechanical breakdown of polymer-based dielectrics. Nature communications, 10(1):1843, 2019. [207] Yuksel C Yabansu, Philipp Steinmetz, Johannes Hötzer, Surya R Kalidindi, and Britta Nestler. Extraction of reduced-order process-structure linkages from phase-field simulations. Acta Materialia, 124:182–194, 2017. [208] Stefanos Papanikolaou, Michail Tzimas, Andrew CE Reid, and Stephen A Langer. Spatial strain correlations, machine learning, and deformation history in crystal plasticity. Physical Review E, 99(5):053003, 2019. [209] GH Teichert, AR Natarajan, A Van der Ven, and K Garikipati. Machine learning materials physics: Integrable deep neural networks enable scale bridging by learning free energy functions. Computer Methods in Applied Mechanics and Engineering, 353:201–216, 2019. [210] Abigail Hunter, Bryan A Moore, Maruti Mudunuru, Viet Chau, Roselyne Tchoua, Chan- dramouli Nyshadham, Satish Karra, Daniel O’Malley, Esteban Rougier, Hari Viswanathan, 154 et al. Reduced-order modeling through machine learning and graph-theoretic approaches for brittle fracture applications. Computational Materials Science, 157:87–98, 2019. [211] Bryan A. Moore, Esteban Rougier, Daniel O’Malley, Gowri Srinivasan, Abigail Hunter, and Hari Viswanathan. Predictive modeling of dynamic fracture growth in brittle materials with machine learning. Computational Materials Science, 148:46–53, June 2018. [212] Max Schwarzer, Bryce Rogan, Yadong Ruan, Zhengming Song, Diana Y Lee, Allon G Percus, Viet T Chau, Bryan A Moore, Esteban Rougier, Hari S Viswanathan, et al. Learn- ing to fail: Predicting fracture evolution in brittle material models using recurrent graph convolutional neural networks. Computational Materials Science, 162:322–332, 2019. [213] James M Keller, Michael R Gray, and James A Givens. A fuzzy k-nearest neighbor algorithm. IEEE transactions on systems, man, and cybernetics, (4):580–585, 1985. [214] Mohamad H Hassoun et al. Fundamentals of artificial neural networks. MIT press, 1995. [215] Guoqiang Zhang, B Eddy Patuwo, and Michael Y Hu. Forecasting with artificial neural networks:: The state of the art. International journal of forecasting, 14(1):35–62, 1998. [216] Kilian Q Weinberger and Lawrence K Saul. Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb):207–244, 2009. [217] Stuart Geman, Elie Bienenstock, and René Doursat. Neural networks and the bias/variance dilemma. Neural computation, 4(1):1–58, 1992. [218] Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning, volume 1. Springer series in statistics New York, 2001. 155