MACHINE-LEARNING-BASED MULTI-SCALE MODELING FOR COMPLEX FLUIDS

By

Pei Ge

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Computational Mathematics, Science and Engineering – Doctor of Philosophy

2025

ABSTRACT

Multi-scale modeling presents a long-standing challenge in computational mathematics and is pertinent to a wide range of applications in materials science, fluid physics, and chemical engineering. Predicting collective behaviors typically requires integrating the dynamics across micro-scale (atomistic), meso-scale (kinetic), and macro-scale (continuum) levels, and the vast range of spatiotemporal scales poses a fundamental obstacle. Existing methods often rely on empirical constitutive closures or micro-macro coupling approaches. Despite their broad use, their accuracy and efficiency are often limited in real applications. This dissertation aims to develop data-driven approaches for constructing accurate and reliable reduced models of multi-scale systems based on first-principle descriptions.

The first part, comprising Chapters 2 and 3, focuses on constructing meso-scale reduced models of polymer kinetics directly from full molecular dynamics. Chapter 2 discusses the many-body effect in the conservative force, which is important for accurately reproducing both the probability density function of void formation in bulk and the spectrum of the capillary wave across the fluid interface. Chapter 3 discusses the state dependence of the memory kernel and demonstrates the essential role of this broadly overlooked state dependence in predicting molecular kinetics related to conformation relaxation and transition.

The second part, Chapter 4, focuses on building accurate macro-scale models from micro-scale polymer kinetics through meso-scale Langevin dynamics.
A non-Newtonian hydrodynamic model is given as an example, which shows some success in systematically passing the micro-scale heterogeneous polymer structural mechanics to the macro-scale hydrodynamics without human intervention.

Copyright by
PEI GE
2025

ACKNOWLEDGEMENTS

Completing this dissertation has been a journey of intellectual growth, perseverance, and support from many individuals. I am deeply grateful to all those who have guided and encouraged me throughout this process.

First and foremost, I would like to express my deepest gratitude to my advisor, Dr. Huan Lei, for his unwavering support, insightful guidance, and patience throughout my academic journey. His mentorship has been invaluable in shaping my research skills and fostering my academic growth. His dedication and encouragement have inspired me to persevere through challenges and strive for excellence. I am truly fortunate to have had the opportunity to learn from him.

I am also sincerely thankful to my dissertation committee members, Dr. Daniel Appelö, Dr. Michael Murillo, and Dr. Yimin Xiao, for their valuable feedback and encouragement. Their expertise and perspectives have greatly enriched this work.

I am especially grateful to my parents for their steadfast belief in me. Their endless love and encouragement have been my greatest source of strength, providing the foundation for my resilience.

I would also like to thank my friends Liyao Lyu, Zhiyuan She, Shijun Liang, and many others, including Lidong Fang, Haishen Dai, Yue Zhao, and Siyu Guo, for their support and camaraderie throughout this journey.

This dissertation is as much a product of my efforts as it is of the collective support and encouragement of all these individuals. I am truly grateful.

TABLE OF CONTENTS

CHAPTER 1 OVERVIEW
1.1 Background
1.2 Dissertation Contributions
1.3 Dissertation Structure

CHAPTER 2 COARSE-GRAINED MOLECULAR DYNAMICS MODELING
2.1 Introduction
2.2 Methods and Models
2.3 Numerical Results
2.4 Summary

CHAPTER 3
3.1 Introduction
3.2 Model Derivation
3.3 Numerical Results
3.4 Summary
3.5 Other Details

CHAPTER 4 DEEP LEARNING-BASED NON-NEWTONIAN FLUID MODEL
4.1 Introduction
4.2 Methods
4.3 Numerical results
4.4 Discussion

CHAPTER 5 CONCLUSION AND OUTLOOK
5.1 High Dimensional Stochastic Reduced Model
5.2 Variational-informed Structure-preserving Macro-scale Reduced Model

BIBLIOGRAPHY

CHAPTER 1 OVERVIEW

1.1 Background

Multi-scale modeling is pertinent to a wide range of applications in materials science, fluid physics, and chemical engineering. Predicting collective behaviors typically requires integrating the dynamics across micro-scale (atomistic), meso-scale (kinetic), and macro-scale (continuum) levels, and the vast range of spatiotemporal scales poses a fundamental obstacle. Empirical models often rest on oversimplified micro-scale information and lack effective algorithms for extracting the relevant information from the micro-scale (E et al., 2023). Machine learning has been successful in dealing with high-dimensional problems in scientific computing, with examples such as physics-informed neural networks (Raissi et al., 2019a), deep operator networks (Lu et al., 2021), and Fourier neural operators (Li et al., 2021).
For multi-scale modeling, substantial progress has been made, such as deep learning-based coarse-grained molecular dynamics (Zhang et al., 2018b) and machine learning-based moment closure models for the kinetic equation (Han et al., 2019a). Despite this success in recent years, challenges such as the collection of data, the generalization of the neural network, and the preservation of physical constraints remain insufficiently addressed. By addressing these challenges, this dissertation aims to use machine learning to construct reliable and structure-preserving reduced models with an interpretable micro-scale to macro-scale mapping.

1.2 Dissertation Contributions

This dissertation focuses on both micro-to-meso and micro-to-macro modeling.

1.2.1 Micro to Meso: Data-driven Stochastic Reduced Model

Predicting the collective behavior of complex multiscale systems is often centered around projecting the full-dimensional dynamics onto a set of resolved variables. However, an accurate construction of such a reduced model remains a practical challenge for real applications such as molecular modeling. While model reduction frameworks such as the Koopman operator (Koopman, 1931) and the Mori-Zwanzig projection formalism (Mori, 1965; Zwanzig, 1961) enable us to write down the dynamic equations in terms of the resolved variables, the reduced model generally becomes non-Markovian with a memory term that may further depend on the resolved variables, and its direct numerical evaluation involves solving the expensive full-dimensional orthogonal dynamics.

Consider a system whose microscopic state is determined by the instantaneous positions Z𝑞 and momenta Z𝑝 of 𝑁 atoms in three dimensions. Denote the collection of these variables by Z(𝑡) = (Z𝑞, Z𝑝), which is a vector of 6𝑁 components.
The Hamiltonian dynamics can be written as

$$\frac{d\mathbf{Z}(t)}{dt} = \mathbf{J}\,\frac{\partial H(\mathbf{Z}(t))}{\partial \mathbf{z}}, \qquad \mathbf{Z}(0) = \mathbf{z}, \tag{1.1}$$

where $H$ is the Hamiltonian and

$$\mathbf{J} = \begin{pmatrix} 0 & \mathbf{I} \\ -\mathbf{I} & 0 \end{pmatrix}.$$

Let $(\mathbf{q}, \mathbf{p}) \in \mathbb{R}^{2m}$ represent the resolved variables of a high-dimensional Hamiltonian system, where $\mathbf{q}$ denotes the coarse-grained (CG) coordinates as a function of $\mathbf{Z}_q$, and $\mathbf{p}$ denotes the CG momenta. Following Zwanzig's formalism (Zwanzig, 2001; Hijón et al., 2010), the reduced dynamics takes the form

$$\dot{\mathbf{q}} = \mathbf{M}^{-1}\mathbf{p}, \qquad \dot{\mathbf{p}} = -\nabla U(\mathbf{q}) - \int_0^t \mathbf{K}(\mathbf{q}(\tau), t-\tau)\,\mathbf{v}(\tau)\,\mathrm{d}\tau + \mathbf{R}_t, \tag{1.2}$$

where $\mathbf{M}$ is the mass matrix, $U(\mathbf{q})$ is the free energy, $\mathbf{v} := \dot{\mathbf{q}}$ is the velocity, $\mathbf{K}(\mathbf{q}, t)$ is the memory kernel, and $\mathbf{R}_t$ is the noise, whose covariance function is related to the memory kernel through the second fluctuation-dissipation theorem (FDT) (Vroylandt and Monmarché, 2022). The construction of the free energy is discussed in Chapter 2 and the construction of the memory kernel is discussed in Chapter 3.

1.2.2 Micro to Macro: A Deep Learning-Based Non-Newtonian Hydrodynamic Model

A long-standing problem in the modeling of non-Newtonian hydrodynamics of polymeric flows is the lack of reliable and interpretable hydrodynamic models that faithfully encode the underlying micro-scale polymer dynamics. The main complications arise from the long polymer relaxation time, the complex molecular structure, and the heterogeneous interaction. The empirical continuum hydrodynamic model of incompressible non-Newtonian fluids is given as follows:

$$\nabla \cdot \mathbf{u} = 0, \qquad \rho\,\frac{\mathrm{d}\mathbf{u}}{\mathrm{d}t} = -\nabla p + \nabla \cdot (\boldsymbol{\tau}_s + \boldsymbol{\tau}_p) + \mathbf{f}_{\mathrm{ext}}, \tag{1.3}$$

where $\mathbf{u}$ is the velocity, $\rho$ is the density, $p$ is the pressure, $\boldsymbol{\tau}_s \propto (\nabla\mathbf{u} + \nabla\mathbf{u}^T)$ is the solvent stress, $\mathbf{f}_{\mathrm{ext}}$ is the external force, and $\boldsymbol{\tau}_p$ is the polymer stress, which is generally unknown. The construction of $\boldsymbol{\tau}_p$ is discussed in Chapter 4.

1.3 Dissertation Structure

The dissertation is organized as follows.
Chapter 2 constructs the free energy in CG models of both single- and two-component polymeric fluid systems based on the recently developed deep coarse-grained potential (DeePCG) scheme (Zhang et al., 2018b), where each polymer molecule is modeled as a CG particle. In Section 2.2, the free energy of the CG models is constructed using only training samples of the instantaneous force under the thermal equilibrium state. In Section 2.3, we show that the constructed CG models can accurately reproduce both the probability density function of void formation in bulk and the spectrum of the capillary wave across the fluid interface. More importantly, the CG models accurately predict the volume-to-area scaling transition for the apolar solvation energy, illustrating their effectiveness in probing meso-scale collective behaviors with molecular-level fidelity.

Chapter 3 focuses on constructing the memory kernel. We present a data-driven method to learn stochastic reduced models of complex systems that retain a state-dependent memory beyond the standard generalized Langevin equation (GLE) with a homogeneous kernel. In Section 3.2, we show that the constructed model naturally encodes the heterogeneous energy dissipation by jointly learning a set of state features and the non-Markovian coupling among the features. Section 3.3 demonstrates the limitation of the standard GLE and the essential role of the broadly overlooked state dependence in predicting molecular kinetics related to conformation relaxation and transition.

Chapter 4 focuses on a micro-to-macro model named DeePN2. Section 4.2 shows that the model retains a multi-scale nature by mapping the polymer configurations into a set of symmetry-preserving macro-scale features. The extended constitutive laws for these macro-scale features can be directly learned from the kinetics of their micro-scale counterparts.
Section 4.3 shows that DeePN2 can faithfully capture the broadly overlooked viscoelastic differences arising from the specific molecular structural mechanics without human intervention.

Finally, Chapter 5 summarizes the modeling approaches discussed and outlines potential future work.

CHAPTER 2 COARSE-GRAINED MOLECULAR DYNAMICS MODELING

In this chapter, we focus on how to construct the conservative force. Conservative forces are essential in constructing CG models, as they describe the interactions between CG particles. Accurate conservative forces are also a prerequisite for obtaining accurate memory and noise terms in CG models, so a good approximation of them is crucial. In Ge et al. (2023), we construct the CG models of both single- and two-component polymeric fluid systems based on the recently developed deep coarse-grained potential (DeePCG) scheme (Zhang et al., 2018b), where each polymer molecule is modeled as a CG particle. Using only training samples of the instantaneous force under the thermal equilibrium state, the constructed CG models can accurately reproduce both the probability density function of void formation in bulk and the spectrum of the capillary wave across the fluid interface. More importantly, the CG models accurately predict the volume-to-area scaling transition for the apolar solvation energy, illustrating their effectiveness in probing meso-scale collective behaviors with molecular-level fidelity.

2.1 Introduction

Molecular dynamics (MD) simulations provide a promising avenue to establish an atomistic-level understanding of many complex systems relevant to biological and materials science. Despite the overwhelming success during the past decades, a remaining bottleneck is rooted in the limited spatio-temporal scales that can be reached; the gap between the micro-scale atomistic motions and many meso-scale emerging phenomena remains large.
One important problem is that of nano-scale interfacial fluids, which play a crucial role in the hydration and the assembly of biomolecules and functional nano-materials (Chandler, 2005; Berne et al., 2009). However, it is well known that such fluid systems generally exhibit a complex and multifaceted nature on different scales. On the small scale (i.e., the fluid molecule correlation length), the solvation energy is determined by the molecular reorganization and scales with the volume of the void space. On the large scale, the solvation energy is determined by the free energy for maintaining a fluid-void interface and scales with the surface area. This scale-dependent behavior indicates a cross-over regime of the entropy-enthalpy transition. While theoretical understandings (Lum et al., 1999; Rein ten Wolde et al., 2001; Hummer et al., 1996, 1998) of this ubiquitous phenomenon have been developed, computational modeling often relies on full micro-scale MD simulations to retain the multifaceted properties. Such simulations, however, remain too expensive to reach the scales required for applications such as nano-scale assembly.

To accelerate the full MD simulations, many coarse-grained (CG) models have been developed. By modeling the dynamics in terms of a set of CG variables with reduced dimensionality, coarse-grained molecular dynamics (CGMD) simulations, in principle, enable us to probe the collective behaviors on a broader scale. However, in practice, the construction of truly reliable CG models can be highly non-trivial, especially for meso-scale interfacial fluids. There are two major challenges. The first challenge arises from the many-body nature of CG interactions. Specifically, the equilibrium density distribution of the CG model needs to match the marginal density distribution of the CG variables of the full model.
Due to the unresolved atomistic degrees of freedom, the CG potential generally encodes many-body interactions even if the full MD force field is governed by two-body interactions (Noid et al., 2008). Existing approaches often rely on various physical intuitions as well as empirical approximations (Izvekov and Voth, 2005; Noid, 2013; Lei et al., 2010; Hijón et al., 2010; Rudd and Broughton, 1998; Pagonabarraga and Frenkel, 2001; Nielsen et al., 2004; Shinoda et al., 2008; Molinero and Moore, 2009; Larini et al., 2010; Das and Andersen, 2012; Dinpajooh and Guenza, 2017; Sanyal and Shell, 2016; Moore et al., 2016) that reproduce certain target thermodynamic quantities and/or structural distributions. For example, the pairwise additive decomposition based on direct ensemble averaging (Lei et al., 2010; Hijón et al., 2010) can recover the thermodynamic pressure but often fails to recover the pair distribution function. Conversely, the Monte Carlo and Boltzmann inversion approaches (Lyubartsev and Laaksonen, 1995; Soper, 1996; Reith et al., 2003) can reproduce the pair distribution function but lead to biased predictions of the equation of state. Several studies account for the many-body effects by introducing the configuration-independent volume potential (Das and Andersen, 2010; Dunn and Noid, 2015, 2016) and the local density (Allen and Rutledge, 2008, 2009; Izvekov et al., 2010; Moore et al., 2016; Sanyal and Shell, 2016; Shahidi et al., 2020) into the pairwise interactions. On the other hand, the accuracy of the high-order structural correlations as well as the direct applications to interfacial systems remain under-explored.

Besides the many-body effect, the fluid molecules also exhibit a heterogeneous density in the vicinity of the interface, which poses the second challenge. What further complicates the problem is the fact that the interfacial fluid density distribution is scale-dependent.
On the small scale, the molecular reorganization generally leads to a wet interface with a larger density than the bulk value. On the large scale, the fluid-void phase separation generally leads to a dry interface with a lower density. The crossover implies complex molecular correlations near the interface. To capture this multifaceted property, the constructed CG potential needs to properly embody the local particle distribution rather than the homogeneous bulk distribution. Conventional structure-based CG potential functions are generally limited in incorporating such information. Similar to the many-body dissipative particle dynamics (Pagonabarraga and Frenkel, 2001), recent studies employed the local density (Wagner et al., 2017; DeLyser and Noid, 2017; Sanyal and Shell, 2018; Jin and Voth, 2018; DeLyser and Noid, 2019, 2020; Berressem et al., 2021) as well as the density gradient (DeLyser and Noid, 2022) as auxiliary field variables to construct the CG potential functions. While these CG models show significant improvement in reproducing the interfacial density profile, the scale-dependent interfacial energy and fluctuations have not been systematically investigated. In Lei et al. (2015), the interfacial energy is integrated into the continuum fluctuating hydrodynamic equation (Landau and Lifshitz, 1987) from a top-down perspective. There, the fluid particles essentially represent Lagrangian discretization points based on the smoothed dissipative particle hydrodynamics (Serrano and Español, 2001) instead of CG molecules, so the meso-scale fluid structural properties cannot be retained. Currently, the construction of reliable bottom-up CGMD models that faithfully encode the multifaceted molecular interactions remains largely open.

In this work, we aim to address the above challenges by constructing CG models of meso-scale interfacial fluids based on the deep molecular dynamics (DeePMD) scheme (Zhang et al., 2018a,c).
DeePMD was initially developed for learning the many-body interactions from ab initio molecular dynamics, and has been applied to construct the deep coarse-grained (DeePCG) model (Zhang et al., 2018b) of liquid water in bulk. Unlike the conventional forms of the inter-molecular potential function, DeePMD represents each particle as an agent and the relative positions of its neighboring particles as the local environment. Rather than approximating the total potential of the full system by a unified parametric function, DeePMD directly maps the local environment of each agent to the potential energy of that particle through a neural network that strictly preserves the spatial symmetries and the particle permutation invariance. Accordingly, the construction does not rely on an empirical decomposition (e.g., pairwise, three-body) of the high-dimensional particle configuration space. This feature is particularly suited for modeling the many-body potential of CGMD models, where the ensemble-averaged interaction between two CG particles further depends on the other neighboring CG particles and cannot be represented by a pairwise additive function. Moreover, the heterogeneous particle density distribution across the fluid interface can be naturally incorporated into the CG potential function as the local environment of each particle. Accordingly, the constructed CG models can accurately model the multifaceted, scale-dependent interfacial fluctuations and apolar solvation without additional human intervention.

We demonstrate the effectiveness of the CG models by considering both single- and two-component fluids in the presence of thermal interfacial fluctuations. As discussed in Chandler (2005), the scale-dependent hydrophobic effects can be general for solvent molecules with attractive interactions; polymeric liquids are therefore used as the benchmark problem.
We compare the numerical results from the full MD simulations and the CG description that represents each molecule as a single particle located at the center of mass. Using only training samples under equilibrium thermal fluctuations, the constructed CG models accurately predict the high-order correlations, the local compressibility, and the interfacial capillary wave for both single- and two-component fluids. In contrast, the empirical CG potential constructed based on the pairwise approximation shows apparent deviations. Furthermore, we conduct rare-event sampling simulations to estimate the probability of void formation in bulk. The predictions of the CG model show good agreement with the full MD results. More importantly, the CG models accurately predict the volume-to-area scaling transition for the solvation energy, and therefore pave the way for modeling nanoscale assembly in aqueous environments.

Before wrapping up this section, we note that the present work focuses on the collective, quasi-equilibrium properties determined by the conservative potential function of a set of extensive CG variables; see John and Csányi (2017) and Chan et al. (2019) for relevant work. For the conformational free energy of non-extensive CG variables, several machine-learning-based approaches (Stecher et al., 2014; Mones et al., 2016; Lemke and Peter, 2017; Galvelis and Sugita, 2017; Schneider et al., 2017; Zhang et al., 2018d; Zavadlav et al., 2018; Wang et al., 2019) have been developed; see also a recent review (Noé et al., 2020) and the references therein. Furthermore, to accurately predict the dynamic properties, memory and coherent noise terms (Mori, 1965; Zwanzig, 1973) arising from the unresolved variables need to be properly introduced into the CG model (Lei et al., 2010; Hijón et al., 2010; Lei et al., 2016; Lei and Li, 2021), which are left to future investigations.
2.2 Methods and Models

2.2.1 Full Model of the Polymeric Fluids

We consider micro-scale models of the star polymer melt similar to Hijón et al. (2010). The full system consists of $M$ molecules with a total number of $N$ atoms. Each polymer molecule consists of a "center" atom connected to $N_a$ arms with $N_b$ atoms per arm. The positions of the atoms are denoted by $\mathbf{q} = [\mathbf{q}_1, \mathbf{q}_2, \cdots, \mathbf{q}_N]$, where $\mathbf{q}_i$ represents the position of the $i$-th atom. The potential function is governed by the pairwise and bond interactions, i.e.,

$$V(\mathbf{q}) = \sum_{i \neq j} V_p(q_{ij}) + \sum_{k} V_b(l_k), \tag{2.1}$$

where $V_p$ is the pairwise interaction between all intra- and inter-molecular atom pairs except the bonded pairs, and $q_{ij} = \|\mathbf{q}_i - \mathbf{q}_j\|$ is the distance between the $i$-th and $j$-th atoms. $V_b$ is the bond interaction between the neighboring particles of each polymer arm and $l_k$ is the length of the $k$-th bond. The bond potential $V_b$ is chosen to be the harmonic potential, i.e.,

$$V_b(l) = \frac{1}{2} k_s (l - l_0)^2, \tag{2.2}$$

where $k_s$ and $l_0$ represent the elastic coefficient and the equilibrium length, respectively. In this study, all physical quantities are given in reduced units. The atom mass is chosen to be unity.

We investigate three fluid systems with the micro-scale potential governed by Eq. (2.1). In Sec. 2.3.1, we consider the polymeric fluids in bulk and examine whether the CG models can retain the many-body interactions and the local compressibility. In particular, we choose $N_a = 12$, $N_b = 6$, $\sigma = 2.415$, $\epsilon = 1.0$, $k_s = 1.714$, $l_0 = 2.77$, similar to Hijón et al. (2010). $V_p$ takes the form of the Lennard–Jones potential with cut-off $r_c$, i.e.,

$$V_p(r) = \begin{cases} V_{\mathrm{LJ}}(r) - V_{\mathrm{LJ}}(r_c), & r < r_c \\ 0, & r \geq r_c \end{cases} \qquad V_{\mathrm{LJ}}(r) = 4\epsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right], \tag{2.3}$$

where $\epsilon$ is the dispersion energy and $\sigma$ is the hardcore distance. We also choose $r_c = 2^{1/6}\sigma$ so that $V_p$ recovers the Weeks-Chandler-Andersen potential.
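The truncated and shifted pair potential of Eq. (2.3) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the simulation code used in this work: the function names `lennard_jones` and `pair_potential` are hypothetical, and the default parameters are the values $\epsilon = 1.0$ and $\sigma = 2.415$ quoted above. It shows that the choice $r_c = 2^{1/6}\sigma$ yields a purely repulsive potential, while a longer cut-off such as $r_c = 2.5\sigma$ retains the attractive well.

```python
import numpy as np

def lennard_jones(r, epsilon=1.0, sigma=2.415):
    """Plain Lennard-Jones potential V_LJ(r) of Eq. (2.3)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def pair_potential(r, r_c, epsilon=1.0, sigma=2.415):
    """Truncated and shifted potential V_p(r): V_LJ(r) - V_LJ(r_c) for r < r_c, else 0."""
    r = np.asarray(r, dtype=float)
    shifted = lennard_jones(r, epsilon, sigma) - lennard_jones(r_c, epsilon, sigma)
    return np.where(r < r_c, shifted, 0.0)

sigma = 2.415
r_wca = 2.0 ** (1.0 / 6.0) * sigma           # cut-off at the LJ minimum (WCA choice)
r = np.linspace(0.8 * sigma, 3.0 * sigma, 200)

v_wca = pair_potential(r, r_wca)             # purely repulsive
v_attr = pair_potential(r, 2.5 * sigma)      # keeps the attractive well
```

With the cut-off placed at the minimum of $V_{\mathrm{LJ}}$, the shift raises the well depth to zero, so `v_wca` is non-negative everywhere; with $r_c = 2.5\sigma$ the minimum of `v_attr` stays close to $-\epsilon$.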
The full system consists of $M = 2120$ polymer molecules in a cubic domain $180 \times 180 \times 180$ (in reduced units) with periodic boundary conditions imposed along each direction. The Nosé-Hoover thermostat is employed to conduct the canonical ensemble simulation with $k_B T = 3.96$.

In Sec. 2.3.2, we consider the polymeric fluid in the presence of a fluid-void interface. The micro-scale model parameters are the same as in Sec. 2.3.1 except that $r_c = 2.5\sigma$ and $k_B T = 1.7$. Simulations are conducted in a domain $180 \times 180 \times 200$ with periodic boundary conditions imposed along the $x$- and $y$-directions. At equilibrium, the fluid shows clear fluid-void interfaces near $z = 20$ and $z = 180$.

In Sec. 2.3.3, we consider a two-component polymeric fluid. The micro-scale model of the polymer molecule is similar to that of the single-component fluid system, with $N_a = 15$, $N_b = 12$, $k_s = 20.0$, $l_0 = 1.5$, $k_B T = 0.5$. The full system consists of 3488 molecules in a domain $200 \times 200 \times 120$ with periodic boundary conditions imposed along each direction. The pairwise interaction $V_p$ is chosen to be quadratic, i.e.,

$$V_p(r) = \begin{cases} \dfrac{a}{2 r_c} (r - r_c)^2, & r < r_c \\ 0, & r \geq r_c. \end{cases} \tag{2.4}$$

Specifically, we consider two sets of pairwise interaction parameters: (I) $a_{11} = 6.0$, $a_{12} = 3.0$, $a_{22} = 6.0$, $r_c = 1.5$, where $a_{12}$ represents the pairwise interaction between the component-1 and component-2 atoms; (II) $a_{11} = 3.0$, $a_{12} = 60.0$, $a_{22} = 3.0$, $r_c^{11} = 1.5$, $r_c^{12} = 2.5$, $r_c^{22} = 1.5$. The fluid shows a fully mixed state and an interface-separated state for the two cases, respectively.

2.2.2 Coarse-grained Models

For all three systems, we construct the CG models by representing each molecule as an individual particle. The positions of the CG particles are denoted by $\mathbf{Q} = [\mathbf{Q}_1, \mathbf{Q}_2, \cdots, \mathbf{Q}_M]$, where $\mathbf{Q}_i = \mathbf{Q}_i(\mathbf{q})$ represents the center of mass (COM) of the $i$-th molecule.
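The CG map $\mathbf{Q}_i(\mathbf{q})$ from atomistic positions to molecular centers of mass can be sketched as follows. The sketch rests on assumptions that are not stated in the text: the hypothetical function `centers_of_mass` expects atoms to be stored contiguously molecule by molecule, and it expects unwrapped coordinates, so that a molecule crossing a periodic boundary is not split. The random positions are placeholders, not simulation data.

```python
import numpy as np

def centers_of_mass(q, atoms_per_molecule, masses=None):
    """Map atom positions q (N, 3) to molecular centers of mass Q (M, 3).

    Assumes atoms are ordered molecule by molecule and coordinates are unwrapped.
    """
    q = np.asarray(q, dtype=float)
    mols = q.reshape(-1, atoms_per_molecule, 3)      # (M, atoms per molecule, 3)
    if masses is None:                               # unit atom mass, as in this study
        return mols.mean(axis=1)
    w = np.asarray(masses, dtype=float).reshape(-1, atoms_per_molecule, 1)
    return (w * mols).sum(axis=1) / w.sum(axis=1)

# Star polymer with one center atom and N_a arms of N_b atoms each.
n_a, n_b = 12, 6
atoms_per_molecule = 1 + n_a * n_b

rng = np.random.default_rng(0)
q = rng.normal(size=(5 * atoms_per_molecule, 3))     # placeholder positions for 5 molecules
Q = centers_of_mass(q, atoms_per_molecule)
```

Because all atoms carry unit mass here, the center of mass reduces to the arithmetic mean of the atom positions in each molecule.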
The conservative potential $U(\mathbf{Q})$ is determined by the marginal density function of $\mathbf{Q}$ with respect to the equilibrium density function of the full model, i.e.,

$$\rho(\mathbf{Q}) = \frac{\int e^{-V(\mathbf{q})/k_B T} \prod_{i=1}^{M} \delta(\mathbf{Q}_i(\mathbf{q}) - \mathbf{Q}_i)\,\mathrm{d}\mathbf{q}}{\int e^{-V(\mathbf{q})/k_B T}\,\mathrm{d}\mathbf{q}}, \qquad U(\mathbf{Q}) = -k_B T \ln \rho(\mathbf{Q}). \tag{2.5}$$

In DeePCG, a neural network $\tilde{U}(\mathbf{Q}; \boldsymbol{\Theta})$ is used to represent the CG potential $U(\mathbf{Q})$, where $\boldsymbol{\Theta}$ represents the neural network parameters. To keep the extensive property, the total energy is decomposed into local contributions of the individual CG particles:

$$\tilde{U}(\mathbf{Q}; \boldsymbol{\Theta}) = \sum_{i=1}^{M} \tilde{U}_{nn}(\mathbf{D}(\tilde{\mathbf{Q}}_i); \boldsymbol{\Theta}), \tag{2.6}$$

where $\tilde{U}_{nn}$ is the local potential of an individual particle and $\tilde{\mathbf{Q}}_i \in \mathbb{R}^{N_i \times 4}$ is the generalized coordinates of the $i$-th particle. It represents the local environment of the $i$-th particle relative to its $N_i$ neighboring particles within the cut-off $R_c$. In particular, the $j$-th row is defined as $\tilde{\mathbf{Q}}_{ij} = \left(s(r_j),\ s(r_j)x_j/r_j^2,\ s(r_j)y_j/r_j^2,\ s(r_j)z_j/r_j^2\right)$, where $\mathbf{r}_j = (x_j, y_j, z_j)$ denotes the relative position between the $i$-th particle and its $j$-th local neighbor. $s(r)$ is a smooth differentiable function that decays to 0 at $r = R_c$, which ensures that the force also smoothly decays to zero at the cut-off. $\mathbf{D} \in \mathbb{R}^{M_1 \times M_2}$ collects the symmetry-preserving features of each particle. The entries of $\mathbf{D}$ can be written as

$$D_{j,l}(\tilde{\mathbf{Q}}_i) = \left( \sum_{k=1}^{N_i} g_{1,j}(s(r_k); \boldsymbol{\Theta})\, \tilde{\mathbf{Q}}_{ik} \right) \left( \sum_{k=1}^{N_i} g_{2,l}(s(r_k); \boldsymbol{\Theta})\, \tilde{\mathbf{Q}}_{ik} \right)^{T}, \tag{2.7}$$

where $\{g_{1,j}(s(r); \boldsymbol{\Theta})\}_{j=1}^{M_1}$ and $\{g_{2,l}(s(r); \boldsymbol{\Theta})\}_{l=1}^{M_2}$ are neural networks mapping from the scalar $r$ to multiple features, and $M_1$ and $M_2$ are the numbers of customized features. $D_{j,l}$ preserves the translational and rotational invariance; the summation over index $k$ ensures the permutational symmetry. In this study, $s(r)$ is chosen as

$$s(r) = \begin{cases} \dfrac{1}{r}, & r \leq R_{cs} \\[2mm] \dfrac{1}{r} \left[ \dfrac{1}{2} \cos\left( \pi \dfrac{r - R_{cs}}{R_c - R_{cs}} \right) + \dfrac{1}{2} \right], & R_{cs} < r \leq R_c \\[2mm] 0, & r > R_c \end{cases} \tag{2.8}$$

where $R_{cs} = 0.97 R_c$ is a smooth cut-off parameter.
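The symmetry properties of the features in Eq. (2.7) can be checked numerically with a small NumPy sketch. It is a sketch under stated assumptions rather than the DeePCG implementation: the embedding networks $g_{1,j}$ and $g_{2,l}$ are replaced by fixed analytic stand-ins (powers and tanh functions of $s(r)$), the cut-off value `r_c = 10.0` is hypothetical, and the function names `switch`, `environment`, and `descriptor` are introduced only for illustration.

```python
import numpy as np

def switch(r, r_c, r_cs):
    """Smooth switching function s(r) of Eq. (2.8); assumes r > 0."""
    r = np.asarray(r, dtype=float)
    inner = 1.0 / r
    taper = inner * (0.5 * np.cos(np.pi * (r - r_cs) / (r_c - r_cs)) + 0.5)
    return np.where(r <= r_cs, inner, np.where(r <= r_c, taper, 0.0))

def environment(rel, r_c, r_cs):
    """Generalized coordinates (N_i, 4): rows (s, s*x/r^2, s*y/r^2, s*z/r^2)."""
    r = np.linalg.norm(rel, axis=1)
    s = switch(r, r_c, r_cs)
    return np.column_stack([s, (s / r ** 2)[:, None] * rel])

def descriptor(rel, r_c, r_cs, m1=4, m2=3):
    """Feature matrix D (M1, M2) of Eq. (2.7) with stand-in embedding functions."""
    q_tilde = environment(rel, r_c, r_cs)
    s = q_tilde[:, 0]
    # Hypothetical stand-ins for the trained networks g_1 and g_2.
    g1 = np.stack([s ** (p + 1) for p in range(m1)], axis=1)         # (N_i, M1)
    g2 = np.stack([np.tanh((p + 1) * s) for p in range(m2)], axis=1)  # (N_i, M2)
    return (g1.T @ q_tilde) @ (g2.T @ q_tilde).T

rng = np.random.default_rng(0)
r_c = 10.0                      # hypothetical cut-off, for illustration only
r_cs = 0.97 * r_c
rel = rng.uniform(-6.0, 6.0, size=(8, 3))        # relative positions of 8 neighbors
rot, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix

d_ref = descriptor(rel, r_c, r_cs)
d_rot = descriptor(rel @ rot.T, r_c, r_cs)                        # rotated environment
d_perm = descriptor(rel[rng.permutation(len(rel))], r_c, r_cs)    # relabeled neighbors
```

Rotating the neighbors multiplies the last three columns of $\tilde{\mathbf{Q}}_i$ by an orthogonal matrix, which cancels in the product of the two sums, and relabeling the neighbors only reorders the terms of each sum, so `d_rot` and `d_perm` agree with `d_ref` up to round-off. Translational invariance holds by construction because only relative positions enter.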
In principle, $\tilde{U}(\mathbf{Q}; \boldsymbol{\Theta})$ can be trained by minimizing the difference of the predicted force terms between the full micro-scale and the CG models, i.e., $\left\langle \|\nabla \tilde{U}(\mathbf{Q}; \boldsymbol{\Theta}) - \nabla U(\mathbf{Q})\|^2 \right\rangle_{\mathbf{Q}}$, where $\langle \cdot \rangle_{\mathbf{Q}}$ represents the conditional expectation with respect to the constraints of $\mathbf{Q}$, i.e., $\prod_{i=1}^{M} \delta(\mathbf{Q}_i(\mathbf{q}) - \mathbf{Q}_i)$. However, the evaluation of the force term $-\nabla U(\mathbf{Q})$ relies on constrained sampling with respect to $\delta(\mathbf{Q}(\mathbf{q}) - \mathbf{Q})$, which can be computationally expensive. On the other hand, we note that the instantaneous force $\mathcal{F}(\mathbf{Q})$ follows $\mathcal{F}(\mathbf{Q}) = -\nabla U(\mathbf{Q}) + \mathcal{R}(\mathbf{Q})$, where $\mathcal{R}(\mathbf{Q})$ is the zero-mean fluctuation force. Since the cross term vanishes under the conditional expectation, we have

$$\left\langle \|\nabla \tilde{U}(\mathbf{Q}; \boldsymbol{\Theta}) + \mathcal{F}(\mathbf{Q})\|^2 \right\rangle_{\mathbf{Q}} = \left\langle \|\nabla \tilde{U}(\mathbf{Q}; \boldsymbol{\Theta}) - \nabla U(\mathbf{Q})\|^2 \right\rangle_{\mathbf{Q}} + \left\langle \|\mathcal{R}(\mathbf{Q})\|^2 \right\rangle_{\mathbf{Q}},$$

where the last term does not depend on $\boldsymbol{\Theta}$ and is not involved in the training. Accordingly, we can carry out the training by minimizing the empirical loss

$$L = \sum_{i=1}^{S} \sum_{j=1}^{M} \left\| \nabla_{\mathbf{Q}_j} \tilde{U}(\mathbf{Q}^{(i)}; \boldsymbol{\Theta}) + \mathcal{F}_j(\mathbf{Q}^{(i)}) \right\|^2, \tag{2.9}$$

where the superscript represents the index of the $S$ configurations.

For the three micro-scale models specified in Sec. 2.2.1, we collect training samples from full MD trajectories of length 50, 200, and 250 (in reduced units), respectively. 5000 snapshots are used to train the CG potential function for each case. The networks are trained by the Adam stochastic gradient descent method (Kingma and Ba, 2015).

In particular, we emphasize that all the training samples are collected from thermal equilibrium states. As shown in Sec. 2.3, the constructed CG potentials naturally encode the many-body and heterogeneous interfacial interactions, which enable us to accurately predict rare events such as the probability of void formation and the scale-dependent apolar solvation energy.

2.3 Numerical Results

2.3.1 Bulk Fluids

Let us start with the CG model of fluids in bulk. Due to the constraint terms in Eq.
(2.5), the marginal probability density function $\rho(\mathbf{Q})$ generally cannot be represented in the form of the simple two-point correlation $\rho^{(2)}(\mathbf{Q}_i, \mathbf{Q}_j)$. Accordingly, the CG potential function $U(\mathbf{Q})$ generally exhibits a many-body nature and cannot be exactly constructed in the form of a pairwise interaction. This limitation was verified in earlier studies on the CG modeling of polymeric fluids (Lei et al., 2010; Hijón et al., 2010), where the CG interactions are constructed based on the pairwise decomposition, i.e.,

$$U(\mathbf{Q}) \approx \sum_{i \neq j} U^{(2)}(Q_{ij}), \qquad \frac{\mathrm{d}U^{(2)}(r)}{\mathrm{d}r} = -\left\langle \mathbf{F}_{ij}(\mathbf{Q}_{ij}) \cdot \mathbf{e}_{ij} \right\rangle_{Q_{ij} = r}, \tag{2.10}$$

where $\mathbf{e}_{ij} = \mathbf{Q}_{ij}/Q_{ij}$ represents the unit vector between the $i$-th and $j$-th particles.

To examine the model accuracy, we simulate the CG models with $U(\mathbf{Q})$ constructed in the form of both Eq. (2.6) and Eq. (2.10). Fig. 2.1 shows the obtained radial distribution functions (RDFs). Predictions from the full MD and the reduced model based on the DeePCG potential (2.6) show good agreement. In contrast, the pairwise CG potential (2.10) yields a pronounced over-estimation of the peak value near $r = 16$ because it over-simplifies the many-body CG potential as a two-body interaction; see also Lei et al. (2010) and Hijón et al. (2010). The many-body nature of $U(\mathbf{Q})$ is also manifested in the angular distribution functions (ADFs) $P(\theta; A_{rc})$, where $\theta$ is the angle determined by the relative positions of three molecules:

$$P(\theta; A_{rc}) = \left\langle \frac{1}{W} \sum_{i} \sum_{j \neq i} \sum_{k > j} \delta(\theta - \theta_{jik}) \right\rangle, \tag{2.11}$$

where $\theta_{jik}$ is the angle between $\mathbf{Q}_{ji}$ and $\mathbf{Q}_{ki}$, and $W$ is a normalization factor. The summation is over all the triplets $i, j, k$ such that $\|\mathbf{Q}_i - \mathbf{Q}_j\| \leq A_{rc}$ and $\|\mathbf{Q}_i - \mathbf{Q}_k\| \leq A_{rc}$. Fig. 2.2 shows the ADFs within four different cut-off regimes. Similar to the RDF, predictions of the DeePCG model agree well with the full MD model while the pairwise approximation yields apparent deviations.

Besides the equilibrium correlations, we further examine the fluid local compressibility.
While this property plays an important role in nano-scale hydrophobicity, canonical solvation theories generally refer to fluids in the proximity of vapor-liquid coexistence. Here we examine this property of bulk fluids for the validation of the constructed many-body CG potential $\tilde{U}(\mathbf{Q})$; the discussion of the apolar solvation energy is postponed to Sec. 2.3.2. Specifically, we examine the rare event of the void formation in bulk. Following Ref. (Patel et al., 2010), we define the smoothed molecule number within a probing spherical volume centered at $\mathbf{Q}_c$ by

$$\hat{n}\left(\{\mathbf{Q}_i\}_{i=1}^{M}\right) = \sum_{i=1}^{M} \left( \frac{1}{2} + \frac{1}{2}\tanh\left(\frac{R - Q_i}{h}\right) \right), \tag{2.12}$$

where $R$ is the radius of the probing sphere, $Q_i = \|\mathbf{Q}_i - \mathbf{Q}_c\|$ is the distance between the COM of molecule $i$ (or equivalently, the CG particle) and the spherical center, and $h = 1.0$ represents the smoothing length. By Eq. (2.12), the particle number $\hat{n}$ is differentiable with respect to the individual molecule position $\mathbf{Q}_i$. Similar to Ref. (Patel et al., 2010), we can probe the probability of the void formation by establishing a set of umbrella-sampling replicas with the bias potential

$$U_{\mathrm{bias}}(\hat{n}; n_j) = \frac{k_n}{2}(\hat{n} - n_j)^2, \tag{2.13}$$

where $k_n$ is the magnitude of the bias potential and $n_j$ is the target value of the particle number inside the domain, as shown in Fig. 2.3(b). We set $k_n = 21.9$ and establish 40 independent simulations with $n_j$ evenly distributed between 0 and 7.5. For each replica, we collect $8 \times 10^5$ samples of $\hat{n}$ from a 1600-long trajectory. By using the weighted histogram analysis method (Kumar et al., 1992b), we can stitch the joint probability density $\rho(\hat{n}, n_j)$ to construct $\rho(\hat{n})$. Fig. 2.3(c) shows the probability density $\rho(\hat{n})$ obtained from the full MD and the reduced model. The predictions of the DeePCG model agree well with the full MD model over the full regime of $\hat{n}$.
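As an illustration of Eqs. (2.12) and (2.13), the following Python sketch evaluates the smoothed particle number, the harmonic bias energy, and the resulting bias force on each CG particle. It is a minimal sketch rather than the implementation used in this work: the function names are hypothetical, periodic images are ignored, and it is assumed that no particle sits exactly at the sphere center.

```python
import math

def smoothed_count(Q, Qc, R, h=1.0):
    """Smoothed particle number n_hat of Eq. (2.12) for positions Q."""
    return sum(0.5 * (1.0 + math.tanh((R - math.dist(q, Qc)) / h)) for q in Q)

def bias_energy(Q, Qc, R, n_target, k_n, h=1.0):
    """Harmonic umbrella bias of Eq. (2.13)."""
    return 0.5 * k_n * (smoothed_count(Q, Qc, R, h) - n_target) ** 2

def bias_forces(Q, Qc, R, n_target, k_n, h=1.0):
    """Force -dU_bias/dQ_i on every particle (chain rule through n_hat)."""
    prefactor = k_n * (smoothed_count(Q, Qc, R, h) - n_target)
    forces = []
    for q in Q:
        d = math.dist(q, Qc)  # assumed nonzero
        # d n_hat / d Q_i = slope * (Q_i - Qc) / d, with sech^2 = 1 - tanh^2
        slope = -(1.0 - math.tanh((R - d) / h) ** 2) / (2.0 * h)
        forces.append(tuple(-prefactor * slope * (a - c) / d
                            for a, c in zip(q, Qc)))
    return forces
```

When the target $n_j$ lies below the current value of $\hat{n}$, the bias force in this sketch points radially outward and acts mainly on particles within a few smoothing lengths of the sphere surface, which is how the bias drives the system toward void formation.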
Finally, we examine the normalized density fluctuation $\delta n/\langle n \rangle$ within a spherical volume of various sizes, where $\langle n \rangle$ is the average particle number and $\delta n = \sqrt{\langle (\hat{n} - \langle n \rangle)^2 \rangle}$ is the standard deviation. Specifically, we define the particle number by Eq. (2.12) with two different smoothing lengths, $h = 1.0$ and $h = 0.1$. The latter case essentially represents each molecule as a simple point and counts the particle number as integers, and therefore yields larger density fluctuations. As shown in Fig. 2.3(d), the full MD and CG models show good agreement for both cases, indicating that the CG model can faithfully capture the high-order correlations and the local compressibility beyond the continuum thermodynamic limit.

Figure 2.1 Radial distribution function $g(r)$ of the molecule COM obtained from the full MD simulation, the CG model using the pairwise force approximation by Eq. (2.10), and the DeePCG model.

2.3.2 Single-component Interfacial Fluids

Besides the many-body interactions, another hallmark of interfacial fluids is the heterogeneous molecular distribution across the fluid interface, which leads to scale-dependent interfacial interactions and fluctuations. On the macro-scale level, the interfacial interactions can be generally described by continuum models such as the Young-Laplace equation (Rowlinson and Widom, 2002); the apolar solvation energy is proportional to the interfacial area and characterized by the surface tension. However, on the length scale comparable to the correlation length of the fluid molecules, the interfacial energy often exhibits a cross-over regime representing the transition from volume-dependent to area-dependent scaling. Therefore, the meso-scale interfacial energy provides a crucial metric to validate the accuracy of the CG model.

First, we examine the interfacial thermal fluctuations. With the micro-scale model specified in Sec.
2.2.1, the fluid molecule interaction consists of both short-range repulsion and long-range attraction. At thermal equilibrium, the fluid system exhibits fluid-void interfaces near $z = 20$ and $z = 180$. The periodic boundary condition is imposed along the $x$- and $y$-directions. To quantify the molecule distribution near the interface at $z = z_0$, we define the smoothed density field $\rho_s(\mathbf{R})$ by

$$\rho_s(\mathbf{R}) = \sum_{i=1}^{M} W(\|\mathbf{R} - \mathbf{Q}_i\|, h) \tag{2.14}$$

on the $N_x \times N_y \times N_z$ lattice grids. Specifically, $\mathbf{R}_{(i,j,k)} := (x_i, y_j, z_k)$, where $(x_i, y_j) = (i, j) \times dl$, $dl = L/N_x$ and $z_k = z_0 - h + k \times dz$, $dz = 2h/N_z$. $\mathbf{Q}_i$ represents the COMs of the neighboring molecules for each grid point. $W(r, h)$ represents the quintic spline kernel function (Morris et al., 1997) with finite support $h$. In this study, we set $h = 30.0$, $dl = 1.8$ and $dz = 0.2$.

Figure 2.2 Angular distribution function $P(\theta)$ of the molecule COM obtained from the full MD simulation, the pairwise CG model and the DeePCG model with different cut-off regimes $A_{rc}$ (panels: $A_{rc} = 13$, 20, 30, and 35).

Figure 2.3 The density fluctuation and the molecule number distribution within a spherical probing volume. (a) A sketch of the star polymer with the black atom as the center. Atoms in the same arm have the same color. The transparent particle represents the coarse-grained molecule. (b) A sketch of the instantaneous molecule position under bias potential (2.13). The iso-surface in blue color represents the interface of the void space. (c) The probability density function of the molecule number within a spherical volume of radius $R = 16.0$. The vertical dashed line represents the average molecule number under equilibrium. (d) The normalized density fluctuations within a spherical volume of radius $R$ between 8.0 and 16.0. The particle number is defined by Eq. (2.12) with the resolution length $h$ set to be 0.1 (solid lines) and 1.0 (dashed lines).

The smoothed density field $\rho_s(\mathbf{R})$ enables us to define the instantaneous surface (IS) height $\tilde{h}(x, y)$ as the iso-surface of the fluid density (Willard and Chandler, 2010), i.e.,

$$\rho_s(x, y, \tilde{h}(x, y)) = \rho_0/2, \tag{2.15}$$

where $\rho_0$ is the bulk fluid density, as shown in Fig. 2.4(a). Accordingly, we can compute the IS density distribution $\tilde{\rho}(z)$ along the $z$-direction, where the reference position is chosen to be $\tilde{h}(x, y)$ for each grid point $(x, y)$. As shown in Fig. 2.4(c), $\tilde{\rho}(z)$ exhibits apparent oscillations across the instantaneous surface. The peaks near $z = 6$ and $z = 16$ represent the first and the second layer of the fluid molecules near the interface. Alternatively, we can compute the density distribution $\rho(z)$ with respect to the plane at the average of the instantaneous height $\langle \tilde{h}(x, y) \rangle$, i.e., the Gibbs dividing surface. Different from $\tilde{\rho}(z)$, $\rho(z)$ shows a smooth transition from 0 to the bulk value across the interface. For both definitions, the predictions from the CG model agree well with the full MD simulations. We emphasize that the learning of the DeePCG potential does not involve any human intervention such as the definitions of the density field and the interface height. The consistent predictions between the MD and CG models validate that the constructed DeePCG potential $\tilde{U}(\mathbf{Q}; \boldsymbol{\Theta})$ faithfully captures the intrinsic fluid structure near the interface.

Figure 2.4 The fluid density and the fluctuating interface of the single-component interfacial fluid system. (a) The interface defined by Eq. (2.15) with molecules (red) and interface (green). (b) A sketch of the instantaneous density field defined by Eq. (2.14). (c) The average density profile across the Gibbs dividing surface (GDS) and the instantaneous interface (IS) defined by Eq. (2.15). (d) The ensemble average of the capillary wave spectrum of the fluctuating interface. The solid line in red represents the CWT fitting using Eq. (2.17) at the low wave number.

To further examine the interfacial fluctuations, we evaluate the Fourier spectrum of the instantaneous height $\tilde{h}(x, y)$, i.e.,

$$\hat{h}(\mathbf{k}) = \frac{1}{L^2} \int_0^L \int_0^L \tilde{h}(x, y)\, e^{-i k_x x - i k_y y}\, \mathrm{d}x\, \mathrm{d}y, \tag{2.16}$$

where $\mathbf{k} = (k_x, k_y)$ is the 2D wave number. Fig. 2.4(d) shows the ensemble average of the spectrum $\langle |\hat{h}(\mathbf{k})|^2 \rangle$. In the low wave number limit, the interfacial energy is governed by the surface tension, with equipartition among the individual Fourier modes following the capillary wave theory (CWT) (Buff et al., 1965; Evans, 1979), i.e.,

$$\left\langle |\hat{h}(\mathbf{k})|^2 \right\rangle = \frac{k_B T}{L^2 \gamma |\mathbf{k}|^2}, \tag{2.17}$$

where $\gamma$ is the surface tension. At low wave number, $\langle |\hat{h}(\mathbf{k})|^2 \rangle$ obtained from numerical simulations shows good agreement with the CWT. As the wave number increases, the spectrum gradually deviates from the CWT prediction, indicating that there exist strong correlations between the height fluctuations of neighboring sites on the molecular scales. Nevertheless, the predictions from the CG model agree well with the MD results over the entire wave number regime. In particular, the good agreement in the high wave number regime shows that the CG model can accurately capture the local roughness of the interface, which is extremely sensitive to the molecule spatial correlations and the many-body interactions.

Next, we examine the meso-scale, size-dependent apolar solvation energy. Similar to the bulk system considered in Sec.
2.3.1, we examine the probability density function of the number of molecules $P(\hat{n})$ within a spherical volume of radius $R = 25.0$. As shown in Fig. 2.5(a), the predictions from the full MD and the CG model agree well over the full regime of $\hat{n}$. In particular, in the quasi-equilibrium regime, the interfacial energy is mainly determined by the fluid compressibility; $\ln P(\hat{n})$ and $\hat{n}$ follow the quadratic relationship, i.e., $-\ln P(\hat{n}) \propto (\hat{n} - \langle n \rangle)^2/\delta n^2$. Since both $\hat{n}$ and $\delta n^2$ scale with the volume, the free energy $-k_B T \ln P(\hat{n})$ scales with the volume near $\langle n \rangle$. In contrast, $P(\hat{n})$ deviates from the quadratic relationship as $\hat{n}$ decreases and yields a larger value of $P(0)$. The fat tail arises from the formation of a clear void-fluid interface. In particular, on the scale beyond the correlation length of fluid molecules, the local molecular reorganization is insufficient to accommodate the phase separation. Accordingly, the interfacial energy scales with the surface area of the void space.

The multi-faceted nature of the interface energy can be further examined by computing the apolar solvation free energy $\Delta G = -k_B T \ln P(0)$ for different sizes of the void space. By the theory of Pratt and his co-workers (Hummer et al., 1996), for a small void space, $\Delta G$ is governed by the molecule number fluctuations with the Gaussian distribution, i.e.,

$$\Delta G \approx \frac{1}{2} k_B T\, \hat{n}^2/\delta n^2 + \frac{1}{2} k_B T \ln 2\pi \delta n^2, \tag{2.18}$$

where $\hat{n}^2/\delta n^2$ scales with the space volume $\tfrac{4}{3}\pi R^3$. On the large scale, $\Delta G$ is determined by the macro-scale surface tension $\gamma$, i.e., $\Delta G \approx 4\pi R^2 \gamma$.

Figure 2.5 (a) The probability density function of the molecule number within a spherical volume of radius $R = 25.0$. The red line represents the quadratic fitting; the deviation near $n = 0$ arises from the formation of a clear fluid-void interface, where the free energy approximately scales with the area of the interface. (b) Normalized solvation free energy $\Delta G(R)/4\pi R^2$ obtained from the thermodynamic integration sampling by Eq. (2.19). The transition from the volume- to area-scaling occurs between $R = 15$ and 25. The two symbols represent the predictions from the probability of the void space $-k_B T \ln P(0)$ for $R = 25$ in (a). The dashed horizontal line represents the macro-scale limit with the surface tension $\gamma$ obtained from the fluctuating interface using CWT (Eq. (2.17)) presented in Fig. 2.4.

To quantify the cross-over regime, we conduct the thermodynamic integration sampling of $\Delta G(R)$ with $R$ between 0 and 34. The integration force $\mathrm{d}\Delta G/\mathrm{d}R$ is estimated by imposing the biased potential, i.e.,

$$\frac{\mathrm{d}\Delta G(R)}{\mathrm{d}R} = \left\langle \sum_{i=1}^{M} \nabla_{\mathbf{Q}_i} U_{\mathrm{bias}} \cdot \frac{\mathbf{Q}_i - \mathbf{Q}_c}{\|\mathbf{Q}_i - \mathbf{Q}_c\|} \right\rangle, \tag{2.19}$$

where $U_{\mathrm{bias}}(\hat{n}; 0)$ is defined by Eq. (2.13) with $k_n = 29.20$ and $h = 0.4$. Fig. 2.5(b) shows the obtained solvation energy $\Delta G(R)$ normalized by the surface area. The predictions of the CG and the full MD models show good agreement. In particular, at small values of $R$, $\Delta G(R)/4\pi R^2$ grows with $R$, which implies the volume-scaling regime. The transition from the volume- to the area-scaling occurs between $R = 15$ and $R = 25$. For $R > 30$, $\Delta G(R)/4\pi R^2$ approaches the value of the macro-scale surface tension $\gamma$ estimated from the interfacial fluctuations by the CWT (2.17) shown in Fig. 2.4.

Figure 2.6 The average equilibrium fluid density at a distance $r + R$, where $R$ is the radius of the spherical void space, with $R = 10$ (left) and $R = 30$ (right).

The scale-dependent interfacial energy is also manifested in the solvent density distribution in the vicinity of the void space. Fig. 2.6 shows the normalized radial distribution function $g(r + R)$ adjacent to the interface. For $R = 10$, solvation is governed by the local compressibility and molecule re-organization, leading to the high fluid density adjacent to the interface.
For $R = 30$, solvation leads to a clear fluid-void interface and the fluid density is closer to the bulk value. The CG model accurately captures the transition and agrees well with the full MD results for both cases.

2.3.3 Two-component Fluids

We first consider a two-component fluid system that takes the parameter set (I) specified in Sec. 2.2.1. With this parameter set, the full MD system maintains a fully mixed state. The reduced model is represented by CG particles of two different types, and its equilibrium state is fully mixed as well. Fig. 2.7 shows the radial distribution functions of the COM of the molecules. Due to the “hydrophilic” interactions between type-1 and type-2 molecules, the pair distribution between type 1-2 shows a more pronounced peak at $r = 11.5$ as compared with the distribution between type 1-1 at $r = 12.5$. Similar to Sec. 2.3.1, we compute the angular distribution functions among the molecules of both types. For all of the correlation functions, the CG and full MD models show good agreement.

Figure 2.7 (a) Radial distribution function $g(r)$ of the two-component, miscible polymer fluid system among the molecule COM of type 1-1, type 1-2. (b) Angular distribution function $P(\theta)$ of the same system among the molecule COM of type 1.

Next, we consider the parameter set (II) specified in Sec. 2.2.1. Due to the “hydrophobic” interaction between the two molecule types, the system develops into an immiscible state with a clear interface between the two components, as shown in Fig. 2.8(a). To examine the heterogeneous fluid particle distribution, we analyze the radial distribution functions of the fluid particles on the $x$-$y$ plane at different regimes. Fig. 2.8(b) shows the planar RDFs sampled at $z = 60$ (interface) and $z = 30$ (bulk).
In particular, the planar RDF near the interfacial regime shows more pronounced peaks and structural oscillations compared with the RDF in the bulk regime. For both cases, the predictions from the CG model show good agreement with the full MD simulations.

To further quantify the fluid density across the interface, we define the density field $\rho_s(\mathbf{R})$ by Eq. (2.14) on the lattice grids across the average interface of the two components (i.e., GDS) and the instantaneous height $\tilde{h}(x, y)$ as the iso-surface of the fluid density of a single component (i.e., IS). For this system, we set $h = 40$, $dl = 2.0$ and $dz = 1.0$. Fig. 2.8(c) shows the density profiles $\tilde{\rho}(z)$ and $\rho(z)$ across the interface based on the definition of IS and GDS, respectively. Similar to the single-component fluid system, $\tilde{\rho}(z)$ shows pronounced oscillations that represent the intrinsic multi-layer fluid structure across the interface. In contrast, $\rho(z)$ shows a smooth transition across the interface due to the ensemble-averaged definition of the interface plane. The consistent predictions between the MD and CG models validate the accuracy of the constructed DeePCG potential.

Finally, we examine the thermal fluctuations across the interface. Fig. 2.8(d) shows the ensemble average of the Fourier spectrum density $\langle |\hat{h}(\mathbf{k})|^2 \rangle$ of the instantaneous height $\tilde{h}(x, y)$ defined by Eq. (2.16). Similar to the single-component interfacial fluid system, $\langle |\hat{h}(\mathbf{k})|^2 \rangle$ agrees well with the CWT at low wave number and deviates from the $1/|\mathbf{k}|^2$ scaling at high wave number due to the local spatial correlations between the molecules. The predictions from the CG and full MD models show good agreement over the full regime.

Figure 2.8 The fluid density and the fluctuating interface of the two-component, immiscible fluid system. (a) The interface defined by Eq. (2.15) with type-1 (blue) and type-2 molecules (red). (b) Radial distribution function $g(r)$ of type-2 molecules on the $x$-$y$ plane near the bulk ($z = 30$) and the interface ($z = 60$). (c) The average density profile across the Gibbs dividing surface and the instantaneous surface defined by Eq. (2.15). (d) The capillary wave spectrum of the fluctuating interface.

2.4 Summary

In this study, we constructed coarse-grained models of meso-scale interfacial polymeric fluids based on the DeePCG scheme (Zhang et al., 2018b). In particular, the constructed CG potential can accurately encode the many-body interactions arising from the unresolved atomistic interactions, as well as the heterogeneous molecule distributions near the interface. This unique feature ensures that the constructed CG models retain an invariant distribution consistent with the full MD model and faithfully capture the multi-faceted, scale-dependent interfacial energy without additional human intervention. The training process only requires the MD samples of the instantaneous force field without further ad hoc assumptions and approximations of the CG potential functions.

While we focus on polymeric fluids in this study, the present CG models can be generalized to complex fluids and soft matter systems where the many-body and heterogeneous effects are often pronounced. In particular, the constructed CG potential functions accurately reproduce the pairwise and high-order correlation functions while the empirical approximations show limitations.
Moreover, the accurate predictions of the local compressibility and the full-range spectrum of the interfacial fluctuations demonstrate the validity of the CG models to probe the collective behaviors across the molecular and continuum scales. More importantly, the CG models successfully predict the probability of the void formation as a rare event and the transition from the volume- to the area-scaling of the solvation energy. The accurate predictions of such properties show the promise of the present models for studying challenging problems relevant to nanoscale assembly processes (Miller et al., 2007), where full MD simulations often cannot reach the required spatio-temporal scales.

Finally, we note that the present study focuses on the quasi-equilibrium properties of the reduced model. The zero-rate shear viscosity predicted by the DeePCG model is 55.56% less than the value of the full MD model. The predictive modeling of the dynamic properties further relies on the accurate construction of the memory and fluctuation terms that represent the unresolved energy-dissipation processes (Hijón et al., 2010; Lei et al., 2016; Lei and Li, 2021; She et al., 2023). Also, it is worth exploring the construction of CG potential functions with certain generalization abilities that account for different temperatures (Zhang et al., 2020) and model resolutions (Empereur-mot et al., 2022). We will pursue these problems in future studies.

CHAPTER 3
DATA-DRIVEN LEARNING OF THE STATE-DEPENDENT MEMORY KERNEL

In this chapter, we focus on how to construct the memory kernel. In Ge et al. (2024), we present a data-driven method to learn stochastic reduced models of complex systems that retain a state-dependent memory beyond the standard generalized Langevin equation (GLE) with a homogeneous kernel. The constructed model naturally encodes the heterogeneous energy dissipation by jointly learning a set of state features and the non-Markovian coupling among the features.
Numerical results demonstrate the limitation of the standard GLE and the essential role of the broadly overlooked state-dependency nature in predicting molecule kinetics related to conformation relaxation and transition.

3.1 Introduction

Predicting the collective behavior of complex multiscale systems is often centered around projecting the full-dimensional dynamics onto a set of resolved variables. However, an accurate construction of such a reduced model remains a practical challenge for real applications such as molecular modeling. While model reduction frameworks such as the Koopman operator (Koopman, 1931) and the Mori-Zwanzig projection formalism (Mori, 1965; Zwanzig, 1961) enable us to write down the dynamic equations in terms of the resolved variables, the reduced model generally becomes non-Markovian with a memory term that may further depend on the resolved variables; the direct numerical evaluation involves solving the expensive full-dimensional orthogonal dynamics. In practice, one common approximation is to ignore such state-dependency; the reduced model is simplified as the standard generalized Langevin equation (GLE) (Zwanzig, 2001) with a memory kernel that only depends on time. Several approaches (Lange and Grubmüller, 2006; Darve et al., 2009; Ceriotti et al., 2009; Baczewski and Bond, 2013; Davtyan et al., 2015; Lei et al., 2016; Russo et al., 2019; Jung et al., 2017; Lee et al., 2019; Ma et al., 2019; Wang et al., 2020; Zhu and Venturi, 2020; Klippenstein and van der Vegt, 2021; Vroylandt et al., 2022; She et al., 2023; Xie et al., 2022) have been developed to construct the memory kernel such that certain dynamic properties (e.g., the two-point correlations) can be properly reproduced. Despite its broad application, the validity of the standard GLE for real multiscale systems remains less understood (Hänggi, 1997; Klippenstein et al., 2021).
Intuitively, the above model reduction problem is somewhat analogous to hiking on a mountain where the landscape map and the path roughness represent the free energy and the memory term, respectively. In general, we should not expect homogeneous path roughness at the different locations (e.g., the valleys and the ridges), which, conversely, needs to be inferred from the hiking records. Indeed, studies based on full molecular dynamics (MD) simulations (Posch et al., 1984; Straub et al., 1987, 1990; Plotkin and Wolynes, 1998; Luo et al., 2006; Best and Hummer, 2006, 2010; Hinczewski et al., 2010; Satija et al., 2017; Morrone et al., 2012; Daldrop et al., 2017) and sophisticated projection operator construction (Deutch and Oppenheim, 1971; Zwanzig, 1973, 1992; Berezhkovskii and Szabo, 2011; Glatzel and Schilling, 2022; Vroylandt, 2022; Vroylandt and Monmarché, 2022; Ayaz et al., 2022a; Jung and Jung, 2023) show that the extracted memory term can exhibit a pronounced state-dependent nature, where the implications for the collective behaviors remain under-explored. For extensive MD systems, a recent study (Lyu and Lei, 2023) on reduced modeling of polymer melt shows that the heterogeneous inter-molecular energy dissipation (i.e., the memory) can be crucial for transport on the hydrodynamic scale. However, for canonical non-extensive problems such as biomolecule systems, a quantitative understanding of the state-dependent memory effect on the reduced dynamics remains an open problem. Several recent works (Lei et al., 2016; Lee et al., 2019; Satija and Makarov, 2019; Grogan et al., 2020; Singh et al., 2021; Ayaz et al., 2021; Vroylandt et al., 2022; Dalton et al., 2023) model the non-Markovian effect for transition dynamics based on the standard GLE. 
While elegant semi-analytical studies (Straub et al., 1988; Singh et al., 1990; Carmeli and Nitzan, 1983; Tarjus and Kivelson, 1991; Krishnan et al., 1992; Voth, 1992; Straus et al., 1993; Haynes et al., 1993, 1994; Cossio et al., 2015) on idealized 1D double-well potential provide theoretical insights into the state-dependent nature, quantitative modeling that retains the reduced dynamics consistent with the full MD model, including collective properties such as transition and conformation relaxation, relies on accurate construction and efficient simulation of a reduced model beyond the standard GLE.

This work presents a data-driven approach for learning a new stochastic reduced model that retains a state-dependent memory for non-extensive systems. Instead of dealing with the orthogonal dynamics (Darve et al., 2009; Vroylandt and Monmarché, 2022; Lyu and Lei, 2023), the training only relies on the trajectory samples and does not directly solve the Mori-Zwanzig projection formalism. The main idea is to seek a generalized representation of the memory as the composition of a set of state-dependent features, which encodes the coupling between the resolved and unresolved variables and will be learned using three-point correlation functions. Efficient training is achieved by constructing the encoders using a set of sparse bases, whose correlations can be efficiently pre-computed. The time-dependent component is directly learned in the Fourier space which enables the efficient evaluation of the convolution term via the FFT and meanwhile ensures non-negative energy dissipation (i.e., model stability). To simulate the model, coherent noise can be introduced that strictly satisfies the second fluctuation-dissipation theorem (FDT) and retains a consistent invariant distribution.
The present model, with a new memory form, essentially reveals a caveat in model reduction of multiscale systems and provides a reliable approach for simulating the stochastic reduced dynamics beyond empirical models. It enables us to probe open problems such as the effect of state-dependent memory on molecular kinetics. Numerical results show that the broadly overlooked state-dependency can play a crucial role. In particular, the standard GLE is insufficient to capture the collective properties such as conformation relaxation and transition rate distribution, which, fortunately, can be reproduced by the present model.

3.2 Model Derivation

Let $(\mathbf{q}, \mathbf{p}) \in \mathbb{R}^{2m}$ represent the resolved variables of a high-dimensional Hamiltonian system, where $\mathbf{q}$ denotes the coarse-grained (CG) coordinates as a function of the position variables of the full model, and $\mathbf{p}$ denotes the CG momenta. Following Zwanzig's formalism (Zwanzig, 2001; Hijón et al., 2010), the reduced dynamics takes the form

$$\dot{\mathbf{q}} = \mathbf{M}^{-1}\mathbf{p}, \qquad \dot{\mathbf{p}} = -\nabla U(\mathbf{q}) - \int_0^t \mathbf{K}(\mathbf{q}(\tau), t - \tau)\,\mathbf{v}(\tau)\,\mathrm{d}\tau + \mathbf{R}_t, \tag{3.1}$$

where $\mathbf{M}$ is the mass matrix, $U(\mathbf{q})$ is the free energy, $\mathbf{v} := \dot{\mathbf{q}}$ is the velocity, $\mathbf{K}(\mathbf{q}, t)$ is the memory, and $\mathbf{R}_t$ is the noise whose covariance function is related to the memory following the second FDT (Vroylandt and Monmarché, 2022). Before proceeding to the construction of $\mathbf{K}(\mathbf{q}, t)$, we note that the rigorous form based on Zwanzig's formalism depends on both $\mathbf{q}$ and $\mathbf{p}$. Here we focus on the state-dependence on $\mathbf{q}$ and assume the memory is independent of $\mathbf{p}$ (Hijón et al., 2010). Furthermore, $\mathbf{M}$ generally depends on $\mathbf{q}$; the current choice of $\mathbf{q}$ leads to a constant mass matrix [see Refs. (Lee et al., 2019; Ayaz et al., 2022a) and Section 3.5.1].
Also, the construction of the free energy $U(\mathbf{q})$ can be nontrivial; several canonical methods based on enhanced sampling (Torrie and Valleau, 1977; Kumar et al., 1992a; Darve and Pohorille, 2001; Laio and Parrinello, 2002) and temperature acceleration (Rosso et al., 2002; Maragliano and Vanden-Eijnden, 2006; Abrams and Tuckerman, 2008; Maragliano and Vanden-Eijnden, 2008) have been developed to facilitate the phase space exploration. We assume the phase space can be effectively explored and $U(\mathbf{q})$ is known a priori.

Instead of rigorously constructing $\mathbf{K}(\mathbf{q}, t)$ from the full model, we ask the question of which forms of $\mathbf{K}$ can generate a memory effect. One common approach is to embed the memory in a larger Markovian dynamics with a set of auxiliary variables. An essential observation is that the memory term can be generally written as

$$\mathbf{K}(\mathbf{q}(\tau), t - \tau) \approx \mathcal{C}_{+} \circ \exp\left((t - \tau)\mathcal{L}_{\mathrm{aux}}\right) \circ \mathcal{C}_{-}, \tag{3.2}$$

where $\mathcal{L}_{\mathrm{aux}}$ is the Liouville operator corresponding to the auxiliary dynamics and $\mathcal{C}_{\pm}$ are channels representing the coupling of the resolved and auxiliary variables. As a special case, if the coupling and the auxiliary dynamics take a linear form, the embedded memory recovers the standard GLE kernel, i.e., $\mathbf{K}(\mathbf{q}, t) = \mathbf{K}(t)$ (e.g., see Refs. (Lei and Li, 2021; She et al., 2023)). Therefore, to construct the reduced model beyond the standard GLE, the coupling channels need to properly retain certain kinds of state-dependency. This motivates us to represent $\mathcal{C}_{\pm}$ by seeking a set of state-dependent features $\phi(\mathbf{q}) = [\phi_1(\mathbf{q}), \cdots, \phi_n(\mathbf{q})]$, where $\phi : \mathbb{R}^m \to \mathbb{R}^{n \times m}$ essentially encodes the nonlinear coupling between the resolved and unresolved variables and the detailed form will be specified later.
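To illustrate the linear special case noted above, the following Python sketch embeds a scalar memory in a two-variable linear auxiliary dynamics and recovers a state-independent kernel $K(t) = \mathbf{c}^T \exp(t\mathbf{A})\,\mathbf{c}$. This is only a sketch under stated assumptions: the matrix $\mathbf{A}$ (a damped rotation with rate `alpha` and frequency `omega`), the coupling vector $\mathbf{c} = (c_0, 0)$, and all function names are illustrative choices, not quantities taken from this work.

```python
import math

def embedded_kernel(alpha, omega, c0, t, dt=1e-3):
    """Return K(t) = c^T exp(t A) c by integrating dz/dt = A z with RK4.

    A = [[-alpha, omega], [-omega, -alpha]] and c = (c0, 0) are
    illustrative choices for a linear auxiliary dynamics.
    """
    def rhs(z):
        return (-alpha * z[0] + omega * z[1], -omega * z[0] - alpha * z[1])

    def shifted(z, k, s):  # z + s * k
        return (z[0] + s * k[0], z[1] + s * k[1])

    steps = max(1, round(t / dt))
    step = t / steps
    z = (c0, 0.0)  # initial condition z(0) = c
    for _ in range(steps):
        k1 = rhs(z)
        k2 = rhs(shifted(z, k1, 0.5 * step))
        k3 = rhs(shifted(z, k2, 0.5 * step))
        k4 = rhs(shifted(z, k3, step))
        z = (z[0] + step * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             z[1] + step * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return c0 * z[0]  # c^T z(t)

def closed_form_kernel(alpha, omega, c0, t):
    """Closed form of the same kernel: c0^2 * exp(-alpha t) * cos(omega t)."""
    return c0 ** 2 * math.exp(-alpha * t) * math.cos(omega * t)
```

Because neither $\mathbf{A}$ nor $\mathbf{c}$ depends on $\mathbf{q}$ in this sketch, the resulting kernel depends on the time lag only; letting the coupling depend on $\mathbf{q}$ is what introduces the state dependence discussed in the text.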
$\exp(t\mathcal{L}_{\mathrm{aux}})$ induces the non-Markovian interactions among the features with a time lag $t$ characterized by a kernel function, i.e., $\mathcal{C}_{+} \circ \exp((t - \tau)\mathcal{L}_{\mathrm{aux}}) \circ \mathcal{C}_{-} = \phi(\mathbf{q}(t))^T \Theta(t - \tau)\, \phi(\mathbf{q}(\tau))$, where $\Theta : \mathbb{R}^{+} \to \mathbb{R}^{n \times n}$ and the component $\Theta_{ij}(t - \tau)$ represents the dissipation between features $\phi_i(\mathbf{q}(t))$ and $\phi_j(\mathbf{q}(\tau))$. In the remainder of this work, we use $\phi_t$ to denote $\phi(\mathbf{q}(t))$. With the above observation, we propose the following form to model the reduced dynamics (3.1), i.e.,

$$\dot{\mathbf{q}} = \mathbf{M}^{-1}\mathbf{p}, \qquad \dot{\mathbf{p}} = -\nabla U(\mathbf{q}) - \int_0^t \phi_t^T\, \Theta(t - \tau)\, \phi_\tau \mathbf{v}(\tau)\,\mathrm{d}\tau + \mathbf{R}_t, \tag{3.3}$$

where the encoders $\{\phi_i(\mathbf{q})\}_{i=1}^{n}$ and the kernel $\Theta(t)$ need to be determined. As a special case, at the Markovian limit $\Theta(t) \propto \delta(t)$, Eq. (3.3) recovers the Langevin dynamics and the quadratic form $\phi^T \phi$ ensures positive energy dissipation. Also, by choosing $\Theta(t)$ to be diagonal with individual components corresponding to certain frequency modes, Eq. (3.3) reduces to the heat bath model (Zwanzig, 1973) with a nonlinear coupling of bath coordinates. On the other hand, the present model enables an adaptive choice of the number of spatial features and a more general form of $\Theta(t)$ with the off-diagonal components capturing the non-Markovian coupling among the features, which turns out to be crucial for reproducing the collective dynamics.

3.2.1 Coherent Noise and Invariant Density of the Reduced Model

We emphasize that Eq. (3.3) should not be viewed as a direct approximation of Zwanzig's projection formalism. Rather, it serves as a reduced model that faithfully retains the state-dependent memory effect. To construct the model, we represent the encoders $\{\phi_i(\mathbf{q})\}_{i=1}^{n}$ and the kernel $\Theta(t)$ in the form of

$$\phi_i(\mathbf{q}) = \mathbf{H}_i^T \psi(\mathbf{q}), \qquad \Theta(t) = e^{-\alpha t} \sum_{k=0}^{N_\omega} \hat{\Theta}_k \cos(\omega_k t), \tag{3.4}$$

where $\psi(\mathbf{q}) = [\psi_1(\mathbf{q}), \cdots, \psi_{N_b}(\mathbf{q})]$ is a set of sparse bases and $\mathbf{H} = [\mathbf{H}_1^T, \cdots, \mathbf{H}_n^T]$ are trainable coefficients, $\omega_k = \frac{2\pi}{T_c} k$ and $T_c$ is the time domain cut-off of the kernel. We note that $e^{-\alpha t}$ should not be viewed as the bases to approximate $\Theta(t)$ (e.g., $\{e^{-\alpha_i t}\cos(\beta_i t), e^{-\alpha_i t}\sin(\beta_i t)\}_{i=1}^{N_\alpha}$; see Refs. (Lei et al., 2016; Lee et al., 2019)). Rather, $\Theta(t)$ is mainly characterized by the Fourier series expansion on $[0, T_c]$, and the exponential term $e^{-\alpha t}$ is essentially a regularization term to eliminate the periodicity while maintaining the positive semi-definiteness condition. $\Theta(t)$ needs to preserve positive semi-definiteness. Hence, we represent the Fourier modes as $\hat{\Theta}_k = \Gamma_k^T \Gamma_k$, where $\Gamma_k \in \mathbb{R}^{n \times n}$ is a lower-triangular matrix to be determined along with $\alpha \ge 0$.

For the fluctuation term $\mathbf{R}_t$, we represent it as a noise in the form of $\mathbf{R}_t = \phi_t^T \tilde{\mathbf{R}}(t)$, where $\tilde{\mathbf{R}}(t)$ is a Gaussian random process whose covariance function is determined by $\Theta(t)$, i.e., $\langle \tilde{\mathbf{R}}(t)\tilde{\mathbf{R}}(\tau)^T \rangle = k_B T\, \Theta(t - \tau)$. This choice avoids dealing with the orthogonal dynamics to calculate the fluctuation term. Furthermore, we can show that this choice enables the reduced model to retain a consistent invariant density function.

Proposition 3.2.1. For the reduced model (3.3) with $\Theta(t) = e^{-\alpha t} \sum_{k=0}^{N_\omega} \hat{\Theta}_k \cos(\omega_k t)$, by choosing the fluctuation term $\mathbf{R}_t = \phi_t^T \tilde{\mathbf{R}}(t)$, where $\tilde{\mathbf{R}}(t)$ is a Gaussian random process satisfying

$$\langle \tilde{\mathbf{R}}(t)\tilde{\mathbf{R}}(\tau)^T \rangle = k_B T\, \Theta(t - \tau), \tag{3.5}$$

the reduced model has an invariant distribution

$$\rho_{\mathrm{eq}}(\mathbf{q}, \mathbf{p}) \propto \exp\left\{ -\left[ U(\mathbf{q}) + \mathbf{p}^T \mathbf{M}^{-1} \mathbf{p}/2 \right]/k_B T \right\}. \tag{3.6}$$

Proof. Let us introduce auxiliary variables

$$\begin{aligned} \mathbf{z}_{k,1} &= -\int_0^t e^{-\alpha(t-\tau)}\, \Gamma_k \cos(\omega_k(t - \tau))\, \phi_\tau \mathbf{v}(\tau)\,\mathrm{d}\tau + \mathbf{R}_{k,1}(t), \\ \mathbf{z}_{k,2} &= \int_0^t e^{-\alpha(t-\tau)}\, \Gamma_k \sin(\omega_k(t - \tau))\, \phi_\tau \mathbf{v}(\tau)\,\mathrm{d}\tau + \mathbf{R}_{k,2}(t), \end{aligned} \tag{3.7}$$

where $\Gamma_k^T \Gamma_k = \hat{\Theta}_k$ and $\mathbf{R}_{k,1}(t)$ is a Gaussian random process satisfying

$$\left\langle \mathbf{R}_{j,1}(t)\mathbf{R}_{k,1}(\tau)^T \right\rangle = k_B T\, \delta_{jk}\, e^{-\alpha(t-\tau)} \cos(\omega_k(t - \tau)), \tag{3.8}$$

where $\delta_{jk}$ is the Kronecker delta and $\mathbf{R}_{j,2}(t)$ will be specified later. Accordingly, the second equation of Eq. (3.3) can be written as

$$\dot{\mathbf{p}} = -\nabla U(\mathbf{q}) + \phi(\mathbf{q})^T \sum_{k} \Gamma_k^T \mathbf{z}_{k,1}. \tag{3.9}$$

Let $\mathbf{z}_k = [\mathbf{z}_{k,1}, \mathbf{z}_{k,2}]$ and $\mathbf{R}_k = [\mathbf{R}_{k,1}, \mathbf{R}_{k,2}]$, we can rewrite Eq.
(3.7) as
\[
\begin{aligned}
\mathbf{z}_k &= -\int_0^t e^{-\alpha(t-\tau)}
\begin{pmatrix} \cos(\omega_k(t-\tau))I & \sin(\omega_k(t-\tau))I \\ -\sin(\omega_k(t-\tau))I & \cos(\omega_k(t-\tau))I \end{pmatrix}
\begin{pmatrix} \Gamma_k\phi_\tau\mathbf{v}(\tau) \\ 0 \end{pmatrix}\mathrm{d}\tau + \mathbf{R}_k(t) \\
&= -\int_0^t \exp\left[\begin{pmatrix} -\alpha I & \omega_k I \\ -\omega_k I & -\alpha I \end{pmatrix}(t-\tau)\right]
\begin{pmatrix} \Gamma_k\phi_\tau\mathbf{v}(\tau) \\ 0 \end{pmatrix}\mathrm{d}\tau + \mathbf{R}_k(t).
\end{aligned} \tag{3.10}
\]
By taking the time derivative of Eq. (3.10) with respect to $t$, we have
\[
\frac{\mathrm{d}\mathbf{z}_k}{\mathrm{d}t} =
\underbrace{\begin{pmatrix} -\alpha I & \omega_k I \\ -\omega_k I & -\alpha I \end{pmatrix}}_{\triangleq\,\mathbf{J}}\mathbf{z}_k
- \begin{pmatrix} \Gamma_k\phi_t\mathbf{v}(t) \\ 0 \end{pmatrix}
+ \frac{\mathrm{d}\mathbf{R}_k}{\mathrm{d}t} - \mathbf{J}\mathbf{R}_k(t). \tag{3.11}
\]
Furthermore, we note that $\mathbf{R}_k(t)$ can be modeled as a generalized Ornstein-Uhlenbeck process and $\frac{\mathrm{d}\mathbf{R}_k}{\mathrm{d}t} - \mathbf{J}\mathbf{R}_k(t)$ can be represented by
\[
\frac{\mathrm{d}\mathbf{R}_k}{\mathrm{d}t} - \mathbf{J}\mathbf{R}_k(t) = \Lambda_k\dot{\mathbf{W}}_{k,t}, \tag{3.12}
\]
where $\dot{\mathbf{W}}_{k,t}$ is the standard white noise and $\Lambda_k\Lambda_k^T = -k_BT(\mathbf{J} + \mathbf{J}^T)$. With this choice, the covariance of $\mathbf{R}_k(t) = [\mathbf{R}_{k,1}, \mathbf{R}_{k,2}]$ is given by
\[
\left\langle\mathbf{R}_k(t)\mathbf{R}_k(\tau)^T\right\rangle = k_BT\,e^{-\alpha(t-\tau)}
\begin{pmatrix} \cos(\omega_k(t-\tau))I & \sin(\omega_k(t-\tau))I \\ -\sin(\omega_k(t-\tau))I & \cos(\omega_k(t-\tau))I \end{pmatrix}
\]
such that Eq. (3.8) remains valid. Using Eqs.
(3.7), (3.9), and (3.11), we can write the reduced model (3.3) in the form of
\[
\frac{\mathrm{d}}{\mathrm{d}t}
\begin{pmatrix} \mathbf{q} \\ \mathbf{p} \\ \vdots \\ \mathbf{z}_{k,1} \\ \mathbf{z}_{k,2} \\ \vdots \end{pmatrix}
=
\begin{pmatrix}
0 & I & \cdots & 0 & 0 & \cdots \\
-I & 0 & \cdots & \phi(\mathbf{q})^T\Gamma_k^T & 0 & \cdots \\
\vdots & \vdots & \ddots & \vdots & \vdots & \\
0 & -\Gamma_k\phi(\mathbf{q}) & \cdots & -\alpha I & \omega_k I & \cdots \\
0 & 0 & \cdots & -\omega_k I & -\alpha I & \cdots \\
\vdots & \vdots & & \vdots & \vdots & \ddots
\end{pmatrix}
\begin{pmatrix} \nabla U(\mathbf{q}) \\ \mathbf{v} \\ \vdots \\ \mathbf{z}_{k,1} \\ \mathbf{z}_{k,2} \\ \vdots \end{pmatrix}
+
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ \Lambda_k\dot{\mathbf{W}}_{k,t} \\ \vdots \end{pmatrix}
\triangleq \mathbf{K}\nabla F(\mathbf{q}, \mathbf{p}, \cdots, \mathbf{z}_{k,1}, \mathbf{z}_{k,2}, \cdots) + \Lambda\dot{\mathbf{W}}_t, \tag{3.13}
\]
where $\mathbf{K}$ is the first matrix on the right-hand side of Eq. (3.13), $F(\mathbf{q}, \mathbf{p}, \cdots, \mathbf{z}_{k,1}, \mathbf{z}_{k,2}, \cdots) = U(\mathbf{q}) + \frac{1}{2}\mathbf{p}^T\mathbf{M}^{-1}\mathbf{p} + \frac{1}{2}\sum_{k=1}^{N_\omega}\left(\mathbf{z}_{k,1}^T\mathbf{z}_{k,1} + \mathbf{z}_{k,2}^T\mathbf{z}_{k,2}\right)$ is the total free energy of the extended system, and $\Lambda = \mathrm{diag}(0, 0, \cdots, \Lambda_k, \cdots)$.
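As a numerical sanity check on the noise model in Eq. (3.12), the sketch below builds the block matrix $\mathbf{J}$ for a small number of features, confirms that $\Lambda_k\Lambda_k^T = -k_BT(\mathbf{J}+\mathbf{J}^T)$ reduces to $2\alpha k_BT\,I$, and verifies that the stationary Ornstein-Uhlenbeck covariance $e^{\mathbf{J}t}\,k_BT\,I$ matches the rotation-type covariance stated after Eq. (3.12). The parameter values and the helper `expm_series` are assumptions made for illustration only.

```python
import numpy as np

def expm_series(A, n_terms=30):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    norm = max(np.linalg.norm(A, 1), 1e-16)
    s = max(0, int(np.ceil(np.log2(norm))) + 1)
    B = A / 2.0**s
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, n_terms):
        term = term @ B / k
        out = out + term
    for _ in range(s):
        out = out @ out
    return out

# Hypothetical parameters: n features, one Fourier mode with frequency omega.
kBT, alpha, omega, n = 1.0, 0.3, 1.7, 2
I = np.eye(n)
J = np.block([[-alpha * I, omega * I], [-omega * I, -alpha * I]])

# Fluctuation-dissipation relation of Eq. (3.12).
LLt = -kBT * (J + J.T)                          # equals 2 * alpha * kBT * identity
Lam = np.sqrt(2.0 * alpha * kBT) * np.eye(2 * n)

# The stationary covariance Sigma = kBT * I solves J Sigma + Sigma J^T + Lam Lam^T = 0.
Sigma = kBT * np.eye(2 * n)
lyapunov_residual = J @ Sigma + Sigma @ J.T + Lam @ Lam.T

# Two-time covariance <R(t) R(0)^T> = exp(J t) Sigma versus the closed form.
t = 0.8
cov_ou = expm_series(J * t) @ Sigma
c, s = np.cos(omega * t), np.sin(omega * t)
cov_closed = kBT * np.exp(-alpha * t) * np.block([[c * I, s * I], [-s * I, c * I]])
```

The upper-left block of `cov_closed` is the damped cosine required by Eq. (3.8).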
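Using (3.12), it is easy to show $\Lambda\Lambda^T = -k_BT(\mathbf{K} + \mathbf{K}^T)$. Therefore, the gradient system (3.13) (i.e., the reduced model (3.3)) has the invariant density function $\rho_{\rm eq}(\mathbf{q}, \mathbf{p}, \mathbf{z}) \propto \exp[-F(\mathbf{q}, \mathbf{p}, \mathbf{z})/k_BT]$.

3.2.2 Training Method of the Reduced Model

To learn the reduced model (3.3), we need to choose appropriate metrics such that the state-dependent non-Markovian nature can be manifested. While auto-correlations such as $c_{vv}(t) = \langle\mathbf{v}(t)\mathbf{v}(0)^T\rangle$ merely characterize the overall memory effect, a crucial observation is that the correlation conditioned on different initial states $\mathbf{q}_0$ further depends on the local energy dissipation and therefore naturally encodes the signatures of the heterogeneous memory effect. Accordingly, we right-multiply the second equation of (3.3) by $\mathbf{v}(0)$ and take the conditional expectation on $\mathbf{q}_0 = \mathbf{q}^*$, i.e.,
\[
\begin{aligned}
\mathbf{g}(t; \mathbf{q}^*) &= \int_0^t \left\langle\phi_t^T\Theta_{t-\tau}\phi_\tau\mathbf{v}_\tau\mathbf{v}_0^T \,\middle|\, \mathbf{q}_0 = \mathbf{q}^*\right\rangle\mathrm{d}\tau \\
&= \int_0^t \left\langle\mathrm{Tr}\left[\Theta_{t-\tau}\mathbf{H}\psi_\tau\mathbf{v}_\tau\mathbf{v}_0^T\psi_t^T\mathbf{H}^T\right] \,\middle|\, \mathbf{q}_0 = \mathbf{q}^*\right\rangle\mathrm{d}\tau \\
&= \int_0^t \mathrm{Tr}\left[\Theta_{t-\tau}\mathbf{H}\mathbf{C}_{\psi,\psi}(t, \tau; \mathbf{q}^*)\mathbf{H}^T\right]\mathrm{d}\tau,
\end{aligned}
\]
where $\mathbf{g}(t; \mathbf{q}^*) := \left\langle[\dot{\mathbf{p}}_t + \nabla U(\mathbf{q}_t)]\mathbf{v}_0^T \,\middle|\, \mathbf{q}_0 = \mathbf{q}^*\right\rangle$ and $\mathbf{C}_{\psi,\psi}(t, \tau; \mathbf{q}^*) := \left\langle\psi_\tau\mathbf{v}_\tau\mathbf{v}_0^T\psi_t^T \,\middle|\, \mathbf{q}_0 = \mathbf{q}^*\right\rangle$ is a three-point correlation characterizing the coupling among the bases. Since $\psi(\mathbf{q})$ is sparse, $\psi_\tau\psi_t^T$ can be evaluated with $O(1)$ complexity and hence $\mathbf{C}_{\psi,\psi}(t, \tau; \mathbf{q}^*)$ can be efficiently pre-computed. Accordingly, we can train the reduced model in terms of the coefficients $\mathbf{H}$ for the encoders $\phi(\mathbf{q})$ as well as the matrices $\{\Gamma_k\}_{k=1}^{N_\omega}$ and $\alpha$ for the kernel $\Theta(t)$ by minimizing the empirical loss
\[
L = \sum_{l=1}^{N_q}\sum_{k=1}^{N_t}\left\|\tilde{\mathbf{g}}(t_k; \mathbf{q}^{(l)}) - \mathbf{g}(t_k; \mathbf{q}^{(l)})\right\|^2, \qquad
\tilde{\mathbf{g}}(t_k; \mathbf{q}^{(l)}) = \sum_{j=1}^{k}\mathrm{Tr}\left[\Theta(t_k - t_j)\mathbf{H}\mathbf{C}_{\psi,\psi}(t_k, t_j; \mathbf{q}^{(l)})\mathbf{H}^T\right]\delta t, \tag{3.14}
\]
where $\{\mathbf{q}^{(l)}\}_{l=1}^{N_q}$ represent configuration samples within the phase space.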
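The discrete prediction $\tilde{\mathbf{g}}$ in Eq. (3.14) can be sketched as follows for a scalar resolved variable and a single initial state. This is an illustrative sketch under stated assumptions, not the implementation used in this work: the array layout, the zero-based time index (the sum includes the zero-lag term), the synthetic random correlation `C`, and all numerical values are hypothetical, and the kernel follows the parameterization $\hat{\Theta}_k = \Gamma_k\Gamma_k^T$ of Eq. (3.4).

```python
import numpy as np

def kernel(t, alpha, Gammas, omegas):
    """Theta(t) = exp(-alpha t) * sum_k Gamma_k Gamma_k^T cos(omega_k t), cf. Eq. (3.4)."""
    modes = np.einsum("kab,kcb->kac", Gammas, Gammas)          # Gamma_k Gamma_k^T
    return np.exp(-alpha * t) * np.einsum("k,kac->ac", np.cos(omegas * t), modes)

def predict_g(Theta_lag, H, C, dt):
    """g~(t_k) = sum_{j<=k} Tr[Theta(t_k - t_j) H C(t_k, t_j) H^T] dt.

    Theta_lag: (Nt, n, n) kernel on the lag grid, H: (n, Nb), C: (Nt, Nt, Nb, Nb).
    """
    Nt = Theta_lag.shape[0]
    HCHt = np.einsum("ab,kjbc,dc->kjad", H, C, H)              # H C(t_k, t_j) H^T
    g = np.zeros(Nt)
    for k in range(Nt):
        for j in range(k + 1):
            g[k] += np.trace(Theta_lag[k - j] @ HCHt[k, j]) * dt
    return g

def loss(g_pred, g_ref):
    """Squared mismatch summed over the time grid (one initial state)."""
    return float(np.sum((g_pred - g_ref) ** 2))

# Synthetic stand-ins for the trainable variables and the pre-computed correlation.
rng = np.random.default_rng(0)
n, Nb, Nt, Nw, dt, Tc, alpha = 2, 3, 6, 4, 0.1, 1.0, 0.5
Gammas = np.tril(rng.normal(size=(Nw, n, n)))                  # lower-triangular factors
omegas = 2.0 * np.pi * np.arange(Nw) / Tc
H = rng.normal(size=(n, Nb))
C = rng.normal(size=(Nt, Nt, Nb, Nb))

Theta_lag = np.stack([kernel(k * dt, alpha, Gammas, omegas) for k in range(Nt)])
g_pred = predict_g(Theta_lag, H, C, dt)
g_ref = g_pred + 0.01                                          # artificial reference data
L = loss(g_pred, g_ref)
```

In a training loop, `H`, `Gammas`, and `alpha` would be the optimized variables, and the double loop over `k` and `j` would typically be replaced by an FFT-based convolution.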
For systems with pronounced free energy barriers, $\mathbf{q}^{(l)}$ can be collected along with the free energy construction (Maragliano and Vanden-Eijnden, 2008), whereas the conditional correlations for each $\mathbf{q}^{(l)}$ need to be sampled from unbiased equilibrium trajectories. $\tilde{\mathbf{g}}(\cdot)$ represents the prediction by the reduced model, which depends on the trainable variables and the pre-computed correlation $\mathbf{C}_{\psi,\psi}$. $\delta t$ is the time step. Besides the conditional correlation functions, we can also introduce a loss function with respect to the overall correlation function, i.e.,
\[
L_2 = \sum_{k=1}^{N_t}\left\|\tilde{\mathbf{g}}_2(t_k) - \mathbf{g}_2(t_k)\right\|^2, \qquad
\tilde{\mathbf{g}}_2(t_k) = \sum_{j=1}^{k}\mathrm{Tr}\left[\Theta(t_k - t_j)\mathbf{H}\mathbf{C}_{\psi,\psi}(t_k, t_j)\mathbf{H}^T\right]\delta t,
\]
where $\mathbf{g}_2(t) = \left\langle[\dot{\mathbf{p}}_t + \nabla U(\mathbf{q}_t)]\mathbf{v}_0^T\right\rangle$ is the overall correlation and $\mathbf{C}_{\psi,\psi}(t, \tau) = \left\langle\psi_\tau\mathbf{v}_\tau\mathbf{v}_0^T\psi_t^T\right\rangle$. In particular, if there is scale separation between $c_{vv}(t)$ and $c_{qq}(t)$ (e.g., $c_{vv}(t)$ decays much faster than $c_{qq}(t)$; see Fig. 3.4 and Fig. 3.5), we may approximate $\mathbf{C}_{\psi,\psi}(t, \tau)$ by two-point correlations, i.e., $\mathbf{C}_{\psi,\psi}(t, \tau) \approx \left\langle\psi_\tau\otimes\psi_t^T\right\rangle : \left\langle\mathbf{v}_\tau\mathbf{v}_0^T\right\rangle$.

Efficient training is achieved by using the following numerical methods to evaluate $\tilde{\mathbf{g}}$ and $\tilde{\mathbf{g}}_2$. Specifically, $\psi_\tau\psi_t^T$ can be efficiently pre-computed with $O(1)$ complexity by using the sparse piecewise linear basis functions. Furthermore, we can use a low-rank representation (e.g., based on the singular value decomposition) of the conditional correlation $\mathbf{C}_{\psi,\psi}(t, \tau; \mathbf{q}^*)$ and the overall correlation $\mathbf{C}_{\psi,\psi}(t, \tau)$ to accelerate the matrix product $\mathbf{H}\mathbf{C}_{\psi,\psi}\mathbf{H}^T$. In addition, the convolution on index $j$ can be efficiently evaluated by the Fast Fourier Transform algorithm (Cooley and Tukey, 1965). For the present study, $\psi$ is chosen as the uniform piecewise linear basis function defined on $[2.8, 4.1]$ with $N_b = 66$, $T_c = 200$ and $N_\omega = 2000$. We use $2\times10^6$ short trajectories extracted from the full MD simulation (see Sec.
3.3.1 for details), each consisting of 300 time-series samples, to sample the correlation for $\mathbf{q}^*$ at the saddle point. We use $2\times10^6$ to $8\times10^6$ short trajectories for the other individual points.

While $L_2$ alone is insufficient to characterize the emergence of the state-dependent memory, it serves as a necessary condition and can facilitate the learning of the reduced model. In practice, we can use both loss functions to train the reduced model with $N_q = 65$, $N_t = 300$ for $L$, and $N_t = 30000$ for $L_2$. Specifically, the training is conducted by the Adam (Kingma and Ba, 2015) optimization method in three stages with 2000, 6000, and 6000 steps, respectively. For the first stage, we only use $L_2$ to train the model with a constant learning rate of 0.04. For each step of the following two stages, 16 initial states (i.e., $q^{(l)}$) are randomly selected as one training batch to evaluate the total loss $L_t = L + L_2$. For both stages, the initial learning rate is $1\times10^{-2}$ and the exponential decay rate is 0.9 per 150 steps. In practice, $N_b$ and $N_q$ can be further reduced. As shown in the GitHub repository, with $N_b = 8$, $N_q = 26$, the constructed reduced model can accurately recover the MD results.

3.2.3 Simulation of the Reduced Model

We simulate the reduced model (3.3) on $t \in [0, T]$ by generating the noise term $\mathbf{R}_t = \phi_t^T\tilde{\mathbf{R}}(t)$, where $\tilde{\mathbf{R}} : \mathbb{R}^{+} \to \mathbb{R}^n$ is a Gaussian random process, using the constructed Fourier modes rather than Markovian embedding (e.g., see Ref. (Ayaz et al., 2022b)). Specifically, we have proved that by choosing $\langle\tilde{\mathbf{R}}(t)\tilde{\mathbf{R}}(\tau)^T\rangle = k_BT\,\Theta(t-\tau)$, the reduced model retains a consistent equilibrium density, i.e., $\rho_{\rm eq}(\mathbf{q}, \mathbf{p}) \propto \exp\left\{-\beta\left[U(\mathbf{q}) + \frac{1}{2}\mathbf{p}^T\mathbf{M}^{-1}\mathbf{p}\right]\right\}$ (Prop. 3.2.1). Accordingly, we can generate $\{\tilde{\mathbf{R}}(t_i)\}_{i=0}^{N}$ similar to Refs.
(Berkowitz et al., 1983; Ogorodnikov and Prigarin, 1996) by
\[
\tilde{\mathbf{R}}(t_i) = \beta^{-1/2}\sum_{k=0}^{2N}\tilde{\Theta}_k^{1/2}\left[\cos(\omega_k t_i)\xi_k + \sin(\omega_k t_i)\eta_k\right], \tag{3.15}
\]
where $\tilde{\Theta}_k$ are the Fourier (essentially cosine) modes of $\Theta(|t|)$ on $[-T, T]$ (Berkowitz et al., 1983; Ogorodnikov and Prigarin, 1996); $\xi_k$ and $\eta_k$ are independent Gaussian random vectors. Specifically, for large simulation time $T$, the Fourier modes $\tilde{\Theta}_k$ are given by
\[
\begin{aligned}
\tilde{\Theta}_k &= \sum_{j=1}^{N_\omega}\int_{-T}^{T} e^{-\alpha|t|}\hat{\Theta}_j\cos(\omega_j t)\cos(\omega_k t)\,\mathrm{d}t \\
&= \sum_{j=1}^{N_\omega}\int_{0}^{T} e^{-\alpha t}\hat{\Theta}_j\left(\cos\left((\omega_j - \omega_k)t\right) + \cos\left((\omega_j + \omega_k)t\right)\right)\mathrm{d}t \\
&\approx \sum_{j=1}^{N_\omega}\left(\frac{\alpha\hat{\Theta}_j}{\alpha^2 + (\omega_j - \omega_k)^2} + \frac{\alpha\hat{\Theta}_j}{\alpha^2 + (\omega_j + \omega_k)^2}\right).
\end{aligned} \tag{3.16}
\]
Therefore $\tilde{\mathbf{R}}(t)$ can be generated using the Fast Fourier Transform algorithm (Cooley and Tukey, 1965) with $O(N\log N)$ complexity. Also, the convolution term $\int_0^t\phi_t^T\Theta(t-\tau)\phi_\tau\mathbf{v}(\tau)\,\mathrm{d}\tau$ in Eq. (3.3) can be computed using the fast convolution method developed in Ref. (Schädle et al., 2006) with $O(N\log N)$ complexity.

3.3 Numerical Results

The present model enables us to simulate the reduced dynamics beyond the standard GLE and systematically investigate open problems such as the state-dependent memory effect on the collective dynamics of complex systems, including molecule kinetics.

3.3.1 Full Atomistic Model

In this work, we consider the full micro-scale model of benzyl bromide (see Fig. 3.1 for a sketch of the molecule structure) in an aqueous environment. The general AMBER force field (Wang et al., 2004) is used for the benzyl bromide molecule and the partial charges of the molecule's atoms are set by the restrained electrostatic potential (RESP) approach (Bayly et al., 1993). The rigid TIP3P water model (Jorgensen et al., 1983) is used for the water molecules and the bond lengths and angles are held constant through the SHAKE algorithm (Ryckaert et al., 1977; Miyamoto and Kollman Peter, 2004).
Long-range electrostatic interactions are calculated using a Particle Mesh Ewald summation with a relative error set to $10^{-4}$. The full system consists of one benzyl bromide molecule and 2400 water molecules with the periodic boundary condition imposed along each direction. The isothermal-isobaric thermostat (Martyna et al., 1994) is used to equilibrate the system for 16 ns at 298 K and 1 bar using a time step of 1 fs. Following the equilibration, the box size is scaled to be near $41.5\times41.5\times41.5$ Å$^3$. The simulation is run for a production period of 1.5 $\mu$s in a canonical ensemble with a Nosé-Hoover thermostat (Nosé, 1984; Hoover, 1985). The numerical results of this work are presented in Å for length, picoseconds for time, and grams per mole for mass.

The resolved variable $q$ is defined as the distance between the bromine atom and the ipso-carbon atom. The free energy is obtained from the probability density function $\rho(q)$ (see the inset plot of Fig. 3.3(a)), i.e., $U(q) = -k_BT\ln\rho(q)$, where $\rho(q)$ is directly obtained from the full MD samples using the kernel density estimation. To verify the accuracy of the constructed $U(q)$, we calculate the expectation of $q\nabla U(q)$ on the sample. The numerical result gives $0.996\,k_BT$ and is close to the theoretical prediction $\langle q\nabla U(q)\rangle = \int q\nabla U(q)e^{-U(q)/k_BT}\,\mathrm{d}q \equiv k_BT$.

Figure 3.1 Left: A sketch of the benzyl bromide molecule. The resolved variable is defined as the distance between the bromine atom and the ipso-carbon atom. Right: The free energy of the resolved variable $q$. The error bars represent the 95% confidence interval.

3.3.2 Limitation of the Standard GLE

Let us start with the standard GLE by setting the features $\phi(\mathbf{q}) \equiv \mathbf{I}$ in Eq. (3.3), which captures the dynamics on the resolved scale considered in Refs. (Ayaz et al., 2021; Dalton et al., 2023). We right-multiply Eq. (3.3) by $\mathbf{q}(0)$ and compute the correlation functions, i.e.,
\[
h(t) = \int_0^t\Theta(t-\tau)c_{vq}(\tau)\,\mathrm{d}\tau,
\]
where $h(t) = \left\langle[\dot{\mathbf{p}}_t + \nabla U(\mathbf{q}_t)]\mathbf{q}_0^T\right\rangle$.
The standard GLE kernel $\Theta(t)$ (i.e., $\mathbf{K}(t)$ in Eq. (3.1)) can be obtained using the Fourier transform of the integral equation. If the reduced dynamics (3.1) can be simplified as the standard GLE, then $c_{vq}(t)$ should be accurately reproduced. Fig. 3.2 shows the prediction of $c_{vq}(t)$ from the standard GLE and the full MD model. The apparent deviations imply non-negligible state-dependency. To further probe this effect, we compute $h''(t; \mathbf{q}^*) = g'(t; \mathbf{q}^*)$ conditioned on different initial states $\mathbf{q}^*$. Unlike the unified short-time correlation (i.e., $g'(0; q^*) = -k_BT\,\Theta(0)/m$) predicted by the standard GLE, the large dispersion reveals the heterogeneous nature of the energy dissipation process.

Figure 3.2 Correlation functions predicted by the standard GLE and the full MD: (a) overall $c_{vq}(t)$ and (b) $-g'(t; \mathbf{q}^*)$ conditioned on $\mathbf{q}^*$ representing various initial states (gray lines), including two local minima and the saddle point (see inset of Fig. 3.3(a)). The large dispersion implies the limitation of the standard GLE, which predicts a single curve at short times.

3.3.3 GLE with State-dependent Memory

To capture the state-dependent memory, we train the present model (3.3) with different numbers of features. Fig. 3.3(a-b) shows the obtained encoder $\phi(\cdot)$ using one feature, where $\Theta(t)$ is scaled such that $\Theta(0) = 1$. $\phi$ exhibits apparent deviation from a uniform distribution. In particular, it shows a peak value near the saddle point $q = 3.65$, implying a larger effective friction near this regime. This result supports a similar assumption in semi-analytical studies (Straus et al., 1993) on improving Kramers' theory (Kramers, 1940). Also, it explains the short-time dispersion shown in Fig. 3.2, where $g'(t; \mathbf{q}^*)$ at the saddle point is significantly larger than at the local minima. Fig. 3.3(c-d) shows the obtained encoders $\{\phi_i(\cdot)\}_{i=1}^{n}$ with $n = 4$ features and the diagonal components of $\Theta(t)$.
Compared with the case of $n = 1$, the larger variation of $\phi_i$ enables a better representation of the state-dependent memory.

Figure 3.3 The state features $\phi$ and diagonal components of the matrix-valued kernel $\Theta(t)$ for the present model with state-dependent memory (SD-GLE) trained using (a-b) one feature and (c-d) four features. Inset plots: (a) probability density function (PDF) of $q$, where $\phi(q)$ near the saddle point shows a peak; (b) Fourier modes of $\Theta(t)$.

Next, we examine the conditional correlations $c_{vq}(t; \mathbf{q}^*)$ and $c_{vv}(t; \mathbf{q}^*)$. As shown in Fig. 3.4, for both local minima and the saddle point, the predictions of the present model using four features agree well with the MD results. In contrast, the predictions of the standard GLE show apparent deviations for $\mathbf{q}^*$ at the saddle point. Also, the present model using four features with a diagonal $\Theta(t)$ (see Section 3.5.3) shows improved short-time predictions but remains insufficient for long-time correlations. This indicates the complex global variation of the memory term, which cannot be represented by a simple state-dependent re-scaling of a kernel function; the non-Markovian coupling among multiple features is crucial to capture the heterogeneous energy dissipation over the full space.

Finally, we examine the collective behavior related to molecule kinetics. Fig. 3.5(a) shows the position correlation $c_{qq}(t)$ characterizing the molecule conformation relaxation. Compared with the MD results, the standard GLE shows a significant underestimation of the relaxation time. This discrepancy is possibly due to the larger effective friction near the saddle point (see Fig. 3.3(a)), which essentially dampens the transition between the two local minima. The standard GLE overlooks such state-dependency and therefore yields a faster relaxation.
This limitation is consistently reflected in the distribution of the transition time. For this system, the free energy barrier is approximately $3.5\,k_BT$ (see Fig. 3.1 right); the transition time is obtained from the simulation trajectories of the MD and various reduced models. As shown in Fig. 3.5(b), the standard GLE predicts a larger probability for the short transition time, indicating a smaller overall friction than the local (i.e., saddle point) value. Fortunately, the heterogeneous non-Markovianity can be faithfully retained in the present model. In particular, the constructed model using one feature yields a better prediction than the standard GLE. As we increase to four features, the predictions recover the MD results.

Figure 3.4 Overall and conditional correlation functions predicted by the full MD and various reduced models for two local minima and the saddle point: (a-b) $c_{vq}$ and (c-d) $c_{vv}$. Shaded regimes represent the 95% confidence interval; same for Fig. 3.5.

Figure 3.5 Collective molecule behaviors predicted by the full MD and the various reduced models: (a) overall conformation relaxation and (b) distribution of the transition time between the two local minima.

3.4 Summary

In summary, to plan an optimal hiking trail on a mountain, a landscape map is generally insufficient; the local path roughness needs to be properly considered. Similarly, to predict the reduced dynamics of a multi-scale system, the state-dependent memory may need to be modeled to account for the heterogeneous energy dissipation arising from the unresolved dynamics, which, however, has been broadly overlooked. While the crucial role of the non-Markovian effect that complements the conservative free energy has been gradually recognized, the formulation of the memory term remains largely empirical (e.g., the standard GLE).
The current work focuses on this caveat and presents a data-driven approach to learning such a stochastic reduced model beyond the standard GLE, where the complex state-dependent memory can be naturally encoded in the non-Markovian interactions among a set of features in terms of the resolved variables. The training does not rely on explicit knowledge of the full model and only utilizes the trajectory samples, where the three-point correlations can be efficiently pre-computed. Numerical results of a molecule system demonstrate the crucial role of the state-dependent non-Markovianity in collective behavior, where the standard GLE shows limitations due to the over-simplified assumption of a homogeneous memory kernel. In contrast, the present model accurately predicts the molecule kinetics including the transition time distribution, and provides a reliable approach to simulate stochastic reduced dynamics of multiscale problems that faithfully retains the collective behaviors and rare event properties (E and Vanden-Eijnden, 2010) beyond empirical models.

3.5 Other Details

3.5.1 Mass Matrix of the Reduced Model

In general, the mass matrix depends on the resolved variables, and we refer to Refs. (Ayaz et al., 2022a; Lee et al., 2019) for further discussions and the reduced dynamics with position-dependent mass. However, in this study we focus on the effect of the state-dependent non-Markovian memory on the collective behavior of complex systems. Therefore, we choose the coarse-grained resolved variables such that the corresponding mass matrix is a constant. Specifically, we define $q = \|\mathbf{Q}_1 - \mathbf{Q}_2\|$, where $\mathbf{Q}_1$ and $\mathbf{Q}_2$ are the atom coordinates of the full model (see Fig. 3.1 and Sec. 3.3.1 for details). Accordingly, with $\mathbf{Q}_{12} = \mathbf{Q}_1 - \mathbf{Q}_2$, we have $\dot{q} = \mathbf{Q}_{12}^T\dot{\mathbf{Q}}_{12}/q$ and its covariance follows
\[
\begin{aligned}
\langle\dot{q}\dot{q}\rangle &= \left\langle\frac{1}{q^2}\mathbf{Q}_{12}^T\dot{\mathbf{Q}}_{12}\dot{\mathbf{Q}}_{12}^T\mathbf{Q}_{12}\right\rangle \\
&= \left\langle\frac{1}{q^2}\mathrm{Tr}\left[(\mathbf{Q}_{12}\mathbf{Q}_{12}^T)(\dot{\mathbf{Q}}_{12}\dot{\mathbf{Q}}_{12}^T)\right]\right\rangle \\
&= \left\langle\frac{1}{q^2}\mathrm{Tr}(\mathbf{Q}_{12}\mathbf{Q}_{12}^T)\right\rangle\left(M_1^{-1} + M_2^{-1}\right)k_BT \\
&= \left(M_1^{-1} + M_2^{-1}\right)k_BT,
\end{aligned} \tag{3.17}
\]
where $M_1$ and $M_2$ represent the masses of the two atoms and we have used the fact that the distributions of $\mathbf{Q}_{12}$ and $\dot{\mathbf{Q}}_{12}$ are independent. Therefore, the mass matrix of $q$ is a constant $M \equiv M_1M_2/(M_1 + M_2)$.

3.5.2 Limitations of the Standard GLE near the Local Minima

Fig. 3.6 shows the predictions of the conditional correlations $c_{vq}(t, q^*)$ and $c_{vv}(t, q^*)$ for $q^*$ representing the two local minima. Similar to the results of the saddle point (see Fig. 3.5(b)), the predictions of the standard GLE show apparent deviations from the full MD results because the state-dependent nature of the memory is neglected. In contrast, the predictions of the present model with four features can accurately recover the MD predictions.

Figure 3.6 The conditional correlation functions $c_{vq}(t, q^*)$ and $c_{vv}(t, q^*)$ for the two local minima predicted by the full MD, the standard GLE, and the present model (SD-GLE) constructed using one and four spatio-features. Left: $q^* = 3.07$; Right: $q^* = 3.87$. The predictions by the standard GLE show apparent discrepancies with the full MD results. Shaded regimes represent the 95% confidence interval.

3.5.3 Other Forms of the State-dependent Memory Term

For comparison, we also consider other forms of the reduced model. In particular, we retain the encoders $\phi$ in Eq. (3.3) but set $\Theta(t)$ to be diagonal, i.e., we ignore the non-Markovian coupling among the different state features. The reduced model is trained using four features. Fig. 3.7 shows the conditional correlation $c_{vv}(t; q^*)$ obtained from the full MD and different reduced models.
The prediction of the constructed model (labeled by "SD-GLE-Diag") shows apparent deviations from the full MD result with only incremental improvement over the standard GLE. The large discrepancy reveals the complex state-dependent nature; the non-Markovian effect can be neither approximated by an ansatz like $\gamma(q)\theta(t)$ as a simple generalization/re-scaling of the initial value at $t = 0$, nor represented by the coupling with independent bath variables. Instead, the non-Markovian coupling among the various state-features retained in the present model plays a crucial role in accurately modeling the heterogeneous energy dissipation arising from the unresolved intramolecular interactions and reproducing the collective dynamics.

Figure 3.7 The conditional correlation functions $c_{vv}(t, q^*)$ for the saddle point predicted by the full MD, the standard GLE, and the reduced models constructed using four state-features with diagonal $\Theta(t)$ (SD-GLE-Diag) and full $\Theta(t)$ (SD-GLE). The large discrepancy between the SD-GLE-Diag model and the full MD results implies the complexity of the state-dependency of the memory term, which cannot be well represented by the coupling of independent bath variables. The non-Markovian interactions among the state-features are essential to capture the heterogeneous energy dissipation process.

Furthermore, we note that Ref. (Vroylandt and Monmarché, 2022) develops an efficient approach to compute the memory function based on the finite-rank approximation of Zwanzig's projection formalism. The method can be used to efficiently extract the state-dependent memory and probe the physical insights from the trajectory samples of the full MD model.
The present work focuses on training a reduced model with heterogeneous memory that enables generating coherent noise and conducting stochastic simulations. Specifically, the memory term takes the form $K(q(\tau), t-\tau)$ (with a simple change of variable $s = t-\tau$ following the notation) in Ref. (Vroylandt and Monmarché, 2022) and $\tilde{K}(q(t), q(\tau), t-\tau) = \phi(q(t))^T\Theta(t-\tau)\phi(q(\tau))$ in the present work. In particular, by setting $q(t) = q(\tau)$, the two forms of the memory term should give similar predictions. Following this argument, we use the method in Ref. (Vroylandt and Monmarché, 2022) to calculate $K(q(\tau), t-\tau)$ and compare it with $\tilde{K}(q(t), q(\tau), t-\tau)$ of the present model with $q(t) = q(\tau) = q^*$. Fig. 3.8 shows the obtained memory functions for $q^*$ taken at the saddle point and the two local minima. The predictions by the two approaches show good agreement.

Figure 3.8 The conservative force and memory kernel obtained from Ref. (Vroylandt and Monmarché, 2022) (labeled as Volterra Basis) and the present model. The memory term takes the form $K(q(\tau), t-\tau)$ and $\tilde{K}(q(t), q(\tau), t-\tau) = \phi(q(t))^T\Theta(t-\tau)\phi(q(\tau))$, respectively. Specifically, we set $q(t) = q(\tau) = q^*$; the predictions of the two models show good agreement for the saddle point and the two local minima.

3.5.4 Generalization of the Present Reduced Model Formulation

So far, we have constructed the reduced model (3.3) by assuming the matrix-valued kernel $\Theta(t)$ is symmetric. In fact, this form can be generalized by introducing an anti-symmetric part, i.e.,
\[
\Theta(t) = e^{-\alpha t}\sum_{k=0}^{N_\omega}\left[\left(\Gamma_{k,1}^T\Gamma_{k,1} + \Gamma_{k,2}^T\Gamma_{k,2}\right)\cos(\omega_k t) + \left(\Gamma_{k,1}^T\Gamma_{k,2} - \Gamma_{k,2}^T\Gamma_{k,1}\right)\sin(\omega_k t)\right], \tag{3.18}
\]
where $\Gamma_{k,1}$ and $\Gamma_{k,2}$ are lower-triangular matrices representing the Fourier modes of $\Theta(t)$. The form is generally non-symmetric except for $t = 0$ and satisfies $\Theta(-t) = \Theta(t)^T$.
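The two stated properties of Eq. (3.18), symmetry at $t = 0$ and $\Theta(-t) = \Theta(t)^T$, can be checked numerically. The sketch below is illustrative only: the random lower-triangular factors, the frequencies, and the damping rate are hypothetical, and the extension to negative times uses $e^{-\alpha|t|}$, which is an assumption made here so that the damping is even in $t$.

```python
import numpy as np

def generalized_kernel(t, alpha, G1, G2, omegas):
    """Eq. (3.18) with exp(-alpha |t|) damping (assumed extension to t < 0).

    G1, G2: (Nw, n, n) lower-triangular factors, omegas: (Nw,) frequencies.
    """
    gram = lambda A, B: np.einsum("kba,kbc->kac", A, B)        # A_k^T B_k for each k
    sym = gram(G1, G1) + gram(G2, G2)                          # symmetric part
    asym = gram(G1, G2) - gram(G2, G1)                         # anti-symmetric part
    return np.exp(-alpha * abs(t)) * (
        np.einsum("k,kac->ac", np.cos(omegas * t), sym)
        + np.einsum("k,kac->ac", np.sin(omegas * t), asym)
    )

# Hypothetical sizes and parameters for the check.
rng = np.random.default_rng(1)
Nw, n, alpha, Tc = 5, 3, 0.4, 2.0
G1 = np.tril(rng.normal(size=(Nw, n, n)))
G2 = np.tril(rng.normal(size=(Nw, n, n)))
omegas = 2.0 * np.pi * np.arange(Nw) / Tc

t = 0.37
Theta_pos = generalized_kernel(t, alpha, G1, G2, omegas)
Theta_neg = generalized_kernel(-t, alpha, G1, G2, omegas)
Theta_zero = generalized_kernel(0.0, alpha, G1, G2, omegas)
```

At $t = 0$ only the symmetric part survives, and it is a sum of Gram matrices, so `Theta_zero` is also positive semi-definite.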
Similar to the symmetric form, we can model the fluctuation term $\mathbf{R}_t$ as a noise in the form of $\mathbf{R}_t = \phi_t^T\tilde{\mathbf{R}}(t)$, where $\tilde{\mathbf{R}}(t)$ is a Gaussian random process satisfying $\langle\tilde{\mathbf{R}}(t)\tilde{\mathbf{R}}(\tau)^T\rangle = k_BT\,e^{-\alpha(t-\tau)}\Theta(t-\tau)$. Similar to Prop. 3.2.1, we can show that this choice retains a consistent invariant density function.

In practice, we can generate the noise term $\tilde{\mathbf{R}}(t)$ on $[0, T]$ by
\[
\tilde{\mathbf{R}}(t) = \beta^{-1/2}\sum_{k=0}^{2N}\left[\tilde{\Theta}_{k,1}^{1/2}\cos(\omega_k t)\xi_k + \sin(\omega_k t)\left(Q_{1,k}^{1/2}\xi_k + Q_{2,k}^{1/2}\eta_k\right)\right],
\]
\[
Q_{1,k} = \tilde{\Theta}_{k,2}\tilde{\Theta}_{k,1}^{-1}\tilde{\Theta}_{k,2}^T, \qquad
Q_{2,k} = \tilde{\Theta}_{k,1} - \tilde{\Theta}_{k,2}\tilde{\Theta}_{k,1}^{-1}\tilde{\Theta}_{k,2}^T,
\]
where $\beta^{-1} = k_BT$, $\tilde{\Theta}_{k,1}$ and $\tilde{\Theta}_{k,2}$ are the Fourier cosine and sine modes on $[-T, T]$ with $\Theta(-t) = \Theta(t)^T$, $\xi_k$ and $\eta_k$ are independent Gaussian random vectors, and $N$ is the total number of simulation steps. Here $\tilde{\mathbf{R}}(t)$ can still be generated using the Fast Fourier Transform algorithm (Cooley and Tukey, 1965) with $O(N\log N)$ complexity. We will investigate this generalized formulation for model reduction in future studies.

CHAPTER 4 DEEP LEARNING-BASED NON-NEWTONIAN FLUID MODEL

In this chapter, we focus on a micro-to-macro model named DeePN2 (Fang et al., 2022). A long-standing problem in the modeling of non-Newtonian hydrodynamics of polymeric flows is the availability of reliable and interpretable hydrodynamic models that faithfully encode the underlying micro-scale polymer dynamics. The main complication arises from the long polymer relaxation time, the complex molecular structure, and the heterogeneous interaction.
DeePN2, a deep learning-based non-Newtonian hydrodynamic model, has been proposed and has shown some success in systematically passing the micro-scale structural mechanics information to the macro-scale hydrodynamics for suspensions with simple polymer conformation and bond potential. The model retains a multi-scaled nature by mapping the polymer configurations into a set of symmetry-preserving macro-scale features. The extended constitutive laws for these macro-scale features can be directly learned from the kinetics of their micro-scale counterparts. In this chapter, we develop DeePN2 using more complex micro-structural models. We show that DeePN2 can faithfully capture the broadly overlooked viscoelastic differences arising from the specific molecular structural mechanics without human intervention.

4.1 Introduction

Accurate modeling of non-Newtonian hydrodynamics plays a central role in the modeling of the transport, diffusion, and synthesis processes in many scientific and engineering applications. Unlike simple fluids, non-Newtonian fluids may exhibit enormously complex flow behavior as a result of the micro-scale polymer dynamics. In particular, the polymer relaxation time often becomes comparable to the hydrodynamic time scale. As a result, the macro-scale fluid evolution cannot be uniquely determined by the instantaneous flow field and the memory effect is generally important. To close the hydrodynamic equations, existing models are primarily based on the following two approaches. The first approach relies on empirical constitutive models (Larson, 1988; Owens and Phillips, 2002). Notable examples include the Hookean model (Oldroyd and Wilson, 1950; Lin et al., 2005), the FENE-P model (Peterlin, 1966; Bird et al., 1980), the Giesekus model (Giesekus, 1982), and the Phan-Thien and Tanner models (Thien and Tanner, 1977). Despite their popularity, the accuracy of these models is almost always in doubt.
The second approach resorts to various sophisticated micro-macro coupling algorithms, e.g., by directly solving the Fokker-Planck equation, or sampling the polymer configuration via micro-scale simulations (Laso and Öttinger, 1993; Hulsen et al., 1997; Ren and E, 2005). While the effects of the polymer interaction can be carried over to the macro-scale model, the computational cost can be exceedingly large because the micro-scale description is retained. Methods based on asymptotic analysis (Warner, 1972a) or the direct fitting of the strain-stress relationship (Zhao et al., 2018) are limited to simple flows such as the steady flow. Several semi-analytical approaches have been proposed (Grosso et al., 2000; Feng et al., 1998; Wang, 1997; Forest et al., 2003; Lielens et al., 1999; Yu et al., 2005; Hyon et al., 2008) using moment closure to approximate the micro-scale polymer configuration probability density function (PDF) and to derive the constitutive equations for the FENE dumbbell solution (Lielens et al., 1999; Yu et al., 2005; Hyon et al., 2008). However, these approaches are all based on a restricted ansatz for the PDF and therefore are not reliable for more general flow regimes.

To construct truly reliable and interpretable hydrodynamic models with molecular-level fidelity, it is essential to be able to efficiently code the information from the micro-scale interaction into the macro-scale transport equations. Ideally, the construction should meet the following requirements:
• be interpretable;
• be reliable: it should be accurate for all kinds of practical situations that one might encounter;
• respect physical constraints, including symmetries and conservation laws;
• be numerically robust and efficient.
As a first step towards constructing models that meet these requirements, we developed a machine learning-based approach (Lei et al., 2020), “deep learning-based non-Newtonian hydrodynamic model” or DeePN2, that learns the non-Newtonian hydrodynamic model from the underlying micro-scale description of the dumbbell solution. Rather than approximating the closure with standard moments, DeePN2 finds a set of encoders, i.e., a set of macro-scale features that best represent the micro-scale dumbbell structure. It also finds accurate closed-form equations for these macro-scale features. The constructed model retains a clear physical interpretation and accurately captures the nonlinear viscoelastic responses, where the conventional Hookean and FENE-P models show limitations.

Beyond dumbbell suspensions, one major challenge towards constructing truly reliable hydrodynamic models arises from the heterogeneous polymer micro-structural mechanics. In this work, we aim to fill the gap by developing the generalized DeePN2 model for multi-bead polymer molecules with arbitrary structure and interaction. Firstly, with the proper design of the generalized micro-macro encoders and the machine learning-based symmetry-preserving constitutive dynamics, we demonstrate that the heterogeneous molecular structural-induced interaction can be systematically encoded into the macro-scale hydrodynamics. Unlike moment closure approximations, the encoders are not designed to recover the high-dimensional configuration PDF. Instead, they take an interpretable form and are learned to probe the optimal approximation of the polymer stress and constitutive dynamics. This essential difference enables DeePN2 to circumvent the high-dimensionality of the polymer configuration PDF. Secondly, the explicit form of the micro-macro encoders enables us to reliably learn the dynamics of the macro-scale features directly from the kinetic equations of their micro-scale analog.
In this sense, this learning framework retains a multi-scaled nature where micro-scale interaction and physical constraints can be seamlessly inherited. Moreover, the learning only requires instantaneous micro-scale samples. This unique property differs from the common sophisticated data-driven approaches (Rudy et al., 2017; Schaeffer et al., 2018; Raissi et al., 2019b; Qin et al., 2019; Han et al., 2019b; Seryo et al., 2020; Huang et al., 2021), where time-derivative samples are often needed to learn the governing dynamics. This is particularly suited for multi-scale fluid models where accurate time-derivative samples may not be readily accessible.

We demonstrate the power of the DeePN2 model for polymer molecules of three distinct shapes with training samples collected from one-dimensional (1D) homogeneous shear flow. Numerical results show that the broadly overlooked heterogeneous molecular structural mechanics plays an important role in the rheology of non-Newtonian fluids, which, fortunately, can be faithfully encoded into DeePN2. The constructed model successfully captures the hydrodynamics with different viscoelastic responses for a variety of 1D and 2D flows when compared with the micro-scale simulation results. The present work also paves the way towards constructing truly reliable non-Newtonian hydrodynamic models for general 3D flows.

4.2 Methods

4.2.1 Micro-scale and continuum hydrodynamic models

Let us start with the micro-scale description of the semi-dilute polymer suspension. We assume each molecule consists of $N$ particles with the position vector $\mathbf{q} = [\mathbf{q}_1; \mathbf{q}_2; \cdots; \mathbf{q}_N]$, where $\mathbf{q}_i \in \mathbb{R}^3$ is the position of the $i$-th particle. The intramolecular potential energy $V(\mathbf{q})$ takes the form

$$V(\mathbf{q}) = \sum_{j=1}^{N_b} V_b\left(|\mathbf{q}_{j_1} - \mathbf{q}_{j_2}|\right), \qquad V_b(l) = -\frac{k_s}{2}\, l_0^2 \log\left[1 - \frac{l^2}{l_0^2}\right], \tag{4.1}$$

where $N_b$ is the bond number and $(j_1, j_2)$ represents the indices of beads associated with the $j$-th bond.
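The bond energy in Eq. (4.1) can be evaluated directly. The following is a minimal sketch, not the simulation code used in this work: the function names, the 0-based bond list for a linear chain, and the straight-line test configuration are illustrative assumptions, while $k_s = 0.1$ and $l_0 = 2.3$ are the values quoted in Section 4.3.1.

```python
import math

def fene_bond_energy(l, k_s, l0):
    """FENE bond energy V_b(l) = -(k_s / 2) * l0^2 * log(1 - l^2 / l0^2)."""
    if not 0.0 <= l < l0:
        raise ValueError("bond length must satisfy 0 <= l < l0")
    return -0.5 * k_s * l0**2 * math.log(1.0 - (l / l0) ** 2)

def intramolecular_energy(q, bonds, k_s, l0):
    """Sum V_b over all bonds; q lists 3D bead positions, bonds lists (j1, j2) pairs."""
    total = 0.0
    for j1, j2 in bonds:
        total += fene_bond_energy(math.dist(q[j1], q[j2]), k_s, l0)
    return total

# Hypothetical bond list for a 7-bead linear chain (0-based bead indices).
chain_bonds = [(j, j + 1) for j in range(6)]
# Illustrative configuration: beads on a straight line with unit spacing.
q = [(float(j), 0.0, 0.0) for j in range(7)]
energy = intramolecular_energy(q, chain_bonds, k_s=0.1, l0=2.3)
```

For small extensions the potential reduces to the Hookean form $k_s l^2/2$, and it diverges as $l \to l_0$, which is why the sketch rejects bond lengths at or beyond $l_0$.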
Without loss of generality, the individual bond interaction $V_b$ takes the form of the FENE potential (Warner, 1972b), where $k_s$ is the spring constant and $l_0$ is the maximum extension length. It is worth mentioning that the polymer molecule is not restricted to the dumbbell shape. Instead, it generally consists of multiple particles with arbitrary structure and bond connection. Fig. 4.1 shows a sketch of the polymer molecules with three different structures. As we will show, given the same form of the individual bond interaction $V_b$, the different polymer micro-structural mechanics leads to distinct non-Newtonian hydrodynamics.

Figure 4.1 A sketch of 7-bead polymer molecules with chain-, star- and net-shaped structures (from left to right). The solid lines represent the FENE bond potential with the same interaction parameters. The dashed lines of the net-shaped molecule represent the three additional side chains connecting the polymer arms. While both the chain- and the star-shaped molecules are connected with six bonds, the suspensions exhibit different hydrodynamics due to the different micro-structural mechanics as shown below.

In principle, the viscoelastic response of the system is determined by the full micro-scale interaction. However, direct simulation of the full micro-scale interaction is often limited by the prohibitive computational cost. Continuum hydrodynamic models based on various empirical constitutive models are often used, with the general form

$$\nabla \cdot \mathbf{u} = 0, \qquad \rho \frac{\mathrm{d}\mathbf{u}}{\mathrm{d}t} = -\nabla p + \nabla \cdot (\boldsymbol{\tau}_s + \boldsymbol{\tau}_p) + \mathbf{f}_{\rm ext}, \tag{4.2}$$

where $\rho$, $\mathbf{u}$ and $p$ represent the fluid density, velocity and pressure field, respectively. $\mathbf{f}_{\rm ext}$ is the external body force and $\boldsymbol{\tau}_s = \eta_s(\nabla \mathbf{u} + \nabla \mathbf{u}^T)$ is the solvent stress tensor with shear viscosity $\eta_s$. $\boldsymbol{\tau}_p$ is the polymer stress tensor whose detailed form is generally unknown.
To construct $\boldsymbol{\tau}_p$, the DeePN2 model seeks the approximation in terms of a set of macro-scale features $\mathbf{c}_1, \cdots, \mathbf{c}_n$, and simultaneously, the constitutive dynamics of these features, i.e.,

$$\boldsymbol{\tau}_p = \mathbf{G}(\mathbf{c}_1, \cdots, \mathbf{c}_n), \tag{4.3a}$$

$$\frac{\mathcal{D}\mathbf{c}_i}{\mathcal{D}t} = \mathbf{H}_i(\mathbf{c}_1, \cdots, \mathbf{c}_n), \quad i = 1, \cdots, n, \tag{4.3b}$$

where $\mathbf{G}$ and $\mathbf{H}_i$ represent the stress and constitutive models, respectively. $\frac{\mathcal{D}}{\mathcal{D}t}$ denotes the objective tensor derivative. Eqs. (4.2) and (4.3) take the form similar to the conventional hydrodynamics. Instead of using empirical approximation to close the equation, we aim to construct a model directly from the micro-scale description (4.1) with the help of machine learning, such that the constructed model can naturally encode the molecular-specific interaction beyond empirical approximations with clear physical interpretation.

4.2.2 DeePN2 for arbitrary molecular structural mechanics

To learn Eq. (4.2) from the full model (4.1), one essential problem lies in how to seamlessly pass the micro-scale interaction to the continuum model. To bridge the scales, we learn a set of micro-to-macro encoders, denoted by $\{\mathbf{b}_i(\mathbf{q})\}_{i=1}^{n}$, such that the continuum modeling terms (e.g., the polymer stress $\boldsymbol{\tau}_p$) can be well approximated in terms of the corresponding macro-scale features $\{\mathbf{c}_i(\mathbf{q})\}_{i=1}^{n}$ via Eq. (4.3a), where $\boldsymbol{\tau}_p := n_p \sum_j \langle \mathbf{q}_j \otimes \nabla_{\mathbf{q}_j} V(\mathbf{q}) \rangle$, $\mathbf{c}_i = \langle \mathbf{b}_i(\mathbf{q}) \rangle$, $n_p$ is the polymer number density and $\langle \cdot \rangle$ denotes the average with respect to the configuration PDF. In particular, the features $\mathbf{c}_i$ need to satisfy the proper invariant and symmetry conditions inherited from the encoders $\mathbf{b}_i(\cdot)$ such that the constructed continuum model can strictly preserve the frame-indifference condition:

$$\tilde{\boldsymbol{\tau}}_p = \mathbf{Q} \boldsymbol{\tau}_p \mathbf{Q}^T, \qquad \mathbf{G}(\tilde{\mathbf{c}}_1, \cdots, \tilde{\mathbf{c}}_n) = \mathbf{Q}\, \mathbf{G}(\mathbf{c}_1, \cdots, \mathbf{c}_n)\, \mathbf{Q}^T, \tag{4.4}$$

where the superscript $\tilde{\cdot}$ denotes the corresponding values under an arbitrary orthogonal transformation by $\mathbf{Q} \in SO(3)$.

To construct the encoder $\mathbf{b}(\cdot)$, we note that the micro-scale potential $V(\mathbf{q})$ is translationally and rotationally invariant.
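These two invariance properties can be checked numerically on a single configuration. The sketch below is illustrative only: the helper names, the perturbed 7-bead chain, the rotation angles and the shift are assumptions, and it uses only the FENE form of Eq. (4.1) with $k_s = 0.1$ and $l_0 = 2.3$. It verifies that $V(\mathbf{q})$ is unchanged under a rigid motion $\mathbf{q}_j \to \mathbf{Q}\mathbf{q}_j + \mathbf{t}$, and that the per-configuration tensor $\sum_j \mathbf{q}_j \otimes \nabla_{\mathbf{q}_j} V$ entering $\boldsymbol{\tau}_p$ transforms as $\mathbf{Q}(\cdot)\mathbf{Q}^T$, consistent with Eq. (4.4).

```python
import math
import random

def matvec(Q, v):
    return [sum(Q[a][b] * v[b] for b in range(3)) for a in range(3)]

def matmul(A, B):
    return [[sum(A[a][c] * B[c][b] for c in range(3)) for b in range(3)] for a in range(3)]

def transpose(A):
    return [[A[b][a] for b in range(3)] for a in range(3)]

def fene_energy(q, bonds, k_s, l0):
    """V(q) of Eq. (4.1): sum of FENE bond energies."""
    total = 0.0
    for j1, j2 in bonds:
        l = math.dist(q[j1], q[j2])
        total += -0.5 * k_s * l0**2 * math.log(1.0 - (l / l0) ** 2)
    return total

def virial_tensor(q, bonds, k_s, l0):
    """Single-configuration sum_j q_j (x) grad_{q_j} V(q) as a 3x3 nested list."""
    grad = [[0.0] * 3 for _ in q]
    for j1, j2 in bonds:
        d = [x - y for x, y in zip(q[j1], q[j2])]
        l2 = sum(x * x for x in d)
        coeff = k_s / (1.0 - l2 / l0**2)  # V_b'(l) / l for the FENE potential
        for a in range(3):
            grad[j1][a] += coeff * d[a]
            grad[j2][a] -= coeff * d[a]
    return [[sum(q[j][a] * grad[j][b] for j in range(len(q))) for b in range(3)]
            for a in range(3)]

def rotation(alpha, beta):
    """Proper rotation Rz(alpha) Rx(beta)."""
    ca, sa, cb, sb = math.cos(alpha), math.sin(alpha), math.cos(beta), math.sin(beta)
    Rz = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    Rx = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    return matmul(Rz, Rx)

random.seed(1)
bonds = [(j, j + 1) for j in range(6)]  # hypothetical 7-bead linear chain
# Perturbed straight chain; every bond stays shorter than l0 = 2.3.
q = [[j + random.uniform(-0.3, 0.3), random.uniform(-0.3, 0.3), random.uniform(-0.3, 0.3)]
     for j in range(7)]
Q = rotation(0.7, -1.1)
shift = [2.0, -1.0, 0.5]
q_moved = [[x + s for x, s in zip(matvec(Q, p), shift)] for p in q]

energy_gap = abs(fene_energy(q, bonds, 0.1, 2.3) - fene_energy(q_moved, bonds, 0.1, 2.3))
T = virial_tensor(q, bonds, 0.1, 2.3)
T_moved = virial_tensor(q_moved, bonds, 0.1, 2.3)
T_expected = matmul(matmul(Q, T), transpose(Q))
stress_gap = max(abs(T_moved[a][b] - T_expected[a][b]) for a in range(3) for b in range(3))
```

Both `energy_gap` and `stress_gap` should vanish up to round-off. The translation drops out of the tensor because the bond forces on a molecule sum to zero.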
Accordingly, let $\mathbf{r}^*(\mathbf{q}) \in \mathbb{R}^{3N-6}$ (we consider the general case $N \ge 3$ here) denote the translational-rotational-invariant configuration vector and $\mathbf{r}(\mathbf{q}) \in \mathbb{R}^{3N-3}$ denote the translational-invariant configuration vector consisting of $N - 1$ linearly independent position vectors. Since $N_b \ge N - 1$ for all molecules, one straightforward choice is the first $N - 1$ bond connection vectors, i.e.,

$$\begin{aligned}
\mathbf{r} &= [\mathbf{r}_1; \mathbf{r}_2; \cdots; \mathbf{r}_{N-1}], \qquad \mathbf{r}_j = \mathbf{q}_{j_1} - \mathbf{q}_{j_2}, \quad 1 \le j \le N - 1, \\
\mathbf{r}^* &= \left[ |\mathbf{r}_1|, |\mathbf{r}_2|, |\mathbf{r}_{12}|, |\mathbf{r}_3|, |\mathbf{r}_{13}|, |\mathbf{r}_{23}|, |\mathbf{r}_4|, |\mathbf{r}_{24}|, |\mathbf{r}_{34}|, \cdots, |\mathbf{r}_{N-1}|, \left|\mathbf{r}_{(N-2)(N-1)}\right| \right],
\end{aligned} \tag{4.5}$$

where $\mathbf{r}_{jk} := \mathbf{r}_j - \mathbf{r}_k$. We note that this form applies to general molecular structures; $\mathbf{r}$ determines the molecular structure up to translations. Specifically, $\mathbf{r}^*$ represents the $3N - 6$ degrees of freedom after eliminating translational and rotational degrees of freedom, and $\mathbf{r}$ suffices to fully determine the translational invariant polymer configuration and strictly retains the rotational symmetry in accordance with $\mathbf{q}$, i.e., $\mathbf{r}_j(\mathbf{Q}\mathbf{q}) = \mathbf{Q}\mathbf{r}_j(\mathbf{q})$, $\mathbf{r}^*(\mathbf{Q}\mathbf{q}) = \mathbf{r}^*(\mathbf{q})$.

To preserve rotational symmetry, one straightforward approach is to represent $\mathbf{b}(\cdot)$ in the linear space spanned by $\{\mathbf{r}_j\}_{j=1}^{N-1}$. However, this choice yields the trivial macro-scale feature, i.e., $\langle \mathbf{r}_j \rangle \equiv 0$, due to the rotational symmetry. Alternatively, we construct the following second-order tensor

$$\mathbf{c}_i = \langle \mathbf{b}_i(\mathbf{r}) \rangle, \qquad \mathbf{b}_i = \mathbf{f}_i \mathbf{f}_i^T, \qquad \mathbf{f}_i = g_i(\mathbf{r}^*) \sum_{j=1}^{N-1} w_{ij} \mathbf{r}_j, \qquad 1 \le i \le n, \tag{4.6}$$

where $[w_{ij}]_{1 \le i \le n,\, 1 \le j \le N-1}$ are the weights and $\{g_i(\cdot)\}_{i=1}^{n}$ is a set of scalar functions that encodes the polymer intramolecular interaction. Both terms will be learned from the micro-scale description and represented by deep neural networks (DNNs). Rotational symmetries can be naturally inherited, i.e., $\tilde{\mathbf{c}} = \langle \mathbf{b}(\tilde{\mathbf{r}}) \rangle \equiv \mathbf{Q}\mathbf{c}\mathbf{Q}^T$. Compared with the special form for dumbbell molecules in Ref. (Lei et al., 2020), Eq.
(4.6) provides a general form of $\mathbf{c}$ applicable to multi-bead molecules of arbitrary structure since $\mathbf{r}$ and $\mathbf{r}^*$ fully determine the $3N - 3$ translational invariant polymer configuration. In the remainder of the paper, we will abuse the notation and denote $\mathbf{b}(\mathbf{q})$ as $\mathbf{b}(\mathbf{r})$.

Besides the polymer stress model (4.3a), the remaining task to close Eq. (4.2) is the construction of the constitutive dynamics (4.3b) of the macro-scale features $\{\mathbf{c}_i\}_{i=1}^{n}$. There are two issues to deal with: the proper form of the objective time derivative of $\mathbf{c}_i$ and the accurate estimation of their time evolution. In the literature, the objective tensor derivative, denoted by $\frac{\mathcal{D}\mathbf{c}_i}{\mathcal{D}t}$, is often chosen to take some heuristic forms (e.g., the convected (Oldroyd and Wilson, 1950) and corotational (Zaremba, 1903) forms). Moreover, the time-series samples collected from the micro-scale simulations are generally superimposed with pronounced sampling error; direct estimation of the time derivative as was done in (Rudy et al., 2017; Raissi et al., 2019b; Seryo et al., 2020) will end with noisy data. Fortunately, both challenges are addressed in DeePN2 using an explicit micro-macro correspondence. The dynamics of $\mathbf{c}_i$ can be derived from its micro-scale correspondence $\mathbf{b}_i(\mathbf{r})$ in the form of the micro-scale configuration $\mathbf{r}$, i.e.,

$$\frac{\mathrm{d}}{\mathrm{d}t} \mathbf{c}_i - \boldsymbol{\kappa} : \left\langle \sum_{j=1}^{N-1} \mathbf{r}_j \otimes \nabla_{\mathbf{r}_j} \otimes \mathbf{b}_i \right\rangle = \frac{k_B T}{\gamma} \left\langle \sum_{j,k=1}^{N-1} A_{jk} \nabla_{\mathbf{r}_j} \cdot \nabla_{\mathbf{r}_k} \mathbf{b}_i \right\rangle - \frac{1}{\gamma} \left\langle \sum_{j=1}^{N-1} \sum_{k=1}^{N_b} A_{jk} \nabla_{\mathbf{r}_k} V(\mathbf{r}_1, \cdots, \mathbf{r}_{N_b}) \cdot \nabla_{\mathbf{r}_j} \mathbf{b}_i \right\rangle, \tag{4.7}$$

where $\boldsymbol{\kappa} := \nabla \mathbf{u}^T$, $\gamma$ is the friction coefficient and, for $j > N - 1$, $\mathbf{r}_j$ is the bond connection vector defined as in Eq. (4.5). We abuse the notation and denote $V(\mathbf{q})$ as $V(\mathbf{r}_1, \cdots, \mathbf{r}_{N_b}) = \sum_{j=1}^{N_b} V_b(|\mathbf{r}_j|)$. The molecular structure and interaction are specified via $\mathbf{A} \in \mathbb{R}^{N_b \times N_b}$, which is defined by

$$\mathbf{A} = \mathbf{S}\mathbf{S}^T, \qquad S_{jk} = \begin{cases} +1, & k = j_1, \\ -1, & k = j_2, \\ 0, & \text{else}, \end{cases} \qquad 1 \le j \le N_b, \quad 1 \le k \le N, \tag{4.8}$$

where $j_1$ and $j_2$ are the same notations as those in Eq. (4.1). We note that Eq.
(4.7) only requires the first $(N - 1)$ rows of $\mathbf{A}$ since the polymer configuration can be fully determined by $\mathbf{r}_1, \cdots, \mathbf{r}_{N-1}$. As a special case, if the molecule takes the chain shape, $\mathbf{A}$ recovers the standard Rouse matrix (Bird et al., 1987; Rouse, 1953).

Eq. (4.7) defines the dynamics for the features $\{\mathbf{c}_i\}_{i=1}^{n}$, derived from their micro-scale correspondences. In particular, given the proposed form of the encoder functions (4.6), we can show that the two combined terms of the left-hand-side of Eq. (4.7) strictly preserve rotational symmetry (see Section 4.2.3). This leads to an important observation that the two combined terms provide the generalized form for the macro-scale objective tensor derivative $\frac{\mathcal{D}\mathbf{c}_i}{\mathcal{D}t}$. Unlike the heuristic choices in empirical models, the new form retains a clear micro-scale physical interpretation. Furthermore, all the modeling terms in the form of $\langle \cdot \rangle$ can be directly evaluated using samples collected from the micro-scale simulations under the corresponding flow condition. This enables us to avoid estimating the time derivative values from the noise-prone time-series samples. Accordingly, the macro-scale constitutive dynamics takes the form

$$\frac{\mathrm{d}\mathbf{c}_i}{\mathrm{d}t} - \boldsymbol{\kappa} : \mathcal{E}_i = \frac{k_B T}{\gamma} \mathbf{H}_{1,i}(\mathbf{c}_1, \cdots, \mathbf{c}_n) - \frac{1}{\gamma} \mathbf{H}_{2,i}(\mathbf{c}_1, \cdots, \mathbf{c}_n), \tag{4.9}$$

where the individual terms will be represented by proper neural networks and parameterized by matching their micro-scale correspondences, i.e.,

$$\begin{aligned}
\mathcal{E}_i(\mathbf{c}_1, \cdots, \mathbf{c}_n) &= \left\langle \sum_{j=1}^{N-1} \mathbf{r}_j \otimes \nabla_{\mathbf{r}_j} \otimes \mathbf{b}_i \right\rangle, \\
\mathbf{H}_{1,i}(\mathbf{c}_1, \cdots, \mathbf{c}_n) &= \left\langle \sum_{j,k=1}^{N-1} A_{jk} \nabla_{\mathbf{r}_j} \cdot \nabla_{\mathbf{r}_k} \mathbf{b}_i \right\rangle, \\
\mathbf{H}_{2,i}(\mathbf{c}_1, \cdots, \mathbf{c}_n) &= \left\langle \sum_{j=1}^{N-1} \sum_{k=1}^{N_b} A_{jk} \nabla_{\mathbf{r}_k} V(\mathbf{r}_1, \cdots, \mathbf{r}_{N-1}) \cdot \nabla_{\mathbf{r}_j} \mathbf{b}_i \right\rangle.
\end{aligned} \tag{4.10}$$

4.2.3 Rotational frame-indifference of the constitutive dynamics for the multi-bead encoder function

We consider a polymer molecule consisting of $N$ particles.
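Let $\mathbf{r} = [\mathbf{r}_1; \mathbf{r}_2; \cdots; \mathbf{r}_{N-1}]$ denote the polymer configuration, so that there exists an invertible linear transformation between $\left[\mathbf{r}; \sum_{i=1}^{N} \mathbf{q}_i / N\right]$ and $[\mathbf{q}_1; \mathbf{q}_2; \cdots; \mathbf{q}_N]$, where $\mathbf{q}_i$ is the position of the $i$-th particle. In fact, there are multiple choices for $\mathbf{r}$, including the one we have applied in Eq. (4.5), where $\mathbf{r}$ consists of $(N - 1)$ edges of a spanning tree in the bead-bond structure. We consider a second-order tensor taking the general form

$$\mathbf{b} = \mathbf{f}^{(1)}(\mathbf{r})\, \mathbf{f}^{(2)}(\mathbf{r})^T, \qquad \mathbf{f}^{(1)}(\mathbf{r}) = \sum_{j=1}^{N-1} g_j^{(1)}(\mathbf{r}^*)\, \mathbf{r}_j, \qquad \mathbf{f}^{(2)}(\mathbf{r}) = \sum_{j=1}^{N-1} g_j^{(2)}(\mathbf{r}^*)\, \mathbf{r}_j, \tag{4.11}$$

where $\mathbf{r}^*$ is a translational-rotational-invariant vector and $g^{(1)}$ and $g^{(2)}$ are two scalar functions. We note that the encoder in the form of Eq. (4.11) is more general than Eq. (4.6).

In this section and the next, we consider two frames: frame 1 is static inertial, and frame 2 is rotating with respect to frame 1 with a time-dependent orthogonal transformation $\mathbf{Q}(t)$. Let $\tilde{\mathbf{x}}, \tilde{\mathbf{v}}, \tilde{\mathbf{b}}$ and $\mathbf{x}, \mathbf{v}, \mathbf{b}$ denote the positions, velocities, and second-order tensors in frame 1 and 2, respectively. They have the following relations:

$$\tilde{\mathbf{x}} = \mathbf{Q}\mathbf{x}, \qquad \tilde{\mathbf{v}} = \mathbf{Q}\mathbf{v} + \dot{\mathbf{Q}}\mathbf{x}, \qquad \tilde{\mathbf{b}} = \mathbf{Q}\mathbf{b}\mathbf{Q}^T. \tag{4.12}$$

The material derivatives in both frames are

$$\left.\frac{\mathrm{d}}{\mathrm{d}t}\right|_{\text{frame 1}} := \frac{\partial}{\partial t} + \tilde{\mathbf{v}} \cdot \nabla_{\tilde{\mathbf{x}}}, \qquad \left.\frac{\mathrm{d}}{\mathrm{d}t}\right|_{\text{frame 2}} := \frac{\partial}{\partial t} + \mathbf{v} \cdot \nabla_{\mathbf{x}}. \tag{4.13}$$

Proposition 4.2.1. With $\mathbf{b}$ defined by Eq. (4.11), the evolution equation

$$\frac{\mathrm{d}}{\mathrm{d}t} \mathbf{c} - \boldsymbol{\kappa} : \left\langle \sum_{j=1}^{N-1} \mathbf{r}_j \otimes \nabla_{\mathbf{r}_j} \otimes \mathbf{b} \right\rangle = \frac{k_B T}{\gamma} \left\langle \sum_{j,k=1}^{N-1} A_{jk} \nabla_{\mathbf{r}_j} \cdot \nabla_{\mathbf{r}_k} \mathbf{b} \right\rangle - \frac{1}{\gamma} \left\langle \sum_{j=1}^{N-1} \sum_{k=1}^{N_b} A_{jk} \nabla_{\mathbf{r}_k} V_p(\mathbf{r}) \cdot \nabla_{\mathbf{r}_j} \mathbf{b} \right\rangle \tag{4.14}$$

obeys rotational symmetry.

Proof. Let us choose the vector $\mathbf{r}^* = \left[ |\mathbf{r}_1|, |\mathbf{r}_2|, |\mathbf{r}_{12}|, |\mathbf{r}_3|, |\mathbf{r}_{13}|, |\mathbf{r}_{23}|, \cdots, |\mathbf{r}_{N-2,N-1}| \right]$. Denote by $r_i^*$ the $i$-th element of $\mathbf{r}^*$ and by $\mathbf{r}_i^*$ the corresponding 3-dimensional vector, i.e., $r_6^* = |\mathbf{r}_{23}|$ and $\mathbf{r}_6^* = \mathbf{r}_{23}$. Following Eq.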
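(4.11), $\mathbf{b}$ consists of

$$\mathbf{b} = \sum_{j,k=1}^{N-1} \mathbf{b}_{jk}, \qquad \mathbf{b}_{jk} = g(\mathbf{r}^*)\, \mathbf{r}_j \mathbf{r}_k^T, \tag{4.15}$$

where $g(\mathbf{r}^*)$ denotes $g_j^{(1)}(\mathbf{r}^*)\, g_k^{(2)}(\mathbf{r}^*)$ for simplicity. With this general form, we have

$$\left.\frac{\mathrm{d}}{\mathrm{d}t} \langle \tilde{\mathbf{b}}_{jk} \rangle \right|_{\text{frame 1}} = \dot{\mathbf{Q}} \langle \mathbf{b}_{jk} \rangle \mathbf{Q}^T + \mathbf{Q} \langle \mathbf{b}_{jk} \rangle \dot{\mathbf{Q}}^T + \mathbf{Q} \left.\frac{\mathrm{d}}{\mathrm{d}t} \langle \mathbf{b}_{jk} \rangle \right|_{\text{frame 2}} \mathbf{Q}^T. \tag{4.16}$$

Moreover, we note that

$$\begin{aligned}
\tilde{\boldsymbol{\kappa}} : \left( \sum_{i=1}^{N-1} \tilde{\mathbf{r}}_i \otimes \nabla_{\tilde{\mathbf{r}}_i} \otimes \tilde{\mathbf{b}}_{jk} \right)
&= \sum_{i=1}^{N-1} \left[ \left( \mathbf{Q}\boldsymbol{\kappa}\mathbf{Q}^T + \dot{\mathbf{Q}}\mathbf{Q}^T \right) \cdot \mathbf{Q}\mathbf{r}_i \right] \cdot \mathbf{Q} \otimes \nabla_{\mathbf{r}_i} \otimes \left( \mathbf{Q}\mathbf{b}_{jk}\mathbf{Q}^T \right) \\
&= \sum_{i=1}^{N-1} (\boldsymbol{\kappa} \cdot \mathbf{r}_i) \cdot \nabla_{\mathbf{r}_i} \left( \mathbf{Q}\mathbf{b}_{jk}\mathbf{Q}^T \right) + (\mathbf{Q}^T \dot{\mathbf{Q}} \mathbf{r}_i) \cdot \nabla_{\mathbf{r}_i} \left( \mathbf{Q}\mathbf{b}_{jk}\mathbf{Q}^T \right) \\
&= \sum_{i=1}^{N-1} \mathbf{Q} (\boldsymbol{\kappa} \cdot \mathbf{r}_i) \cdot \nabla_{\mathbf{r}_i} \mathbf{b}_{jk} \mathbf{Q}^T + \mathbf{Q} \left( \mathbf{Q}^T \dot{\mathbf{Q}} \mathbf{b}_{jk} + \mathbf{b}_{jk} \dot{\mathbf{Q}}^T \mathbf{Q} \right) \mathbf{Q}^T + \mathbf{Q} \left( \sum_{i=1}^{N-1} \mathbf{r}_i^T (\dot{\mathbf{Q}}^T \mathbf{Q}) \nabla_{\mathbf{r}_i} g(\mathbf{r}^*) \right) \mathbf{r}_j \mathbf{r}_k^T \mathbf{Q}^T \\
&= \sum_{i=1}^{N-1} \mathbf{Q} (\boldsymbol{\kappa} \cdot \mathbf{r}_i) \cdot \nabla_{\mathbf{r}_i} \mathbf{b}_{jk} \mathbf{Q}^T + \dot{\mathbf{Q}} \mathbf{b}_{jk} \mathbf{Q}^T + \mathbf{Q} \mathbf{b}_{jk} \dot{\mathbf{Q}}^T,
\end{aligned} \tag{4.17}$$

where we have used $\mathbf{r}_i^T (\dot{\mathbf{Q}}^T \mathbf{Q}) \mathbf{r}_i \equiv 0$ since $\dot{\mathbf{Q}}^T \mathbf{Q}$ is anti-symmetric. Eq. (4.16) and Eq. (4.17) show that the combination of the two terms on the left-hand-side of Eq. (4.14) rigorously preserves the rotational symmetry, i.e.,

$$\left.\left( \frac{\mathrm{d}}{\mathrm{d}t} \langle \tilde{\mathbf{b}} \rangle - \tilde{\boldsymbol{\kappa}} : \sum_{i=1}^{N-1} \left\langle \tilde{\mathbf{r}}_i \otimes \nabla_{\tilde{\mathbf{r}}_i} \otimes \tilde{\mathbf{b}} \right\rangle \right)\right|_{\text{frame 1}} \equiv \mathbf{Q} \left.\left( \frac{\mathrm{d}}{\mathrm{d}t} \langle \mathbf{b} \rangle - \boldsymbol{\kappa} : \sum_{i=1}^{N-1} \left\langle \mathbf{r}_i \otimes \nabla_{\mathbf{r}_i} \otimes \mathbf{b} \right\rangle \right)\right|_{\text{frame 2}} \mathbf{Q}^T.$$

It is straightforward to prove rotational symmetry for the other terms in Eq. (4.14).

4.2.4 Symmetry-preserving DNN models

To complete the DeePN2 model, we need to specify the DNN models. These DNN models should also strictly preserve rotational symmetry. Different from the rotational-invariant scalar stress model considered in Ref. (Zhou et al., 2021), the second-order tensors $\mathbf{G}$, $\mathbf{H}_{1,i}$, $\mathbf{H}_{2,i}$ need to satisfy the symmetry condition (4.4) and the fourth-order tensors $\mathcal{E}_i$ need to retain the objectivity of $\frac{\mathcal{D}\mathbf{c}_i}{\mathcal{D}t}$.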
However, there does not exist such a reference frame in which these symmetry constraints can be satisfied by the macro-scale modeling terms.

To handle this problem, we consider the eigen-space of the feature $\mathbf{c}_1$ with a fixed form of the encoder $\mathbf{b}_1(\cdot)$, e.g., by setting $g_1(\cdot) = w_{1,:} \equiv 1$ and letting the other $\mathbf{b}_i(\cdot)$ be involved in the training. Consider the eigen-decomposition $\mathbf{c}_1 = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T$ with distinct eigenvalues, where $\mathbf{U}$ is the matrix whose columns are the eigenvectors of $\mathbf{c}_1$. $\mathbf{U}$ is not unique due to the non-uniqueness of the eigenvectors. Without loss of generality, we further assume that the first element of $\mathbf{u}_1$, the first column of $\mathbf{U}$, is positive. With the following lemma, we show that the general form of $\mathbf{U}$ can always be written as $\mathbf{U}^{(j)} := \mathbf{U} S^{(j)}$ with $j = 1, \cdots, 4$, where $S^{(j)}$ is given by

$$S^{(1)} = \begin{pmatrix} +1 & & \\ & +1 & \\ & & +1 \end{pmatrix}, \quad
S^{(2)} = \begin{pmatrix} +1 & & \\ & -1 & \\ & & +1 \end{pmatrix}, \quad
S^{(3)} = \begin{pmatrix} +1 & & \\ & +1 & \\ & & -1 \end{pmatrix}, \quad
S^{(4)} = \begin{pmatrix} +1 & & \\ & -1 & \\ & & -1 \end{pmatrix}.$$

Lemma 4.2.2. For a symmetric matrix $M \in \mathbb{R}^{3 \times 3}$, let $S_M$ denote the set of matrices with the transformation of $S^{(j)}$, i.e., $S_M := \left\{ S^{(1)} M S^{(1)}, \cdots, S^{(4)} M S^{(4)} \right\}$. For any $M^{(j)} := S^{(j)} M S^{(j)} \in S_M$, $S^{(k)} M^{(j)} S^{(k)} \in S_M$, $1 \le j, k \le 4$. Furthermore, $S_M$ can be constructed by $M^{(j)}$, i.e., $S_M \equiv \left\{ S^{(1)} M^{(j)} S^{(1)}, \cdots, S^{(4)} M^{(j)} S^{(4)} \right\}$.

Proof. By applying $S^{(j)}$ to $M$, it is easy to see that the diagonal part of $M^{(j)}$ remains the same.
Since $M^{(j)}$ is also symmetric, we only need to check the upper-triangular part, taking the four possible operations

$$\begin{pmatrix} * & + & + \\ & * & + \\ & & * \end{pmatrix}, \quad
\begin{pmatrix} * & - & + \\ & * & - \\ & & * \end{pmatrix}, \quad
\begin{pmatrix} * & + & - \\ & * & - \\ & & * \end{pmatrix}, \quad
\begin{pmatrix} * & - & - \\ & * & + \\ & & * \end{pmatrix},$$

where “+” represents that the element remains the same and “−” represents a sign change. We see that the number of “−” operations is either 0 or 2. Starting from any of the above choices for $M^{(j)}$, all of the four operators yield either 0 or 2 “−” operations. Therefore, $S^{(k)} M^{(j)} S^{(k)} \in S_M$. Furthermore, if the upper triangular part of $M$ has distinct absolute values, then $\forall M^{(j)}$, $S^{(k)} M^{(j)} S^{(k)} \ne S^{(k')} M^{(j)} S^{(k')}$ with $k \ne k'$, hence $S_M$ can be constructed by $M^{(j)}$. Otherwise, if some upper triangular entries of $M$ share the same absolute value, we can draw the same conclusion accordingly.

Now we consider the matrix whose columns are the eigenvectors of $\tilde{\mathbf{c}}_1 = \mathbf{Q}\mathbf{c}_1\mathbf{Q}^T$, denoted by $\tilde{\mathbf{U}}$. We can write $\tilde{\mathbf{U}} = \mathbf{Q}\mathbf{U}S^{(j)}$, where $j \in \{1, 2, 3, 4\}$. Accordingly, the DNN input of $\mathbf{c}_i$ takes the form

$$\tilde{\mathbf{U}}^T \tilde{\mathbf{c}}_i \tilde{\mathbf{U}} = \left( \mathbf{Q}\mathbf{U}S^{(j)} \right)^T \mathbf{Q}\mathbf{c}_i\mathbf{Q}^T \left( \mathbf{Q}\mathbf{U}S^{(j)} \right) = S^{(j)} \mathbf{U}^T \mathbf{c}_i \mathbf{U} S^{(j)}.$$

Let $M = \mathbf{U}^T \mathbf{c}_i \mathbf{U}$. By using Lemma 4.2.2, it is easy to see that $S_{\mathbf{U}^T \mathbf{c}_i \mathbf{U}}$ can be constructed by taking $j = 1, \cdots, 4$.

Proposition 4.2.3. Let $\mathbf{U}$ be the matrix whose columns are the eigenvectors of $\mathbf{c}_1$. Let the DNN input be $\hat{\mathbf{c}}_i^{(j)} = S^{(j)} \mathbf{U}^T \mathbf{c}_i \mathbf{U} S^{(j)}$. The following form of $\boldsymbol{\tau}_p$

$$\mathbf{G}(\mathbf{c}_1, \cdots, \mathbf{c}_n) = \frac{1}{4} \sum_{j=1}^{4} \mathbf{U}^{(j)}\, \hat{\mathbf{G}}\!\left( \hat{\mathbf{c}}_1^{(j)}, \cdots, \hat{\mathbf{c}}_n^{(j)} \right) \mathbf{U}^{(j)T}, \qquad \mathbf{U}^{(j)} = \mathbf{U} S^{(j)}, \tag{4.18}$$

satisfies the rotational symmetry constraint (4.4).

During simulation, the eigenvalues of $\mathbf{c}_1$ may cross each other, which swaps the corresponding eigenvectors. To account for this, we consider all the 6 permutations of the three eigenvalues, i.e.,

$$\mathbf{G}(\mathbf{c}_1, \cdots, \mathbf{c}_n) = \frac{1}{24} \sum_{k=0}^{5} \sum_{j=1}^{4} \mathbf{U}^{(j,k)}\, \hat{\mathbf{G}}\!\left( \hat{\mathbf{c}}_1^{(j,k)}, \cdots, \hat{\mathbf{c}}_n^{(j,k)} \right) \mathbf{U}^{(j,k)T}, \tag{4.20}$$

where $k$ represents the rank of permutation (e.g., in lexicographical order) and $\mathbf{U}^{(j,k)}$ is a variation of $\mathbf{U}^{(j)}$ with corresponding column permutation. Furthermore, to avoid the eigenvector degeneracy, we set a threshold value $\epsilon$ for the eigenvalues. When two eigenvalues approach each other, e.g., $|\lambda_2 - \lambda_3| < \epsilon$, we freeze all the eigenvectors until $|\lambda_2 - \lambda_3| \ge \epsilon$. In this work, we take $\epsilon = 10^{-3}$, and we refer to Section 4.3.6 for detailed numerical studies.

Eq. (4.20) provides the rotation-symmetric form for the second-order stress tensor $\mathbf{G}$, where $\hat{\mathbf{G}}$ is represented by DNNs. The constitutive model terms $\mathbf{H}_{1,i}$ and $\mathbf{H}_{2,i}$ can be constructed in a similar manner. Finally, we can show the fourth-order tensors $\{\mathcal{E}_i\}_{i=1}^{n}$ associated with the encoders (4.6) can be constructed in the form

$$\boldsymbol{\kappa} : \mathcal{E}_i = \boldsymbol{\kappa}\mathbf{c}_i + \mathbf{c}_i\boldsymbol{\kappa}^T + \boldsymbol{\kappa} : \left( \sum_{j=1}^{9} \mathbf{E}_{1,i}^{(j)} \otimes \mathbf{E}_{2,i}^{(j)} \right), \tag{4.21}$$

where $\mathbf{E}_{1,i}^{(j)}$ and $\mathbf{E}_{2,i}^{(j)}$ are second-order tensors which respect the symmetry condition (4.4) and can be constructed in the form of Eq. (4.20) (see Prop. 4.2.4). The constructed DeePN2 model takes the form similar to the general hydrodynamic equations (4.2) and (4.3), where some of the model terms are represented by DNNs in the form of Eqs.
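(4.20) and (4.21).

Proposition 4.2.4. The following ansatz of $\left\langle \sum_{i=1}^{N-1} \mathbf{r}_i \otimes \nabla_{\mathbf{r}_i} \otimes \mathbf{b} \right\rangle$ ensures that the evolution dynamics of $\mathbf{c}$ retains rotational invariance:

$$\sum_{i=1}^{N-1} \left\langle \mathbf{r}_i \otimes \nabla_{\mathbf{r}_i} \otimes \mathbf{b} \right\rangle = \sum_{j,k=1}^{N-1} \left\langle g_j^{(1)}(\mathbf{r}^*)\, g_k^{(2)}(\mathbf{r}^*) \left( \mathbf{r}_j \otimes \nabla_{\mathbf{r}_j} + \mathbf{r}_k \otimes \nabla_{\mathbf{r}_k} \right) \otimes \mathbf{r}_j \mathbf{r}_k^T \right\rangle + \sum_{k=1}^{9} \mathbf{E}_1^{(k)}(\mathbf{c}) \otimes \mathbf{E}_2^{(k)}(\mathbf{c}), \tag{4.22}$$

where $\mathbf{c} = (\mathbf{c}_1, \cdots, \mathbf{c}_n)$, $\tilde{\mathbf{c}} = (\tilde{\mathbf{c}}_1, \cdots, \tilde{\mathbf{c}}_n)$, and $\mathbf{E}_1$ and $\mathbf{E}_2$ satisfy

$$\tilde{\mathbf{E}}_1 := \mathbf{E}_1(\tilde{\mathbf{c}}) = \mathbf{Q}\mathbf{E}_1(\mathbf{c})\mathbf{Q}^T, \qquad \tilde{\mathbf{E}}_2 := \mathbf{E}_2(\tilde{\mathbf{c}}) = \mathbf{Q}\mathbf{E}_2(\mathbf{c})\mathbf{Q}^T. \tag{4.23}$$

Proof. Without loss of generality, we represent the fourth order tensor by the following two bases

$$\begin{aligned}
&\mathbf{F}_1(\mathbf{c}) \otimes \mathbf{F}_2(\mathbf{c}) \otimes \mathbf{F}_3(\mathbf{c}) + \mathbf{F}_3(\mathbf{c}) \otimes \left( \mathbf{F}_2(\mathbf{c}) \otimes \mathbf{F}_1(\mathbf{c}) \right)^{T_{\{2,3\}}}, && \mathbf{F}_1(\mathbf{c}), \mathbf{F}_3(\mathbf{c}) \in \mathbb{R}^3, \ \mathbf{F}_2(\mathbf{c}) \in \mathbb{R}^{3 \times 3}, \\
&\mathbf{E}_1(\mathbf{c}) \otimes \mathbf{E}_2(\mathbf{c}), && \mathbf{E}_1(\mathbf{c}), \mathbf{E}_2(\mathbf{c}) \in \mathbb{R}^{3 \times 3},
\end{aligned} \tag{4.24}$$

where the superscript $T_{\{2,3\}}$ represents the transpose between the 2nd and 3rd indices; also $\mathbf{F}_1$, $\mathbf{F}_2$, $\mathbf{F}_3$, $\mathbf{E}_1$ and $\mathbf{E}_2$ satisfy the symmetry conditions

$$\begin{aligned}
&\mathbf{F}_1(\tilde{\mathbf{c}}) = \mathbf{Q}\mathbf{F}_1(\mathbf{c}), \qquad \mathbf{F}_2(\tilde{\mathbf{c}}) = \mathbf{Q}\mathbf{F}_2(\mathbf{c})\mathbf{Q}^T, \qquad \mathbf{F}_3(\tilde{\mathbf{c}}) = \mathbf{Q}\mathbf{F}_3(\mathbf{c}), \\
&\mathbf{E}_1(\tilde{\mathbf{c}}) = \mathbf{Q}\mathbf{E}_1(\mathbf{c})\mathbf{Q}^T, \qquad \mathbf{E}_2(\tilde{\mathbf{c}}) = \mathbf{Q}\mathbf{E}_2(\mathbf{c})\mathbf{Q}^T.
\end{aligned} \tag{4.25}$$

For the term $\mathbf{E}_1(\mathbf{c}) \otimes \mathbf{E}_2(\mathbf{c})$, we have

$$\boldsymbol{\kappa} : \mathbf{E}_1(\mathbf{c}) \otimes \mathbf{E}_2(\mathbf{c}) = \mathrm{Tr}(\boldsymbol{\kappa}\mathbf{E}_1(\mathbf{c}))\, \mathbf{E}_2(\mathbf{c}) \tag{4.26}$$

and

$$\begin{aligned}
\left. \tilde{\boldsymbol{\kappa}} : \tilde{\mathbf{E}}_1 \otimes \tilde{\mathbf{E}}_2 \right|_{\text{frame 1}}
&= \left( \mathbf{Q}\boldsymbol{\kappa}\mathbf{Q}^T + \dot{\mathbf{Q}}\mathbf{Q}^T \right) : \left( \mathbf{Q}\mathbf{E}_1(\mathbf{c})\mathbf{Q}^T \otimes \tilde{\mathbf{E}}_2 \right) \\
&= \mathrm{Tr}(\boldsymbol{\kappa}\mathbf{E}_1(\mathbf{c}))\, \tilde{\mathbf{E}}_2 + \mathrm{Tr}(\dot{\mathbf{Q}}\mathbf{Q}^T \mathbf{Q}\mathbf{E}_1(\mathbf{c})\mathbf{Q}^T)\, \tilde{\mathbf{E}}_2 \\
&= \mathrm{Tr}(\boldsymbol{\kappa}\mathbf{E}_1(\mathbf{c}))\, \tilde{\mathbf{E}}_2 \\
&\equiv \mathbf{Q} \left( \left. \boldsymbol{\kappa} : \mathbf{E}_1(\mathbf{c}) \otimes \mathbf{E}_2(\mathbf{c}) \right|_{\text{frame 2}} \right) \mathbf{Q}^T,
\end{aligned} \tag{4.27}$$

where we have used $\mathrm{Tr}(\dot{\mathbf{Q}}\mathbf{Q}^T) \equiv 0$. For the term $\mathbf{F}_1(\mathbf{c}) \otimes \mathbf{F}_2(\mathbf{c}) \otimes \mathbf{F}_3(\mathbf{c}) + \mathbf{F}_3(\mathbf{c}) \otimes \left( \mathbf{F}_2(\mathbf{c}) \otimes \mathbf{F}_1(\mathbf{c}) \right)^{T_{\{2,3\}}}$, we have

$$\boldsymbol{\kappa} : \mathbf{F}_1(\mathbf{c}) \otimes \mathbf{F}_2(\mathbf{c}) \otimes \mathbf{F}_3(\mathbf{c}) = \mathbf{F}_2(\mathbf{c})^T \boldsymbol{\kappa}\, \mathbf{F}_1(\mathbf{c}) \mathbf{F}_3(\mathbf{c})^T \tag{4.28}$$

and

$$\tilde{\boldsymbol{\kappa}} : \tilde{\mathbf{F}}_1 \otimes \tilde{\mathbf{F}}_2 \otimes \tilde{\mathbf{F}}_3 = \mathbf{Q}\mathbf{F}_2(\mathbf{c})^T \boldsymbol{\kappa}\, \mathbf{F}_1(\mathbf{c}) \mathbf{F}_3(\mathbf{c})^T \mathbf{Q}^T + \mathbf{Q}\mathbf{F}_2(\mathbf{c})^T \mathbf{Q}^T \dot{\mathbf{Q}}\, \mathbf{F}_1(\mathbf{c}) \mathbf{F}_3(\mathbf{c})^T \mathbf{Q}^T. \tag{4.29}$$

On the other hand, we note that

$$\left. \frac{\mathrm{d}\tilde{\mathbf{b}}}{\mathrm{d}t} \right|_{\text{frame 1}} = \dot{\mathbf{Q}}\mathbf{b}\mathbf{Q}^T + \mathbf{Q}\mathbf{b}\dot{\mathbf{Q}}^T + \mathbf{Q} \left. \frac{\mathrm{d}\mathbf{b}}{\mathrm{d}t} \right|_{\text{frame 2}} \mathbf{Q}^T. \tag{4.30}$$

To ensure the rotational symmetry of $\frac{\mathcal{D}\mathbf{b}}{\mathcal{D}t}$, we have

$$\mathbf{F}_2 \equiv \mathbf{I}, \qquad \sum_i \mathbf{F}_1^{(i)} \otimes \mathbf{I} \otimes \mathbf{F}_3^{(i)} = \sum_{j,k=1}^{N-1} \left\langle g_j^{(1)}(\mathbf{r}^*)\, g_k^{(2)}(\mathbf{r}^*)\, \mathbf{r}_j \otimes \mathbf{I} \otimes \mathbf{r}_k \right\rangle. \tag{4.31}$$

Hence, we have

$$\begin{aligned}
&\left. \left( \frac{\mathrm{d}}{\mathrm{d}t} \tilde{\mathbf{c}} - \tilde{\boldsymbol{\kappa}} : \left( \sum_i \tilde{\mathbf{F}}_1^{(i)} \otimes \tilde{\mathbf{F}}_2^{(i)} \otimes \tilde{\mathbf{F}}_3^{(i)} + \tilde{\mathbf{F}}_3^{(i)} \otimes \left( \tilde{\mathbf{F}}_2^{(i)} \otimes \tilde{\mathbf{F}}_1^{(i)} \right)^{T_{\{2,3\}}} \right) \right) \right|_{\text{frame 1}} \\
&\quad \equiv \mathbf{Q} \left. \left( \frac{\mathrm{d}}{\mathrm{d}t} \mathbf{c} - \boldsymbol{\kappa} : \left( \sum_i \mathbf{F}_1^{(i)} \otimes \mathbf{F}_2^{(i)} \otimes \mathbf{F}_3^{(i)} + \mathbf{F}_3^{(i)} \otimes \left( \mathbf{F}_2^{(i)} \otimes \mathbf{F}_1^{(i)} \right)^{T_{\{2,3\}}} \right) \right) \right|_{\text{frame 2}} \mathbf{Q}^T.
\end{aligned} \tag{4.32}$$

Furthermore, using Eq. (4.31), we obtain

$$\sum_i \mathbf{F}_1^{(i)} \otimes \mathbf{F}_2^{(i)} \otimes \mathbf{F}_3^{(i)} + \mathbf{F}_3^{(i)} \otimes \left( \mathbf{F}_2^{(i)} \otimes \mathbf{F}_1^{(i)} \right)^{T_{\{2,3\}}} = \sum_{j,k=1}^{N-1} \left\langle g_j^{(1)}(\mathbf{r}^*)\, g_k^{(2)}(\mathbf{r}^*) \left( \mathbf{r}_j \otimes \nabla_{\mathbf{r}_j} + \mathbf{r}_k \otimes \nabla_{\mathbf{r}_k} \right) \otimes \mathbf{r}_j \mathbf{r}_k^T \right\rangle. \tag{4.33}$$

Accordingly, the remaining part of $\sum_{i=1}^{N-1} \left\langle \mathbf{r}_i \otimes \nabla_{\mathbf{r}_i} \otimes \mathbf{b} \right\rangle$ is expanded by

$$\left\langle \sum_{i=1}^{N-1} \mathbf{r}_i \otimes \nabla_{\mathbf{r}_i} \sum_{j,k=1}^{N-1} g_j^{(1)}(\mathbf{r}^*)\, g_k^{(2)}(\mathbf{r}^*) \otimes \mathbf{r}_j \mathbf{r}_k^T \right\rangle = \sum_{i=1}^{9} \mathbf{E}_1^{(i)}(\mathbf{c}) \otimes \mathbf{E}_2^{(i)}(\mathbf{c}). \tag{4.34}$$

Combining Eq. (4.32), (4.33) and (4.34), we conclude that the decomposition

$$\sum_{i=1}^{N-1} \left\langle \mathbf{r}_i \otimes \nabla_{\mathbf{r}_i} \otimes \mathbf{b} \right\rangle = \sum_{j,k=1}^{N-1} \left\langle g_j^{(1)}(\mathbf{r}^*)\, g_k^{(2)}(\mathbf{r}^*) \left( \mathbf{r}_j \otimes \nabla_{\mathbf{r}_j} + \mathbf{r}_k \otimes \nabla_{\mathbf{r}_k} \right) \otimes \mathbf{r}_j \mathbf{r}_k^T \right\rangle + \sum_{k=1}^{9} \mathbf{E}_1^{(k)}(\mathbf{c}) \otimes \mathbf{E}_2^{(k)}(\mathbf{c}) \tag{4.35}$$

ensures the objectivity of the time-derivative of $\mathbf{c}$.

4.3 Numerical results

The present DeePN2 model is trained using micro-scale samples collected from the homogeneous shear flow. We demonstrate the model accuracy and generalization ability by considering various flows in comparison with the results of the micro-scale simulations for the suspensions with three different polymer structural models as shown in Fig. 4.1. As we will see, the micro-scale structure does play an important role in the viscoelastic response. We will use this to examine the DeePN2 model fidelity.

4.3.1 Micro-scale model of the polymer solutions

In the present study, we consider suspensions with three different polymer structures as shown in Fig. 4.1.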
Each polymer molecule consists of $N = 7$ beads connected with $N_b$ FENE bonds, i.e.,

$$V(\mathbf{q}) = \sum_{j=1}^{N_b} V_b\left(|\mathbf{q}_{j_1} - \mathbf{q}_{j_2}|\right), \qquad V_b(l) = -\frac{k_s}{2}\, l_0^2 \log\left[1 - \frac{l^2}{l_0^2}\right], \tag{4.36}$$

where $k_s$ represents the spring constant and $l_0$ is the maximum extension length. The chain- and star-shaped molecules have $N_b = 6$ bonds with the same bond parameters $k_s = 0.1$ and $l_0 = 2.3$ (in reduced units). The net-shaped molecule is similar to the star-shaped molecule with the same parameters for the first 6 bonds; 3 additional bonds connect the side chain particles with $k_s = 0.1$ and $l_0 = 3.7$. The polymer number density of the three suspensions is $n_p = 0.3$. The solvent is modeled by the dissipative particle dynamics (DPD) (Hoogerbrugge and Koelman, 1992; Groot and Warren, 1997) with number density $n_s = 4.0$. The pairwise interaction between particle $i$ and $j$ takes the standard form

$$\begin{aligned}
\mathbf{F}_{ij} &= \mathbf{F}_{ij}^{C} + \mathbf{F}_{ij}^{D} + \mathbf{F}_{ij}^{R}, \\
\mathbf{F}_{ij}^{C} &= \begin{cases} a(1.0 - r_{ij}/r_c)\, \mathbf{e}_{ij}, & r_{ij} < r_c, \\ 0, & r_{ij} > r_c, \end{cases} \\
\mathbf{F}_{ij}^{D} &= \begin{cases} -\gamma\, w_D(r_{ij}) (\mathbf{v}_{ij} \cdot \mathbf{e}_{ij})\, \mathbf{e}_{ij}, & r_{ij} < r_c, \\ 0, & r_{ij} > r_c, \end{cases} \\
\mathbf{F}_{ij}^{R} &= \begin{cases} \sigma\, w_R(r_{ij})\, \xi_{ij}\, \mathbf{e}_{ij}, & r_{ij} < r_c, \\ 0, & r_{ij} > r_c, \end{cases}
\end{aligned}$$

where $\mathbf{r}_{ij} = \mathbf{r}_i - \mathbf{r}_j$, $r_{ij} = |\mathbf{r}_{ij}|$, $\mathbf{e}_{ij} = \mathbf{r}_{ij}/r_{ij}$, $\mathbf{v}_{ij} = \mathbf{v}_i - \mathbf{v}_j$, and $\xi_{ij}$ are independent identically distributed (i.i.d.) Gaussian random variables with zero mean and unit variance. $\gamma$ and $\sigma$ are related with the system temperature by the second fluctuation-dissipation theorem (Español and Warren, 1995) as $\sigma^2 = 2\gamma k_B T$, where $k_B T$ is set to 0.25. The detailed parameters are given in Tab. 4.1.

Table 4.1 Parameters (in reduced units) of the micro-scale model of the polymer solution (S-solvent, P-polymer).

        a      γ       σ       k      r_c
S-S     4.0    5.0     1.58    0.25   1.0
S-P     0.0    40.0    4.47    0.0    1.0
P-P     4.0    0.01    0.071   1.0    0.7

4.3.2 Collecting training samples

Collecting training samples is one of the most important steps in the construction of DeePN2.
To obtain reliable models, we need to ensure that the training sample set is representative enough of all the practical situations that the model is intended for. In the present study, we collect the training samples in shear flow with shear rate $\dot{\gamma} \in [0, 0.09]$. Since the training of the DeePN2 model only requires discrete polymer configurations rather than time-series samples, one convenient approach is to consecutively increase the shear rate and collect the discrete configurations during the shear extension and relaxation process, where the inclusion of the relaxation process can facilitate the sampling of polymer configuration phase space due to the viscoelastic hysteresis effect. 32000 samples are collected where each sample consists of 5000 polymer configurations, which will be employed to evaluate the constitutive dynamics terms $\langle \cdot \rangle$. Due to the permutation symmetry of the particle label, the effective number of configurations per sample is $1 \times 10^4$ for the chain-shaped molecule and $3 \times 10^4$ for the star- and net-shaped molecules.

4.3.3 Training procedure

The DeePN2 model is constructed via the training of the NN representations of the encoder mappings $\{g_j(\mathbf{r}^*)\}_{j=1}^{n}$, stress model $\mathbf{G}$, evolution dynamics $\{\mathbf{H}_{1,j}\}_{j=1}^{n}$, $\{\mathbf{H}_{2,j}\}_{j=1}^{n}$ and the 4th order tensors $\{\mathcal{E}_j\}_{j=1}^{n}$ of the objective tensor derivatives. In this study, we choose $n = 3$ encoders and fix $g_1(\mathbf{r}^*) \equiv 1$. For the chain-shaped molecule, we set $w_{1,i} = 1 - i/N$, $1 \le i \le N - 1$, so that $\sum_i w_{1,i} \mathbf{r}_i$ represents the orientation between the free-end particle and the center of mass. For the star- and net-shaped molecules, we set $w_{1,1} = 1$ and $w_{1,i} = 0$ for $i \ge 2$. All terms are represented by fully connected NNs. Each network has three hidden layers; the layer widths are set to (120, 120, 120), (300, 300, 300), (400, 400, 400), (450, 450, 450), (560, 560, 560), respectively. The activation function is taken to be the hyperbolic tangent.
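The interpretation of the fixed chain weights $w_{1,i} = 1 - i/N$ can be verified with a few lines of arithmetic. The sketch below is illustrative and rests on one labeled assumption: the bonds of the chain are oriented as $\mathbf{r}_j = \mathbf{q}_j - \mathbf{q}_{j+1}$, with bead 1 as the free end. The bead positions are random placeholders rather than simulation data. Under that convention $\sum_i w_{1,i}\mathbf{r}_i$ equals the vector from the center of mass to bead 1, and the first encoder of Eq. (4.6) with $g_1 \equiv 1$ is the outer product of that vector with itself.

```python
import random

def chain_bond_vectors(q):
    """Bond vectors r_j = q_j - q_{j+1} of a linear chain (assumed orientation)."""
    return [[a - b for a, b in zip(q[j], q[j + 1])] for j in range(len(q) - 1)]

def weighted_sum(weights, vectors):
    """f = sum_j w_j r_j, returned as a 3-vector."""
    return [sum(w * v[a] for w, v in zip(weights, vectors)) for a in range(3)]

def outer(f):
    """Second-order tensor f f^T as a 3x3 nested list."""
    return [[f[a] * f[b] for b in range(3)] for a in range(3)]

random.seed(0)
N = 7
# Placeholder bead positions; any configuration works for this identity.
q = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(N)]
r = chain_bond_vectors(q)

w1 = [1.0 - i / N for i in range(1, N)]  # w_{1,i} = 1 - i/N, i = 1..N-1
f1 = weighted_sum(w1, r)

center_of_mass = [sum(p[a] for p in q) / N for a in range(3)]
end_to_com = [q[0][a] - center_of_mass[a] for a in range(3)]

b1 = outer(f1)  # encoder b_1 = f_1 f_1^T with g_1 = 1
```

Here `f1` and `end_to_com` agree to round-off. With the opposite bond orientation the same weights give the negative of this vector, which leaves `b1` unchanged. For the star- and net-shaped choice $w_{1,1} = 1$, $w_{1,i} = 0$ otherwise, `weighted_sum` simply returns the first bond vector.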
We emphasize that the mappings $\{g_j(\mathbf{r}^*)\}_{j=1}^n$ and the weights $w \in \mathbb{R}^{n \times (N-1)}$ are involved in the training process for the joint learning of the encoders $\{\mathbf{b}_j(\mathbf{r})\}_{j=1}^n$ defined in Eq. (4.6) and the macro-scale features $\{\mathbf{c}_j\}_{j=1}^n$, although they do not appear explicitly in the macro-scale hydrodynamic equations. The DNNs are trained by the Adam stochastic gradient descent method (Kingma and Ba, 2015) for 20 epochs with a batch size of 5 samples. The initial learning rate is $2.8 \times 10^{-4}$ and the decay rate is 0.75 per 20000 steps. Similar to Ref. (Lei et al., 2020), the loss function is defined by $L = \lambda_G L_G + \lambda_{H_1} L_{H_1} + \lambda_{H_2} L_{H_2} + \lambda_{\mathcal{E}} L_{\mathcal{E}}$, where $\lambda_G = 0.2$, $\lambda_{H_1} = 0.1$, $\lambda_{H_2} = 0.6$ and $\lambda_{\mathcal{E}} = 0.1$ are hyperparameters. For each training batch of $m$ training samples, $L_G$, $L_{H_1}$, $L_{H_2}$ and $L_{\mathcal{E}}$ are given by

$$
\begin{aligned}
L_G &= \sum_{l=1}^{m} \sum_{i=1}^{n} \left\| \mathbf{G}_i(\mathbf{c}^{(l)}) - \left\langle \sum_{k=1}^{N_b} \mathbf{r}_k \otimes \nabla_{\mathbf{r}_k} V \right\rangle^{(l)} \right\|^2, \\
L_{H_1} &= \sum_{l=1}^{m} \sum_{i=1}^{n} \left\| \mathbf{H}_{1,i}(\mathbf{c}^{(l)}) - \left\langle \sum_{j,k=1}^{N-1} A_{jk} \nabla_{\mathbf{r}_j} \cdot \nabla_{\mathbf{r}_k} \mathbf{b}_i \right\rangle^{(l)} \right\|^2, \\
L_{H_2} &= \sum_{l=1}^{m} \sum_{i=1}^{n} \left\| \mathbf{H}_{2,i}(\mathbf{c}^{(l)}) - \left\langle \sum_{j=1}^{N-1} \sum_{k=1}^{N_b} A_{jk} \nabla_{\mathbf{r}_k} V \cdot \nabla_{\mathbf{r}_j} \mathbf{b}_i \right\rangle^{(l)} \right\|^2, \\
L_{\mathcal{E}} &= \sum_{l=1}^{m} \sum_{i=1}^{n} \left\| \sum_{s=1}^{9} \mathcal{E}^{(s)}_{1,i}(\mathbf{c}^{(l)}) \otimes \mathcal{E}^{(s)}_{2,i}(\mathbf{c}^{(l)}) - \left\langle \sum_{j,j'=1}^{N-1} w_{ij} w_{ij'} \mathbf{r}_j \mathbf{r}_{j'}^T \otimes \sum_{k=1}^{N-1} \mathbf{r}_k \otimes \nabla_{\mathbf{r}_k} g_i^2 \right\rangle^{(l)} \right\|^2,
\end{aligned}
\tag{4.37}
$$

where $\|\cdot\|^2$ denotes the total sum of squares of the entries in the tensor, and $\mathbf{c}^{(l)} = (\mathbf{c}^{(l)}_1, \cdots, \mathbf{c}^{(l)}_n)$.

4.3.4 Numerical Result of Reverse Poiseuille flow

First, we consider the reverse Poiseuille flow in a $60 \times 100 \times 60$ domain (in reduced units) with the opposite body force $\mathbf{f}_{\mathrm{ext}} = (0.016, 0, 0)$ applied to each half of the domain divided by the plane $y = 50$, starting from $t = 0$. At $t = 800$, the external force is removed. The relaxation process of the flow field is recorded until the total simulation time $t = 1600$. For all three systems, the predictions from DeePN2 agree well with the micro-scale simulation results, as shown in Fig. 4.2. In particular, the flow velocity fields of the three systems are nearly identical at the initial stage $t \in [0, 200]$, as the development of the flow field is dominated by the solvent and the near-equilibrium responses of the polymer molecules in this regime. Starting from $t = 250$, the velocity fields of the three systems exhibit distinct evolution processes. The velocity of the chain-shaped molecule suspension exhibits the largest oscillation and the longest development stage during $t \in [250, 800]$. In contrast, the velocity of the star-shaped molecule suspension exhibits moderate oscillation and shows an apparent increase during $t \in [400, 800]$, indicating that the polymer elastic energy reaches a plateau earlier than in the chain-shaped system. Moreover, the velocity of the net-shaped molecule suspension exhibits the smallest oscillation, indicating that the three additional side-chains further affect the rheological properties of the polymer suspension.

Such differences can also be studied by examining the polymer stress development. As shown in Fig. 4.2, the value of $\tau_{\mathrm{p},xx}$ for the chain-shaped molecule suspension keeps increasing through the development stage $t \in [0, 800]$, while for the star-shaped molecule, $\tau_{\mathrm{p},xx}$ shows only a moderate increase. In contrast, the net-shaped molecule suspension reaches steady state at about $t = 400$. Moreover, the steady value of the shear stress $\tau_{\mathrm{p},xy}$ of the chain-shaped molecule is also larger than those of the star-shaped and the net-shaped molecules, indicating the largest restored elastic energy. This result is also consistent with the larger velocity oscillation from the minimal values to 0 during the relaxation process with $t \in [800, 1000]$.

Figure 4.2 The velocity $u_x$ (left) and polymer stress $\boldsymbol{\tau}_{\mathrm{p}}$ (right) of the reverse Poiseuille flow ($y = 6$) of the polymer suspensions of the three different molecule structures shown in Fig. 4.1. $\boldsymbol{\tau}_{\mathrm{p}}$ is normalized by the polymer number density $n_{\mathrm{p}}$, i.e., it is the stress energy per polymer (the same for the remaining figures). With the same FENE bond, the polymer suspensions exhibit different flow responses due to the different molecule structural mechanics. The dark blue lines with rough oscillations denote the micro-scale simulation results; the solid lines with symbols denote the DeePN2 predictions.

The different rheological properties of the three polymer suspensions can be understood as follows. Although both the chain-shaped and star-shaped molecules have 6 identical FENE bonds, the chain-shaped molecule is less symmetric than the star-shaped molecule. Accordingly, it shows larger dispersion in the $\mathbb{R}^{18}$ configuration space and hence is more flexible than the star-shaped molecule. The elastic response time of the chain-shaped molecule suspension is longer than that of the star-shaped molecule suspension; larger elastic energy can be restored during the relaxation stage. On the other hand, the net-shaped molecule is more rigid than the star-shaped molecule due to the additional bond interaction.

Another important feature of non-Newtonian fluids is the hysteresis effect.
Classical models such as Hookean and FENE-P cannot capture such effects (Doyle et al., 1998; Lielens et al., 1998). Fig. 4.3 shows the evolution of the polymer stress and conformation tensor for the chain- and star-shaped molecule suspensions. The clockwise loops show the hysteresis effects during the development and relaxation processes; the non-unique stress values indicate that linear and mean-field approximations are insufficient to describe the viscoelastic response of the system. In contrast, these effects are accurately captured by the DeePN2 model. Consistent with Fig. 4.2, the chain-shaped molecule suspension shows a more pronounced hysteresis effect due to the larger dispersion in the configuration space, reflected in a larger “loop area” than that of the star-shaped molecule suspension.

Figure 4.3 The evolution of the polymer stress $\boldsymbol{\tau}_{\mathrm{p}}$ and conformation tensor $\mathbf{c}_1$ obtained from the reverse Poiseuille flow ($y = 6$) of the polymer suspensions. The clockwise loops represent the development and relaxation processes. For visualization, the conformation tensor component $c_{1,xx}$ is rescaled by the maximum value obtained from the micro-scale simulation.

4.3.5 Numerical Result of Womersley flow

Next, we investigate the Womersley flow (Womersley, 1955) by applying the opposite oscillating body force $\mathbf{f}_{\mathrm{ext}} = (\pm f_0 \cos(2\pi\omega t), 0, 0)$ to each half of the domain along the z-direction, where we set $f_0 = 0.012$ and $\omega = 1/3000$. Fig. 4.4 shows the velocity development of the star- and net-shaped molecule suspensions. Similar to the reverse Poiseuille flow, the net-shaped molecule suspension shows less pronounced viscoelastic responses, reflected in the slower decay near $t \in [200, 400]$ and the larger oscillation due to the smaller elastic energy storage. For comparison, we also show the prediction from the conventional FENE-P model.
The parameters are chosen to match the dynamics of the orientation tensor (the vector between two free-end particles) near equilibrium. As expected, the FENE-P model shows limitations in predicting the flow responses of the two suspensions.

Figure 4.4 The oscillating Womersley flow of the star- and net-shaped molecule suspensions predicted from the micro-scale simulation, DeePN2 and the FENE-P model. The FENE-P model parameters are chosen to match the dynamics of the orientation tensor (the vector between two free-end particles) near equilibrium. Left: the velocity evolution $u_x(y, t)$ at $y = 6$. Right: the velocity profile $u_x(y, t)$ at $t = 6450$.

The distinct viscoelastic responses of the different suspensions can be further elucidated by examining the elongation flow. We impose the traceless flow gradient $\nabla \mathbf{u} = \mathrm{diag}(\dot{\epsilon}, -\dot{\epsilon}, 0)$, where the strain rate $\dot{\epsilon}$ is set to $4 \times 10^{-4}$. Fig. 4.5 shows the stress development of the chain- and star-shaped molecule suspensions. In the micro-scale simulations, the flow is imposed through the generalized uniaxial extension flow boundary conditions (Nicholson and Rutledge, 2016; Murashima et al., 2018). Compared with the shear flow, the elongation flow yields larger extension and longer development processes, as was shown in experimental studies (Smith et al., 1999); the steady state is achieved at about $t = 2.5 \times 10^3$ and $t = 10^4$ for the star- and chain-shaped molecule suspensions, respectively. Moreover, the steady stress value $\tau_{\mathrm{p},xx}$ of the chain-shaped molecule suspension is much larger than that of the star-shaped molecule suspension. Such differences are also due to the larger flexibility of the chain-shaped molecule, which produces a stronger extension under external flow. DeePN2 successfully captures the different responses and shows good agreement with the micro-scale simulations for both cases.
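To put the two steady-state times in perspective, the short sketch below evaluates the accumulated strain $\dot{\epsilon} t$ at the reported times and confirms that the imposed velocity gradient is traceless. Reading $\dot{\epsilon} t$ as the accumulated (Hencky) strain of a fluid element is a standard interpretation for a constant strain rate and is an assumption added here; the variable names are illustrative only.

```python
import math

# Values quoted in the text (reduced units).
strain_rate = 4e-4
t_steady = {"star": 2.5e3, "chain": 1.0e4}

# Diagonal of the imposed velocity gradient diag(eps_dot, -eps_dot, 0).
grad_u_diag = [strain_rate, -strain_rate, 0.0]
trace = sum(grad_u_diag)  # zero for an incompressible (traceless) flow

# Accumulated strain eps = eps_dot * t and the corresponding stretch ratio
# exp(eps) of a fluid element along the extension direction.
strain = {name: strain_rate * t for name, t in t_steady.items()}
stretch = {name: math.exp(eps) for name, eps in strain.items()}

print("trace of grad u:", trace)
for name in t_steady:
    print(f"{name}: strain = {strain[name]:.2f}, stretch ratio = {stretch[name]:.1f}")
```

Under this reading, the star-shaped suspension reaches its steady state after an accumulated strain of about 1, whereas the chain-shaped suspension requires an accumulated strain of about 4.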
Figure 4.5 The elongation flow of the chain- and star-shaped molecule suspensions predicted from the micro-scale simulation and DeePN2. With the same bond potential and strain rate, the chain-shaped molecule suspension yields larger elongation stress. The lines with rough oscillations denote the micro-scale simulation results; the solid lines with symbols denote the DeePN2 predictions.

Finally, we consider the Taylor-Green vortex flow (Taylor, 1934; Thomases and Shelley, 2007) in a $100 \times 100 \times 160$ domain (in reduced units) of the micro-scale simulation. The external force $\mathbf{f}_{\mathrm{ext}} = (f_x, f_y, 0)$ is applied to the domain following

$$
f_x(x, y) = -2 f_0 \sin\left(\frac{2\pi x}{L}\right) \cos\left(\frac{2\pi y}{L}\right), \qquad
f_y(x, y) = 2 f_0 \cos\left(\frac{2\pi x}{L}\right) \sin\left(\frac{2\pi y}{L}\right),
$$

where $L = 100$ and $f_0 = 6 \times 10^{-3}$. Periodic boundary conditions are imposed along all three directions. The force field imposes an elongation on the flow field along the x-direction and a compression along the y-direction. The flow near the center $(L/2, L/2)$ resembles the planar elongation flow. Four vortices appear at $(L/2 \pm L/4, L/2 \pm L/4)$. Fig. 4.6(a-b) shows the steady-state velocity field. Compared with the star-shaped molecule suspension, the velocity field of the chain-shaped molecule suspension shows a larger deviation from the symmetric structure of the Newtonian flow (i.e., $\propto [-\sin(2\pi x/L)\cos(2\pi y/L), \cos(2\pi x/L)\sin(2\pi y/L)]$) due to the larger polymer stress across the flow regime. Furthermore, the two suspensions yield different velocity magnitudes, as shown in Fig. 4.6(c). Fig. 4.6(d) shows the velocity development at (75, 49). The velocities of both suspensions achieve a similar maximum value near $t = 30$ and decay along with the polymer stress development.
However, the star-shaped molecule suspension reaches the steady state much earlier and with a larger velocity than the chain-shaped molecule suspension. Fig. 4.7(a-b) shows the steady-state stress field for the two suspensions. We see that the chain-shaped molecule suspension exhibits larger polymer stress variation along the elongation and contraction directions, reflected in the larger loop area in Fig. 4.7(b). Such difference is also consistent with the more pronounced asymmetric velocity field shown in Fig. 4.6(a-b).

Figure 4.6 The velocity field of the Taylor-Green vortex flow of the chain- and star-shaped molecule suspensions predicted from the micro-scale simulations and DeePN2. (a-b) The 2D steady-state velocity field of the chain- and star-shaped molecule suspensions from the micro-scale simulations. The velocity field of the chain-shaped system yields more pronounced deviations from the symmetric Newtonian flow due to the more pronounced polymer stress across the flow regime. (c) The steady-state 1D velocity profile $u_x(x, y = 49)$. The solid and dashed lines represent the predictions from the micro-scale simulations and the DeePN2 model, respectively. (d) The time history of $u_x(x = 75, y = 49)$.

In addition, we also examine the transient states where the flow undergoes intricate and heterogeneous processes. Fig. 4.7(c) shows the stress development at point (49, 35), where $\tau_{\mathrm{p},xx}$ and $\tau_{\mathrm{p},yy}$ cross over during the evolution. During the initial stage, $\tau_{\mathrm{p},yy}$ increases along with the flow development towards the stagnation point. At $t > 150$, $\tau_{\mathrm{p},yy}$ decreases due to the compression along the y-direction. Meanwhile, $\tau_{\mathrm{p},xx}$ increases and achieves a steady state slightly larger than $\tau_{\mathrm{p},yy}$ for the star-shaped solution. On the other hand, the chain-shaped solution ends up with a significantly larger value of $\tau_{\mathrm{p},xx}$ due to the larger molecule flexibility and further extension along the x-direction.
The different viscoelastic responses are also reflected in the stress development at point (49, 49). As shown in Fig. 4.7(d), the chain-shaped solution exhibits a longer evolution of $\tau_{\mathrm{p},xx}$ and a larger steady value than the star-shaped solution. DeePN2 successfully captures such micro-structure-induced rheological differences and shows good agreement with the micro-scale simulation results.

Figure 4.7 The stress field of the Taylor-Green vortex flow of the chain- and star-shaped molecule suspensions predicted from the micro-scale simulations and DeePN2. (a) The 2D steady-state stress field of the chain-shaped molecule suspension from the micro-scale simulations. (b) The 1D steady-state stress profiles $\tau_{\mathrm{p},xx}(x, y = 49)$ and $\tau_{\mathrm{p},xx}(x = 49, y)$. The chain-shaped molecule suspension yields larger stress variations (i.e., the “loop area”) along the flow domain. (c-d) The stress evolution of $\tau_{\mathrm{p},xx}(t)$ and $\tau_{\mathrm{p},yy}(t)$ at the points (49, 35) and (49, 49), respectively. The dashed and the solid lines denote the micro-scale simulations and the DeePN2 predictions, respectively.

4.3.6 Validation of the rotational-symmetry preserving NN representation

To validate the performance of the proposed DNN representation, we check the accuracy of the modeling terms given a set of conformation tensors $\mathbf{c}_1, \cdots, \mathbf{c}_n$ under different unitary transformations. Fig. 4.8 shows the relative error under each transformation. The DNN representation (4.18) yields the same results under all the transformations. In contrast, the DNN without accounting for the four transformations yields significant error due to the non-uniqueness of the eigenvectors of $\mathbf{c}_1$.
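The non-uniqueness mentioned above can be illustrated with a small sketch. It assumes that the four transformations are the sign flips of the eigenvectors that keep the eigen-frame right-handed, and it restores invariance by simply averaging a frame-dependent function over these four frames. This averaging is an illustrative stand-in and not necessarily the construction used in Eq. (4.18); `naive_model`, `symmetric_model` and the test frame are hypothetical.

```python
import math
from itertools import product


def outer(a, b):
    """Outer product of two 3-vectors as a nested list."""
    return [[x * y for y in b] for x in a]


def add(A, B, scale=1.0):
    """Return A + scale * B for 3x3 nested lists."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def naive_model(frame):
    """Toy frame-dependent output: u1 (x) u1 + 0.5 * u1 (x) u2."""
    u1, u2, _ = frame
    return add(outer(u1, u1), outer(u1, u2), scale=0.5)


# Sign choices (s1, s2, s3) with s1*s2*s3 = +1 keep the frame right-handed.
PROPER_FLIPS = [s for s in product((1, -1), repeat=3) if s[0] * s[1] * s[2] == 1]


def flip(frame, signs):
    """Flip the sign of each eigenvector; the tensor sum_k lambda_k u_k u_k^T is unchanged."""
    return [[s * x for x in u] for s, u in zip(signs, frame)]


def symmetric_model(frame):
    """Average the toy output over the four equivalent eigen-frames."""
    total = [[0.0] * 3 for _ in range(3)]
    for signs in PROPER_FLIPS:
        total = add(total, naive_model(flip(frame, signs)), scale=0.25)
    return total


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


theta = 0.3  # arbitrary rotation angle defining a right-handed test frame
frame = [
    [math.cos(theta), math.sin(theta), 0.0],
    [-math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
]

naive_spread = max(
    max_abs_diff(naive_model(frame), naive_model(flip(frame, s))) for s in PROPER_FLIPS
)
sym_spread = max(
    max_abs_diff(symmetric_model(frame), symmetric_model(flip(frame, s)))
    for s in PROPER_FLIPS
)
print(len(PROPER_FLIPS), naive_spread, sym_spread)
```

In this toy setting the naive output changes when an equivalent eigen-frame is supplied, while the averaged output does not, which mirrors the qualitative behavior reported in Fig. 4.8.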
Figure 4.8 The relative $l_\infty$ error of the model prediction under randomly chosen orthogonal transformations without (left) and with (right) accounting for the four eigen-space transformations in Eq. (4.18).

In addition, we examine the 2D Taylor-Green vortex flow where the evolution of $\mathbf{c}_1$ becomes degenerate at certain points. Fig. 4.9 shows the stress evolution at (45, 37). At $t = 1080$, the eigenvalues $\lambda_2$ and $\lambda_3$ cross over. Concurrently, the prediction of the polymer stress $\boldsymbol{\tau}_{\mathrm{p}}$ from the model without considering the swap of $\mathbf{u}_2$ and $\mathbf{u}_3$ shows apparent deviations near this regime, as shown in Fig. 4.9. In contrast, the prediction from the model retaining the eigenvalue permutation trained by Eq. (4.20) shows good agreement with the MD results.

Figure 4.9 Stress evolution of the Taylor-Green vortex flow at position (45, 37) of the chain-shaped molecule suspension. Left: prediction without considering the swap of eigenvectors when the two eigenvalues approach each other near $t = 1255$, as shown in the inset plot. Right: predictions from the model retaining the eigenvalue permutation trained by Eq. (4.20). The dashed and the solid lines denote the micro-scale simulations and the DeePN2 predictions, respectively.

4.4 Discussion

We have developed a general machine-learning based model, DeePN2, for describing the non-Newtonian hydrodynamics for polymer solutions with arbitrary molecular structure and interaction.
The constructed model retains a clear physical interpretation and faithfully encodes the micro-scale structural information into the macro-scale hydrodynamics, where conventional models based on empirical closures generally show limitations. In particular, for the chain- and star-shaped molecule suspensions with the same bead number and bond interaction, DeePN2 successfully captures the different viscoelastic responses arising from the different molecular structural symmetry (i.e., the effective rigidity) in the configuration space without additional human intervention. Unlike the direct evaluation or moment-closure representations of the configurational PDF, the present DeePN2 model directly learns a set of micro-to-macro mappings to probe the optimal approximations of the constitutive dynamics in terms of the macro-scale features, thereby circumventing the numerical challenges due to the high dimensionality of the polymer configuration space. This multi-scale nature enables us to learn the constitutive dynamics of the macro-scale features directly from the kinetic equations of their micro-scale counterparts using only discrete configuration samples rather than the time-derivative samples commonly used in machine-learning-based models of complex dynamic problems.

One thing we have not investigated systematically is the generation of training samples. For DeePN2 to be truly reliable, the training samples should be representative of all the practical situations that one might encounter. However, due to the cost associated with generating such training samples, we would also like the training set to be as small as possible. This calls for an adaptive procedure for generating the training samples, such as the concurrent learning procedure discussed in (E et al., 2021). The present DeePN2 models are trained with samples collected from homogeneous shear flow.
Even though the numerical predictions show good agreement with micro-scale simulations for a variety of flows, one should not expect this to be generally the case. Further work on sampling is needed to ensure that one can produce truly reliable DeePN2 models.

Furthermore, instead of the general form (4.6), a specific design of the encoders $\mathbf{b}(\cdot)$ accounting for the molecule symmetry and rigidity may facilitate the extraction of the macro-scale features $\mathbf{c}$. In addition, more accurate micro-scale kinetic models accounting for the heterogeneous hydrodynamic interactions and non-Markovianity (Lei et al., 2016; Lei and Li, 2021) can be used to construct the macro-scale constitutive dynamics. Finally, the adaptive choice of the number of features and the enhanced sampling of the discrete micro-scale configurations may further improve the performance of the DeePN2 model. We leave these issues for future work.

CHAPTER 5 CONCLUSION AND OUTLOOK

This dissertation presents advancements in multi-scale modeling by utilizing machine learning to construct reliable and structure-preserving reduced models with interpretable micro-to-meso and micro-to-macro mappings. We explored two main directions in reduced modeling: (1) a micro-to-meso data-driven stochastic reduced model, and (2) a micro-to-macro deep-learning-based non-Newtonian hydrodynamic model.

The first part, on micro-to-meso modeling, developed a data-driven method to learn stochastic reduced models of complex systems that retain a state-dependent memory beyond the standard GLE with a homogeneous kernel. The main idea is to seek a generalized representation of the state-dependent memory which satisfies the second fluctuation-dissipation theorem with a coherent colored noise. To parameterize the representation, traditional two-point correlation functions are insufficient for extracting state-dependent information; we instead use state-dependent three-point correlation functions.
Efficient training is achieved by constructing the state-dependent encoders using a set of sparse bases, whose correlations can be efficiently precomputed. The convolution term in the simulation can be evaluated efficiently using a fast convolution algorithm. Numerical results demonstrate the limitation of the standard GLE and the essential role of the broadly overlooked state-dependent nature of the memory in predicting molecule kinetics related to conformation relaxation and transition.

The second part, on micro-to-macro modeling, developed a deep learning-based non-Newtonian hydrodynamic model, which focused on learning accurate non-Newtonian hydrodynamic models from micro-scale polymer kinetics. The model aims to close the constitutive form of the hydrodynamic equations, conventionally chosen empirically, with micro-scale fidelity. First, we establish a micro-macro correspondence via a set of encoders for the micro-scale polymer configurations and their macro-scale counterparts, a set of conformation tensors. Instead of directly approximating the moments of the distribution, we use neural networks to construct the nonlinear encoders. Then the dynamics (including the generalized objective tensor derivative) of the conformation tensors can be directly derived from the micro-scale model by the Fokker-Planck equation, where the molecular structural mechanics are automatically included. Here the dynamics depend only on instantaneous configuration samples rather than time series. Finally, we use symmetry-preserving neural networks to parameterize the dynamics based on the conformation tensors. Training the model only on samples collected from homogeneous shear flow, we demonstrate its accuracy and generalization ability on various flows by comparison with the results of the micro-scale simulations.

Despite these advancements, several challenges remain open for future research.
5.1 High Dimensional Stochastic Reduced Model

Although our work (Ge et al., 2024) shows highly promising results, it only works for 1D reduced systems due to the curse of dimensionality. The main challenge arises from the state-dependent features, which depend directly on the dimension. Traditional methods are limited for such situations. However, neural networks provide a powerful tool to construct high-dimensional functions efficiently. By carefully designing the training process and the formulation of the state-dependent features, we can develop a practical high-dimensional stochastic reduced model. Furthermore, the high-dimensional stochastic reduced model can also help build a more accurate DeePN2 model that accounts for state-dependent memory.

5.2 Variational-informed Structure-preserving Macro-scale Reduced Model

Structure preservation is important in multi-scale modeling, especially for constructing stable and reliable reduced models. Some existing methods, such as deep learning-based coarse-grained molecular dynamics (Zhang et al., 2018b), follow some physical constraints. One way to impose energy stability is to pre-build the constitutive dynamics with a generalized extendable energy functional structure. One possible formulation is the GENERIC formalism (Grmela and Öttinger, 1997), governed by the coupling of a reversible and an irreversible process:

$$
\frac{\mathrm{d}\mathbf{X}}{\mathrm{d}t} = \mathbf{L} \frac{\delta E}{\delta \mathbf{X}} + \mathbf{M} \frac{\delta S}{\delta \mathbf{X}}, \tag{5.1}
$$

$$
\mathbf{L} \frac{\delta S}{\delta \mathbf{X}} = \mathbf{M} \frac{\delta E}{\delta \mathbf{X}} \equiv 0, \tag{5.2}
$$

where $\mathbf{X}: \Omega \to \mathbb{R}^d$ denotes the field variables, $E[\mathbf{X}] = \int_\Omega \hat{E}(\mathbf{X})\,\mathrm{d}\Omega$ is the energy, and $S[\mathbf{X}] = \int_\Omega \hat{S}(\mathbf{X})\,\mathrm{d}\Omega$ is the entropy. $\mathbf{L}$ is the Poisson matrix satisfying $\mathbf{L} = -\mathbf{L}^T$, and $\mathbf{M}$ is the friction matrix satisfying $\mathbf{M} \succeq 0$. The degeneracy condition Eq. (5.2) ensures the energy conservation $\mathrm{d}E/\mathrm{d}t \equiv 0$ and the entropy production $\mathrm{d}S/\mathrm{d}t \ge 0$, and therefore the free energy stability. Our main idea is to seek a generalized extendable energy variational form of our DeePN2 model where $\mathbf{X} = (\rho, \mathbf{u}, \mathbf{c}_1, \cdots, \mathbf{c}_n)$; here $\rho$ is the density, $\mathbf{u}$ is the velocity, and $\{\mathbf{c}_i\}_{i=1}^n$ are the conformation tensors.
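For completeness, the energy conservation and entropy production implied by the degeneracy condition can be verified in two lines. The derivation below assumes that $\langle \cdot, \cdot \rangle$ denotes the $L^2$ inner product on $\Omega$ (a notation introduced here only for illustration), that boundary terms vanish, and that the friction matrix is symmetric and non-negative, as in the standard GENERIC setting.

```latex
\begin{aligned}
\frac{\mathrm{d}E}{\mathrm{d}t}
  &= \left\langle \frac{\delta E}{\delta \mathbf{X}}, \frac{\mathrm{d}\mathbf{X}}{\mathrm{d}t} \right\rangle
   = \underbrace{\left\langle \frac{\delta E}{\delta \mathbf{X}}, \mathbf{L}\frac{\delta E}{\delta \mathbf{X}} \right\rangle}_{=\,0 \text{ since } \mathbf{L} = -\mathbf{L}^T}
   + \underbrace{\left\langle \mathbf{M}\frac{\delta E}{\delta \mathbf{X}}, \frac{\delta S}{\delta \mathbf{X}} \right\rangle}_{=\,0 \text{ by the degeneracy condition}}
   = 0, \\
\frac{\mathrm{d}S}{\mathrm{d}t}
  &= \left\langle \frac{\delta S}{\delta \mathbf{X}}, \frac{\mathrm{d}\mathbf{X}}{\mathrm{d}t} \right\rangle
   = \underbrace{-\left\langle \mathbf{L}\frac{\delta S}{\delta \mathbf{X}}, \frac{\delta E}{\delta \mathbf{X}} \right\rangle}_{=\,0 \text{ by the degeneracy condition}}
   + \underbrace{\left\langle \frac{\delta S}{\delta \mathbf{X}}, \mathbf{M}\frac{\delta S}{\delta \mathbf{X}} \right\rangle}_{\ge\, 0 \text{ since } \mathbf{M} \text{ is non-negative}}
   \ge 0.
\end{aligned}
```

Any constitutive model cast in this form therefore inherits both properties by construction, independently of how the functionals $E$ and $S$ are parameterized.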
With such a variational form, existing numerical schemes such as the scalar auxiliary variable approach (Shen et al., 2018) and the invariant energy quadratization method (Yang, 2016) can be used to ensure energy stability.

BIBLIOGRAPHY

Abrams, J. B. and Tuckerman, M. E. (2008). Efficient and direct generation of multidimensional free energy surfaces via adiabatic dynamics without coordinate transformations. The Journal of Physical Chemistry B, 112(49):15742–15757.
Allen, E. C. and Rutledge, G. C. (2008). A novel algorithm for creating coarse-grained, density dependent implicit solvent models. The Journal of Chemical Physics, 128(15):154115.
Allen, E. C. and Rutledge, G. C. (2009). Evaluating the transferability of coarse-grained, density-dependent implicit solvent models to mixtures and chains. The Journal of Chemical Physics, 130(3):034904.
Ayaz, C., Scalfi, L., Dalton, B. A., and Netz, R. R. (2022a). Generalized langevin equation with a non-linear potential of mean force and non-linear memory friction from a hybrid projection scheme. Physical Review E, 105:054138.
Ayaz, C., Tepper, L., Brünig, F. N., Kappler, J., Daldrop, J. O., and Netz, R. R. (2021). Non-markovian modeling of protein folding. Proceedings of the National Academy of Sciences, 118(31):e2023856118.
Ayaz, C., Tepper, L., and Netz, R. R. (2022b). Markovian embedding of generalized langevin equations with a nonlinear friction kernel and configuration-dependent mass. Turkish Journal of Physics, 46(6):194–205.
Baczewski, A. D. and Bond, S. D. (2013). Numerical integration of the extended variable generalized Langevin equation with a positive Prony representable memory kernel. The Journal of chemical physics, 139(4):044107.
Bayly, C. I., Cieplak, P., Cornell, W., and Kollman, P. A. (1993). A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: the resp model. The Journal of Physical Chemistry, 97(40):10269–10280.
Berezhkovskii, A. and Szabo, A. (2011).
Time scale separation leads to position-dependent diffusion along a slow coordinate. The Journal of Chemical Physics, 135(7):074108.
Berkowitz, M., Morgan, J., and McCammon, J. A. (1983). Generalized Langevin dynamics simulations with arbitrary time-dependent memory kernels. J. Chem. Phys., 78:3256.
Berne, B. J., Weeks, J. D., and Zhou, R. (2009). Dewetting and hydrophobic interaction in physical and biological systems. Annual Review of Physical Chemistry, 60(1):85–103.
Berressem, F., Scherer, C., Andrienko, D., and Nikoubashman, A. (2021). Ultra-coarse-graining of homopolymers in inhomogeneous systems. Journal of Physics: Condensed Matter, 33(25):254002.
Best, R. B. and Hummer, G. (2006). Diffusive model of protein folding dynamics with kramers turnover in rate. Phys. Rev. Lett., 96:228104.
Best, R. B. and Hummer, G. (2010). Coordinate-dependent diffusion in protein folding. Proceedings of the National Academy of Sciences, 107(3):1088–1093.
Bird, R. B., Curtiss, C. F., Armstrong, R. C., and Hassager, O. (1987). Dynamics of Polymeric Liquids, Volume 2: Kinetic Theory, 2nd Edition. Wiley, 2nd edition.
Bird, R. B., Dotson, P. J., and Johnson, N. (1980). Polymer solution rheology based on a finitely extensible bead-spring chain model. Journal of Non-Newtonian Fluid Mechanics, 7(2):213–235.
Buff, F. P., Lovett, R. A., and Stillinger, F. H. (1965). Interfacial density profile for fluids in the critical region. Phys. Rev. Lett., 15:621–623.
Carmeli, B. and Nitzan, A. (1983). Theory of activated rate processes: Position dependent friction. Chemical Physics Letters, 102(6):517–522.
Ceriotti, M., Bussi, G., and Parrinello, M. (2009). Langevin equation with colored noise for constant-temperature molecular dynamics simulations. Physical review letters, 102(2):020601.
Chan, H., Cherukara, M. J., Narayanan, B., Loeffler, T. D., Benmore, C., Gray, S. K., and Sankaranarayanan, S. K. R. S. (2019). Machine learning coarse grained models for water.
Nature Communications, 10(1):379.
Chandler, D. (2005). Interfaces and the driving force of hydrophobic assembly. Nature, 437(7059):640–647.
Cooley, J. W. and Tukey, J. W. (1965). An algorithm for the machine calculation of complex fourier series. Mathematics of Computation, 19(90):297–301.
Cossio, P., Hummer, G., and Szabo, A. (2015). On artifacts in single-molecule force spectroscopy. Proceedings of the National Academy of Sciences, 112(46):14248–14253.
Daldrop, J. O., Kowalik, B. G., and Netz, R. R. (2017). External potential modifies friction of molecular solutes in water. Physical Review X, 7(4):041065.
Dalton, B. A., Ayaz, C., Kiefer, H., Klimek, A., Tepper, L., and Netz, R. R. (2023). Fast protein folding is governed by memory-dependent friction. Proceedings of the National Academy of Sciences, 120(31):e2220068120.
Darve, E. and Pohorille, A. (2001). Calculating free energies using average force. The Journal of Chemical Physics, 115(20):9169–9183.
Darve, E., Solomon, J., and Kia, A. (2009). Computing generalized Langevin equations and generalized fokker-planck equations. Proc. Natl. Acad. Sci., 106(27):10884–10889.
Das, A. and Andersen, H. C. (2010). The multiscale coarse-graining method. v. isothermal-isobaric ensemble. The Journal of Chemical Physics, 132(16):164106.
Das, A. and Andersen, H. C. (2012). The multiscale coarse-graining method. ix. a general method for construction of three body coarse-grained force fields. The Journal of chemical physics, 136(19):194114.
Davtyan, A., Dama, J. F., Voth, G. A., and Andersen, H. C. (2015). Dynamic force matching: A method for constructing dynamical coarse-grained models with realistic time dependence. J. Chem. Phys., 142.
DeLyser, M. and Noid, W. G. (2020). Bottom-up coarse-grained models for external fields and interfaces. The Journal of Chemical Physics, 153(22):24103.
DeLyser, M. R. and Noid, W. G. (2017). Extending pressure-matching to inhomogeneous systems via local-density potentials.
The Journal of chemical physics, 147(13):134111.
DeLyser, M. R. and Noid, W. G. (2019). Analysis of local density potentials. The Journal of Chemical Physics, 151(22):224106.
DeLyser, M. R. and Noid, W. G. (2022). Coarse-grained models for local density gradients. The Journal of Chemical Physics, 156(3):034106.
Deutch, J. M. and Oppenheim, I. (1971). Molecular theory of brownian motion for several particles. The Journal of Chemical Physics, 54(8):3547–3555.
Dinpajooh, M. and Guenza, M. G. (2017). On the density dependence of the integral equation coarse-graining effective potential. The Journal of Physical Chemistry B, 122(13):3426–3440.
Doyle, P. S., Shaqfeh, E. S., McKinley, G. H., and Spiegelberg, S. H. (1998). Relaxation of dilute polymer solutions following extensional flow. Journal of Non-Newtonian Fluid Mechanics, 76(1):79–110.
Dunn, N. J. H. and Noid, W. G. (2015). Bottom-up coarse-grained models that accurately describe the structure, pressure, and compressibility of molecular liquids. The Journal of Chemical Physics, 143(24):243148.
Dunn, N. J. H. and Noid, W. G. (2016). Bottom-up coarse-grained models with predictive accuracy and transferability for both structural and thermodynamic properties of heptane-toluene mixtures. The Journal of Chemical Physics, 144(20):204124.
E, W., Han, J., and Zhang, L. (2021). Machine-learning-assisted modeling. Physics Today, 74(7):36–41.
E, W., Lei, H., Xie, P., and Zhang, L. (2023). Machine learning-assisted multi-scale modeling. Journal of Mathematical Physics, 64(7):071101.
E, W. and Vanden-Eijnden, E. (2010). Transition-path theory and path-finding algorithms for the study of rare events. Annual Review of Physical Chemistry, 61(1):391–420.
Empereur-mot, C., Capelli, R., Perrone, M., Caruso, C., Doni, G., and Pavan, G. M. (2022). Automatic multi-objective optimization of coarse-grained lipid force fields using swarmcg. The Journal of Chemical Physics, 156(2):024801.
Español, P. and Warren, P. (1995).
Statistical mechanics of dissipative particle dynamics. Europhysics Letters, 30(4):191–196.
Evans, R. (1979). The nature of the liquid-vapour interface and other topics in the statistical mechanics of non-uniform, classical fluids. Advances in Physics, 28(2):143–200.
Fang, L., Ge, P., Zhang, L., E, W., and Lei, H. (2022). DeePN2: A deep learning-based non-newtonian hydrodynamic model. Journal of Machine Learning, 1(1):114–140.
Feng, J., Chaubal, C. V., and Leal, L. G. (1998). Closure approximations for the doi theory: Which to use in simulating complex flows of liquid-crystalline polymers? Journal of Rheology, 42(5):1095–1119.
Forest, G. M., Zhou, R., and Wang, Q. (2003). Full-tensor alignment criteria for sheared nematic polymers. Journal of Rheology, 47(1):105–127.
Galvelis, R. and Sugita, Y. (2017). Neural network and nearest neighbor algorithms for enhancing sampling of molecular dynamics. Journal of Chemical Theory and Computation, 13(6):2489–2500.
Ge, P., Zhang, L., and Lei, H. (2023). Machine learning assisted coarse-grained molecular dynamics modeling of meso-scale interfacial fluids. The Journal of Chemical Physics, 158(6):064104.
Ge, P., Zhang, Z., and Lei, H. (2024). Data-driven learning of the generalized langevin equation with state-dependent memory. Phys. Rev. Lett., 133:077301.
Giesekus, H. (1982). A simple constitutive equation for polymer fluids based on the concept of deformation-dependent tensorial mobility. Journal of Non-Newtonian Fluid Mechanics, 11(1):69–109.
Glatzel, F. and Schilling, T. (2022). The interplay between memory and potentials of mean force: A discussion on the structure of equations of motion for coarse-grained observables. Europhysics Letters, 136(3):36001.
Grmela, M. and Öttinger, H. C. (1997). Dynamics and thermodynamics of complex fluids. i. development of a general formalism. Physical Review E, 56(6):6620.
Grogan, F., Lei, H., Li, X., and Baker, N. A. (2020).
Data-driven molecular modeling with the generalized Langevin equation. J. Comput. Phys., 418:109633–109641. Groot, R. D. and Warren, P. B. (1997). Dissipative particle dynamics: Bridging the gap between atomistic and mesoscopic simulation. Journal of Chemical Physics, 107(11):4423–4435. Grosso, M., Maffettone, P., Halin, P., Keunings, R., and Legat, V. (2000). Flow of nematic polymers in eccentric cylinder geometry: influence of closure approximations. Journal of Non-Newtonian Fluid Mechanics, 94(2):119–134. Han, J., Ma, C., Ma, Z., and E, W. (2019a). Uniformly accurate machine learning-based hydrodynamic models for kinetic equations. Proceedings of the National Academy of Sciences, 116(44):21983– 21991. Han, J., Ma, C., Ma, Z., and E, W. (2019b). Uniformly accurate machine learning-based hydrodynamic models for kinetic equations. Proceedings of the National Academy of Sciences, 116(44):21983– 21991. Hänggi, P. (1997). Generalized langevin equations: A useful tool for the perplexed modeller of nonequilibrium fluctuations? In Schimansky-Geier, L. and Pöschel, T., editors, Stochastic Dynamics, pages 15–22, Berlin, Heidelberg. Springer Berlin Heidelberg. Haynes, G. R., Voth, G. A., and Pollak, E. (1993). A theory for the thermally activated rate constant in systems with spatially dependent friction. Chemical Physics Letters, 207(4):309–316. Haynes, G. R., Voth, G. A., and Pollak, E. (1994). A theory for the activated barrier crossing rate constant in systems influenced by space and time dependent friction. The Journal of Chemical Physics, 101(9):7811–7822. Hijón, C., Español, P., Vanden-Eijnden, E., and Delgado-Buscalioni, R. (2010). Mori–Zwanzig formalism as a practical computational tool. Faraday discuss., 144:301–322. Hinczewski, M., von Hansen, Y., Dzubiella, J., and Netz, R. R. (2010). How the diffusivity profile reduces the arbitrariness of protein folding free energies. The Journal of Chemical Physics, 132(24):245103. Hoogerbrugge, P. J. and Koelman, J. M. V. 
A. (1992). Simulating microscopic hydrodynamic phenomena with dissipative particle dynamics. Europhys. Lett., 19(3):155–160. Hoover, W. G. (1985). Canonical dynamics: equilibrium phase-space distributions. Physical Review A, 31(3):1695–1697. 80 Huang, J., Ma, Z., Zhou, Y., and Yong, W.-A. (2021). Learning thermodynamically stable and galilean invariant partial differential equations for non-equilibrium flows. Journal of Non-Equilibrium Thermodynamics, 46(4):355–370. Hulsen, M., van Heel, A., and van den Brule, B. (1997). Simulation of viscoelastic flows using brownian configuration fields. Journal of Non-Newtonian Fluid Mechanics, 70(1):79 – 101. Hummer, G., Garde, S., García, A. E., Paulaitis, M. E., and Pratt, L. R. (1998). Hydrophobic effects on a molecular scale. The Journal of Physical Chemistry B, 102(51):10469–10482. Hummer, G., Garde, S., García, A. E., Pohorille, A., and Pratt, L. R. (1996). An information theory model of hydrophobic interactions. Proceedings of the National Academy of Sciences, 93(17):8951–8955. Hyon, Y., Du, Q., and Liu, C. (2008). An enhanced macroscopic closure approximation to the micro- macro fene model for polymeric materials. Multiscale Modeling & Simulation, 7(2):978–1002. Izvekov, S., Chung, P. W., and Rice, B. M. (2010). The multiscale coarse-graining method: Assessing its accuracy and introducing density dependent coarse-grain potentials. The Journal of Chemical Physics, 133(6):064109. Izvekov, S. and Voth, G. A. (2005). A multiscale coarse-graining method for biomolecular systems. The Journal of Physical Chemistry B, 109(7):2469–2473. Jin, J. and Voth, G. A. (2018). Ultra-coarse-grained models allow for an accurate and transferable treatment of interfacial systems. Journal of Chemical Theory and Computation, 14(4):2180–2197. John, S. T. and Csányi, G. (2017). Many-body coarse-grained interactions using gaussian approxi- mation potentials. The Journal of Physical Chemistry B, 121(48):10934–10949. Jorgensen, W. 
L., Chandrasekhar, J., Madura, J. D., Impey, R. W., and Klein, M. L. (1983). Comparison of simple potential functions for simulating liquid water. The Journal of Chemical Physics, 79(2):926–935. Jung, B. and Jung, G. (2023). Dynamic coarse-graining of linear and non-linear systems: Mori–Zwanzig formalism and beyond. The Journal of Chemical Physics, 159(8):084110. Jung, G., Hanke, M., and Schmid, F. (2017). Iterative reconstruction of memory kernels. Journal of Chemical Theory and Computation, 13(6):2481–2488. Kingma, D. and Ba, J. (2015). Adam: A method for stochastic optimization. International Conference on Learning Representations (ICLR). Klippenstein, V., Tripathy, M., Jung, G., Schmid, F., and van der Vegt, N. F. (2021). Introduc- ing memory in coarse-grained molecular simulations. The Journal of Physical Chemistry B, 81 125(19):4931–4954. Klippenstein, V. and van der Vegt, N. F. A. (2021). Cross-correlation corrected friction in (generalized) langevin models. The Journal of Chemical Physics, 154(19):191102. Koopman, B. O. (1931). Hamiltonian systems and transformation in Hilbert space. Proceedings of the National Academy of Sciences, 17(5):315–318. Kramers, H. (1940). Brownian motion in a field of force and the diffusion model of chemical reactions. Physica, 7(4):284–304. Krishnan, R., Singh, S., and Robinson, G. W. (1992). Space-dependent friction in the theory of activated rate processes: The hamiltonian approach. The Journal of Chemical Physics, 97(8):5516–5521. Kumar, S., Rosenberg, J. M., Bouzida, D., Swendsen, R. H., and Kollman, P. A. (1992a). The weighted histogram analysis method for free-energy calculations on biomolecules. i. the method. Journal of Computational Chemistry, 13(8):1011–1021. Kumar, S., Rosenberg, J. M., Bouzida, D., Swendsen, R. H., and Kollman, P. A. (1992b). The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. Journal of Computational Chemistry, 13(8):1011–1021. Laio, A. 
and Parrinello, M. (2002). Escaping free-energy minima. Proceedings of the National Academy of Sciences, 99(20):12562–12566. Landau, L. D. and Lifshitz, E. M. (1987). Fluid Mechanics, Second Edition: Volume 6 (Course of Theoretical Physics). Course of theoretical physics / by L. D. Landau and E. M. Lifshitz, Vol. 6. Butterworth-Heinemann, 2 edition. Lange, O. F. and Grubmüller, H. (2006). Collective Langevin dynamics of conformational motions in proteins. J. Chem. Phys., 124:214903. Larini, L., Lu, L., and Voth, G. A. (2010). The multiscale coarse-graining method. vi. implementation of three-body coarse-grained potentials. The Journal of Chemical Physics, 132(16):164107. Larson, R. G. (1988). Constitutive Equations for Polymer Melts and Solutions. Butterworth- Heinemann Press. Laso, M. and Öttinger, H. C. (1993). Calculation of viscoelastic flow using molecular models: the connffessit approach. Journal of Non-Newtonian Fluid Mechanics, 47:1 – 20. Lee, H. S., Ahn, S.-H., and Darve, E. F. (2019). The multi-dimensional generalized Langevin equation for conformational motion of proteins. The Journal of Chemical Physics, 150(17). 82 Lei, H., Baker, N. A., and Li, X. (2016). Data-driven parameterization of the generalized Langevin equation. Proc. Natl. Acad. Sci., 113(50):14183–14188. Lei, H., Caswell, B., and Karniadakis, G. E. (2010). Direct construction of mesoscopic models from microscopic simulations. Phys. Rev. E, 81:026704. Lei, H. and Li, X. (2021). Petrov–Galerkin methods for the construction of non-Markovian dynamics preserving nonlocal statistics. The Journal of Chemical Physics, 154(18):184108. Lei, H., Mundy, C. J., Schenter, G. K., and Voulgarakis, N. K. (2015). Modeling nanoscale hydrodynamics by smoothed dissipative particle dynamics. J. Chem. Phys., 142(19):194504. Lei, H., Wu, L., and E, W. (2020). Machine learning based non-newtonian fluid model with molecular fidelity. Physics Review E, 102:043309. Lemke, T. and Peter, C. (2017). 
Neural network based prediction of conformational free energies Journal of Chemical Theory and - a new route toward coarse-grained simulation models. Computation, 13(12):6213–6221. Li, Z., Kovachki, N. B., Azizzadenesheli, K., liu, B., Bhattacharya, K., Stuart, A., and Anandkumar, A. (2021). Fourier neural operator for parametric partial differential equations. In International Conference on Learning Representations. Lielens, G., Halin, P., Jaumain, I., Keunings, R., and Legat, V. (1998). New closure approximations for the kinetic theory of finitely extensible dumbbells. Journal of Non-Newtonian Fluid Mechanics, 76(1):249–279. Lielens, G., Keunings, R., and Legat, V. (1999). The fene-l and fene-ls closure approximations to the kinetic theory of finitely extensible dumbbells. Journal of Non-Newtonian Fluid Mechanics, 87(2):179 – 196. Lin, F.-H., Liu, C., and Zhang, P. (2005). On hydrodynamics of viscoelastic fluids. Communications on Pure and Applied Mathematics, 58(11):1437–1471. Lu, L., Jin, P., Pang, G., Zhang, Z., and Karniadakis, G. E. (2021). Learning nonlinear operators via deeponet based on the universal approximation theorem of operators. Nature machine intelligence, 3(3):218–229. Lum, K., Chandler, D., and Weeks, J. D. (1999). Hydrophobicity at small and large length scales. The Journal of Physical Chemistry B, 103(22):4570–4577. Luo, G., Andricioaei, I., Xie, X. S., and Karplus, M. (2006). Dynamic distance disorder in proteins is caused by trapping. The Journal of Physical Chemistry B, 110(19):9363–9367. Lyu, L. and Lei, H. (2023). Construction of coarse-grained molecular dynamics with many-body 83 non-markovian memory. Phys. Rev. Lett., 131:177301. Lyubartsev, A. P. and Laaksonen, A. (1995). Calculation of effective interaction potentials from radial distribution functions: A reverse monte carlo approach. Phys. Rev. E, 52:3730–3737. Ma, L., Li, X., and Liu, C. (2019). Coarse-graining langevin dynamics using reduced-order techniques. 
Journal of Computational Physics, 380:170–190. Maragliano, L. and Vanden-Eijnden, E. (2006). A temperature accelerated method for sampling free energy and determining reaction pathways in rare events simulations. Chemical Physics Letters, 426(1):168–175. Maragliano, L. and Vanden-Eijnden, E. (2008). Single-sweep methods for free energy calculations. The Journal of Chemical Physics, 128(18):184110. Martyna, G. J., Tobias, D. J., and Klein, M. L. (1994). Constant pressure molecular dynamics algorithms. The Journal of Chemical Physics, 101(5):4177–4189. Miller, T. F., Vanden-Eijnden, E., and Chandler, D. (2007). Solvent coarse-graining and the string method applied to the hydrophobic collapse of a hydrated chain. Proceedings of the National Academy of Sciences, 104(37):14559–14564. Miyamoto, S. and Kollman Peter, A. (2004). Settle: An analytical version of the shake and rattle algorithm for rigid water models. Journal of Computational Chemistry, 13(8):952–962. Molinero, V. and Moore, E. B. (2009). Water modeled as an intermediate element between carbon and silicon. The Journal of Physical Chemistry B, 113(13):4008–4016. Mones, L., Bernstein, N., and Csányi, G. (2016). Exploration, sampling, and reconstruction of free energy surfaces with gaussian process regression. Journal of Chemical Theory and Computation, 12(10):5100–5110. Moore, J. D., Barnes, B. C., Izvekov, S., Lísal, M., Sellers, M. S., Taylor, D. E., and Brennan, J. K. (2016). A coarse-grain force field for rdx: Density dependent and energy conserving. The Journal of Chemical Physics, 144(10):104501. Mori, H. (1965). Transport, collective motion, and Brownian motion. Progress of Theoretical Physics, 33(3):423–455. Morris, J. P., Fox, P. J., and Zhu, Y. (1997). Modeling low reynolds number incompressible flows using sph. J. Comput. Phys., 136(1):214–226. Morrone, J. A., Li, J., and Berne, B. J. (2012). Interplay between hydrodynamics and the free energy surface in the assembly of nanoscale hydrophobes. 
The Journal of Physical Chemistry B, 116(1):378–389. 84 Murashima, T., Hagita, K., and Kawakatsu, T. (2018). Elongational viscosity of weakly entangled polymer melt via coarse-grained molecular dynamics simulation. Nihon Reoroji Gakkaishi, 46(5):207–220. Nicholson, D. A. and Rutledge, G. C. (2016). Molecular simulation of flow-enhanced nucleation in n-eicosane melts under steady shear and uniaxial extension. The Journal of Chemical Physics, 145(24):244903. Nielsen, S. O., Lopez, C. F., Srinivas, G., and Klein, M. L. (2004). Coarse grain models and the computer simulation of soft materials. Journal of Physics: Condensed Matter, 16(15):R481–R512. Noé, F., Tkatchenko, A., Müller, K.-R., and Clementi, C. (2020). Machine learning for molecular simulation. Annual Review of Physical Chemistry, 71(1):361–390. Noid, W. G. (2013). Perspective: coarse-grained models for biomolecular systems. J. Chem. Phys., 139(9):090901. Noid, W. G., Chu, J.-W., Ayton, G. S., Krishna, V., Izvekov, S., Voth, G. A., Das, A., and Andersen, H. C. (2008). The multiscale coarse-graining method. I. a rigorous bridge between atomistic and coarse-grained models. J. Chem. Phys., 128(24):244114. Nosé, S. (1984). A molecular dynamics method for simulations in the canonical ensemble. Molecular Physics, 52(2):255–268. Ogorodnikov, V. A. and Prigarin, S. M. (1996). Numerical Modelling of Random Processes and Fields: Algorithms and Applications. De Gruyter, Berlin, Boston. Oldroyd, J. G. and Wilson, A. H. (1950). On the formulation of rheological equations of state. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences, 200(1063):523–541. Owens, R. G. and Phillips, T. N. (2002). Computational Rheology. Imperial College Press. Pagonabarraga, I. and Frenkel, D. (2001). Dissipative particle dynamics for interacting systems. The Journal of Chemical Physics, 115(11):5015–5026. Patel, A. J., Varilly, P., and Chandler, D. (2010). 
Fluctuations of water near extended hydrophobic and hydrophilic surfaces. The Journal of Physical Chemistry B, 114(4):1632–1637. Peterlin, A. (1966). Hydrodynamics of macromolecules in a velocity field with longitudinal gradient. Journal of Polymer Science Part B: Polymer Letters, 4(4):287–291. Plotkin, S. S. and Wolynes, P. G. (1998). Non-Markovian configurational diffusion and reaction coordinates for protein folding. Phys. Rev. Lett., 80:5015–5018. 85 Posch, H. A., Balucani, U., and Vallauri, R. (1984). On the relative dynamics of pairs of atoms in simple liquids. Physica A, 123:516–534. Qin, T., Wu, K., and Xiu, D. (2019). Data driven governing equations approximation using deep neural networks. Journal of Computational Physics, 395:620–635. Raissi, M., Perdikaris, P., and Karniadakis, G. (2019a). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707. Raissi, M., Perdikaris, P., and Karniadakis, G. E. (2019b). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707. Rein ten Wolde, P., Sun, S. X., and Chandler, D. (2001). Model of a fluid at small and large length scales and the hydrophobic effect. Phys. Rev. E, 65:011201. Reith, D., Pütz, M., and Müller-Plathe, F. (2003). Deriving effective mesoscale potentials from atomistic simulations. Journal of Computational Chemistry, 24(13):1624–1636. Ren, W. and E, W. (2005). Heterogeneous multiscale method for the modeling of complex fluids and micro-fluidics. Journal of Computational Physics, 204(1):1 – 26. Rosso, L., Mináry, P., Zhu, Z., and Tuckerman, M. E. (2002). On the use of the adiabatic molecular dynamics technique in the calculation of free energy profiles. The Journal of Chemical Physics, 116(11):4389–4402. Rouse, P. E. 
(1953). A theory of the linear viscoelastic properties of dilute solutions of coiling polymers. The Journal of Chemical Physics, 21(7):1272–1280. Rowlinson, J. J. S. and Widom, B. (2002). Molecular theory of capillarity, volume 8. Courier Dover Publications. Rudd, R. E. and Broughton, J. Q. (1998). Coarse-grained molecular dynamics and the atomic limit of finite elements. Phys. Rev. B, 58:R5893–R5896. Rudy, S. H., Brunton, S. L., Proctor, J. L., and Kutz, J. N. (2017). Data-driven discovery of partial differential equations. Science Advances, 3(4). Russo, A., Durán-Olivencia, M. A., Kevrekidis, I. G., and Kalliadasis, S. (2019). Deep learning as closure for irreversible processes: A data-driven generalized Langevin equation. arXiv preprint arXiv:1903.09562. Ryckaert, J.-P., Ciccotti, G., and Berendsen, H. J. (1977). Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. Journal of 86 Computational Physics, 23(3):327–341. Sanyal, T. and Shell, M. S. (2016). Coarse-grained models using local-density potentials optimized with the relative entropy: Application to implicit solvation. The Journal of chemical physics, 145(3):034109. Sanyal, T. and Shell, M. S. (2018). Transferable coarse-grained models of liquid-liquid equilibrium using local density potentials optimized with the relative entropy. The Journal of Physical Chemistry B, 122(21, SI):5678–5693. Satija, R., Das, A., and Makarov, D. E. (2017). Transition path times reveal memory effects and anomalous diffusion in the dynamics of protein folding. The Journal of Chemical Physics, 147(15):152707. Satija, R. and Makarov, D. E. (2019). Generalized langevin equation as a model for barrier crossing dynamics in biomolecular folding. The Journal of Physical Chemistry B, 123(4):802–810. Schädle, A., López-Fernández, M., and Lubich, C. (2006). Fast and oblivious convolution quadrature. SIAM Journal on Scientific Computing, 28(2):421–438. 
Schaeffer, H., Tran, G., and Ward, R. (2018). Extracting sparse high-dimensional dynamics from limited data. SIAM Journal on Applied Mathematics, 78(6):3279–3295. Schneider, E., Dai, L., Topper, R. Q., Drechsel-Grau, C., and Tuckerman, M. E. (2017). Stochastic neural network approach for learning high-dimensional free energy surfaces. Phys. Rev. Lett., 119:150601. Serrano, M. and Español, P. (2001). Thermodynamically consistent mesoscopic fluid particle model. Phys. Rev. E, 64:046115. Seryo, N., Sato, T., Molina, J. J., and Taniguchi, T. (2020). Learning the constitutive relation of polymeric flows with memory. Phys. Rev. Research, 2:033107. Shahidi, N., Chazirakis, A., Harmandaris, V., and Doxastakis, M. (2020). Coarse-graining of polyisoprene melts using inverse monte carlo and local density potentials. The Journal of Chemical Physics, 152(12):124902. She, Z., Ge, P., and Lei, H. (2023). Data-driven construction of stochastic reduced dynamics encoded with non-markovian features. The Journal of Chemical Physics, 158(3):034102. Shen, J., Xu, J., and Yang, J. (2018). The scalar auxiliary variable (sav) approach for gradient flows. Journal of Computational Physics, 353:407–416. Shinoda, W., DeVane, R., and Klein, M. L. (2008). Coarse-grained molecular modeling of non-ionic surfactant self-assembly. Soft Matter, 4(12):2454–2462. 87 Singh, D., Mondal, K., and Chaudhury, S. (2021). Effect of memory and inertial contribution on transition-time distributions: Theory and simulations. The Journal of Physical Chemistry B, 125(17):4536–4545. Singh, S., Krishnan, R., and Robinson, G. (1990). Theory of activated rate processes with space-dependent friction. Chemical Physics Letters, 175(4):338–342. Smith, D. E., Babcock, H. P., and Chu, S. (1999). Single-polymer dynamics in steady shear flow. Science, 283(5408):1724–1727. Soper, A. (1996). Empirical potential monte carlo simulation of fluid structure. Chemical Physics, 202(2):295–306. Stecher, T., Bernstein, N., and Csányi, G. 
(2014). Free energy surface reconstruction from umbrella samples using gaussian process regression. Journal of Chemical Theory and Computation, 10(9):4079–4097. Straub, J. E., Berne, B. J., and Roux, B. (1990). Spatial dependence of time-dependent friction for pair diffusion in a simple fluid. The Journal of Chemical Physics, 93(9):6804–6812. Straub, J. E., Borkovec, M., and Berne, B. J. (1987). Calculation of dynamic friction on intramolecular degrees of freedom. The Journal of Physical Chemistry, 91(19):4995–4998. Straub, J. E., Borkovec, M., and Berne, B. J. (1988). Molecular dynamics study of an isomerizing diatomic in a lennard-jones fluid. The Journal of Chemical Physics, 89(8):4833–4847. Straus, J. B., Gomez Llorente, J. M., and Voth, G. A. (1993). Manifestations of spatially dependent friction in classical activated rate processes. The Journal of Chemical Physics, 98(5):4082–4097. Tarjus, G. and Kivelson, D. (1991). Solvent effect on activated rate processes: On the validity of the gle approach. Chemical Physics, 152(1):153–167. Taylor, G. I. (1934). The formation of emulsions in definable fields of flow. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 146(858):501–523. Thien, N. P. and Tanner, R. I. (1977). A new constitutive equation derived from network theory. Journal of Non-Newtonian Fluid Mechanics, 2(4):353–365. Thomases, B. and Shelley, M. (2007). Emergence of singular structures in oldroyd-b fluids. Physics of Fluids, 19(10):103103. Torrie, G. and Valleau, J. (1977). Nonphysical sampling distributions in monte carlo free-energy estimation: Umbrella sampling. Journal of Computational Physics, 23(2):187–199. 88 Voth, G. A. (1992). A theory for treating spatially-dependent friction in classical activated rate processes. The Journal of Chemical Physics, 97(8):5908–5910. Vroylandt, H. (2022). On the derivation of the generalized langevin equation and the fluctuation- dissipation theorem. 
Europhysics Letters, 140(6):62003. Vroylandt, H., Goudenège, L., Monmarché, P., Pietrucci, F., and Rotenberg, B. (2022). Likelihood- based non-markovian models from molecular dynamics. Proceedings of the National Academy of Sciences, 119(13):e2117586119. Vroylandt, H. and Monmarché, P. (2022). Position-dependent memory kernel in generalized Langevin equations: Theory and numerical estimation. The Journal of Chemical Physics, 156(24):244105. Wagner, J. W., Dannenhoffer-Lafage, T., Jin, J., and Voth, G. A. (2017). Extending the range and physical accuracy of coarse-grained models: Order parameter dependent interactions. The Journal of chemical physics, 147(4):044113. Wang, J., Olsson, S., Wehmeyer, C., Pérez, A., Charron, N. E., de Fabritiis, G., Noé, F., and Clementi, C. (2019). Machine learning of coarse-grained molecular dynamics force fields. ACS Central Science, 5(5):755–767. Wang, J., Wolf, R. M., Caldwell, J. W., Kollman, P. A., and Case, D. A. (2004). Development and testing of a general amber force field. Journal of Computational Chemistry, 25(9):1157–1174. Wang, Q. (1997). Comparative studies on closure approximations in flows of liquid crystal polymers: I. elongational flows. Journal of Non-Newtonian Fluid Mechanics, 72(2):141 – 162. Wang, S., Ma, Z., and Pan, W. (2020). Data-driven coarse-grained modeling of polymers in solution with structural and dynamic properties conserved. Soft Matter, 16(36):8330–8344. Warner, H. R. (1972a). Kinetic theory and rheology of dilute suspensions of finitely extendible dumbbells. Industrial & Engineering Chemistry Fundamentals, 11(3):379–387. Warner, H. R. (1972b). Kinetic theory and rheology of dilute suspensions of finitely extendible dumbbells. Industrial & Engineering Chemistry Fundamentals, 11(3):379–387. Willard, A. P. and Chandler, D. (2010). Instantaneous liquid interfaces. The Journal of Physical Chemistry B, 114(5):1954–1958. Womersley, J. R. (1955). 
Method for the calculation of velocity, rate of flow and viscous drag in arteries when the pressure gradient is known. The Journal of Physiology, 127(3):553–563. Xie, P., Car, R., and E, W. (2022). Ab Initio Generalized Langevin Equations. arXiv preprint arXiv:2211.06558. 89 Yang, X. (2016). Linear, first and second-order, unconditionally energy stable numerical schemes for the phase field model of homopolymer blends. Journal of Computational Physics, 327:294–316. Yu, P., Du, Q., and Liu, C. (2005). From micro to macro dynamics via a new closure approximation to the fene model of polymeric fluids. Multiscale Modeling & Simulation, 3(4):895–917. Zaremba, S. (1903). Sur une forme perfectionee de la theorie de la relaxation. Bull. Int. Acad. Sci. Cracovie, pages 594–614. Zavadlav, J., Marrink, S. J., and Praprotnik, M. (2018). Multiscale simulation of protein hydration using the swinger dynamical clustering algorithm. Journal of Chemical Theory and Computation, 14(3):1754–1761. Zhang, L., Han, J., Wang, H., Car, R., and E, W. (2018a). Deep potential molecular dynamics: A scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett., 120:143001. Zhang, L., Han, J., Wang, H., Car, R., and E, W. (2018b). Deepcg: Constructing coarse-grained models via deep neural networks. The Journal of Chemical Physics, 149(3):034101. Zhang, L., Han, J., Wang, H., Saidi, W., Car, R., and E, W. (2018c). End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems. In Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., and Garnett, R., editors, Advances in Neural Information Processing Systems 31, pages 4436–4446. Curran Associates, Inc. Zhang, L., Wang, H., and E, W. (2018d). Reinforced dynamics for enhanced sampling in large atomic and molecular systems. The Journal of Chemical Physics, 148(12):124113. Zhang, Y., Gao, C., Liu, Q., Zhang, L., Wang, H., and Chen, M. (2020). 
Warm dense matter simulation via electron temperature dependent deep potential molecular dynamics. Physics of Plasmas, 27(12):122704. Zhao, L., Li, Z., Caswell, B., Ouyang, J., and Karniadakis, G. E. (2018). Active learning of constitutive relation from mesoscopic dynamics for macroscopic modeling of non-newtonian flows. Journal of Computational Physics, 363:116 – 127. Zhou, X.-H., Han, J., and Xiao, H. (2021). Frame-independent vector-cloud neural network for nonlocal constitutive modeling on arbitrary grids. Computer Methods in Applied Mechanics and Engineering. Zhu, Y. and Venturi, D. (2020). Generalized langevin equations for systems with local interactions. Journal of Statistical Physics. Zwanzig, R. (1961). Statistical mechanics of irreversiblity. Lectures in Theoretical Physics, 3:106–141. 90 Zwanzig, R. (1973). Nonlinear generalized Langevin equations. J. Stat. Phys., 9:215 – 220. Zwanzig, R. (1992). Diffusion past an entropy barrier. The Journal of Physical Chemistry, 96(10):3926–3930. Zwanzig, R. (2001). Nonequilibrium Statistical Mechanics. Oxford University Press. 91