MACHINE-LEARNING-BASED MULTI-SCALE MODELING FOR COMPLEX FLUIDS        By   Pei Ge                A DISSERTATION   Submitted to Michigan State University in partial fulfillment of the requirements  for the degree of  Computational Mathematics, Science and Engineering – Doctor of Philosophy  2025 ABSTRACT

Multi-scale modeling presents a long-standing challenge in computational mathematics and is

pertinent to a wide range of applications in materials science, fluid physics, and chemical engineering.

Predicting collective behaviors typically necessitates the integration of modeling dynamics across

micro-scale (atomistic), meso-scale (kinetic), and macro-scale (continuum) levels, with the vast

range of spatiotemporal scales posing a fundamental obstacle. Existing methods often rely on

certain empirical constitutive closures or micro-macro coupling approaches. Despite their broad

applications, modeling accuracy and efficiency are often challenged in real applications.

This dissertation aims to develop data-driven approaches for constructing accurate and reliable

reduced models of multi-scale systems based on first-principle descriptions. The first part, including

Chapters 2 and 3, focuses on constructing meso-scale reduced models of polymer kinetics directly

from the full molecular dynamics. Chapter 2 discussed the many-body effect on conservative

force, which is important to accurately reproduce both the probability density function of the

void formation in bulk and the spectrum of the capillary wave across the fluid interface. Chapter

3 discussed the state-dependence on memory kernel and demonstrated the essential role of the

broadly overlooked state-dependency nature in predicting molecule kinetics related to conformation

relaxation and transition. The second part, Chapter 4, focuses on building accurate macroscale

models from microscale polymer kinetics through meso-scale Langevin dynamics. A non-Newtonian

hydrodynamic model is given as an example, which shows some success in systematically passing

the micro-scale heterogeneous polymer structural mechanics to the macro-scale hydrodynamics

without human intervention.

Copyright by
PEI GE
2025

ACKNOWLEDGEMENTS

Completing this dissertation has been a journey of intellectual growth, perseverance, and support

from many individuals. I am deeply grateful to all those who have guided and encouraged me

throughout this process.

First and foremost, I would like to express my deepest gratitude to my advisor, Dr. Huan Lei, for

his unwavering support, insightful guidance, and patience throughout my academic journey. His

mentorship has been invaluable in shaping my research skills and fostering my academic growth.

His dedication and encouragement have inspired me to persevere through challenges and strive for

excellence. I am truly fortunate to have had the opportunity to learn from him. I am also sincerely

thankful to my dissertation committee members, Dr. Daniel Appelö, Dr. Michael Murillo, and Dr.

Yimin Xiao, for their valuable feedback and encouragement. Their expertise and perspectives have

greatly enriched this work.

I am especially grateful to my parents for their steadfast belief in me. Their endless love and

encouragement have been my greatest source of strength, providing the foundation for my resilience.

I would also like to thank my friends Liyao Lyu, Zhiyuan She, Shijun Liang, and many others,

including Lidong Fang, Haishen Dai, Yue Zhao, and Siyu Guo, for their support and camaraderie

throughout this journey.

This dissertation is as much a product of my efforts as it is of the collective support and

encouragement of all these individuals. I am truly grateful.

iv

TABLE OF CONTENTS

CHAPTER 1

OVERVIEW .
.

.
1.1 Background .
.
.
1.2 Dissertation Contributions
1.3 Dissertation Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
.
. . . . . . . . . . . . . . . . . . . . . . . . . . .

.

.

.

1
1
1
3

CHAPTER 2

2.1
. .
Introduction . .
2.2 Methods and Models
2.3 Numerical Results .
.
2.4 Summary .

COARSE-GRAINED MOLECULAR DYNAMICS MODELING . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5
5
9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
. 23

.
.

.

.

.

.

CHAPTER 3

DATA-DRIVEN LEARNING OF THE STATE-DEPENDENT
MEMORY KERNEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
.
.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
. 39
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

3.1
.
Introduction .
3.2 Model Derivation .
3.3 Numerical Results .
3.4 Summary . .
3.5 Other Details

. .
. .

. .
. .

.

.

CHAPTER 4

DEEP LEARNING-BASED NON-NEWTONIAN FLUID MODEL . .

Introduction .

.
.
4.1
4.2 Methods . .
. .
4.3 Numerical results .
.
4.4 Discussion .

.
. .

.

.

.

. 46
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. 49
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
. 71
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CHAPTER 5

5.1 High Dimensional Stochastic Reduced Model
5.2 Variational-informed Structure-preserving Macro-scale Reduced Model

CONCLUSION AND OUTLOOK . . . . . . . . . . . . . . . . . . . . . 73
. . . . . . . . . . . . . . . . . . 74
. 74

. . .

BIBLIOGRAPHY .

.

.

.

.

. .

. .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

v

CHAPTER 1

OVERVIEW

1.1 Background

Multi-scale modeling is pertinent to a wide range of applications in materials science, fluid

physics, and chemical engineering. Predicting collective behaviors typically necessitates the

integration of modeling dynamics across micro-scale (atomistic), meso-scale (kinetic), and macro-

scale (continuum) levels, with the vast range of spatiotemporal scales posing a fundamental obstacle.

Empirical models are often based on oversimplified micro-scale information but do not have effective

algorithms to extract the relevant information from micro-scale (E et al., 2023).

Machine learning present as been successful in dealing with high dimensional problems in

scientific computing, such as physics-informed neural networks (Raissi et al., 2019a), deep operator

networks (Lu et al., 2021), and Fourier neural operator (Li et al., 2021). For multi-scale modeling,

a tremendous amount of progress has been made, such as deep learning-based coarse-grained

molecular dynamics (Zhang et al., 2018b), machine learning-based moment closure models for the

kinetic equation (Han et al., 2019a), etc. Despite the overwhelming success during the past years,

some challenges as the collection of data, the generalization of the neural network, and preservation

of physical constraints remain less illustrating. By addressing these challenges, this dissertation

aims to utilize machine learning to construct reliable and structure-preserved reduced models with

interpretable micro-scale to macro-scale mapping.

1.2 Dissertation Contributions

This dissertation focuses on both micro-to-meso and micro-to-macro modeling.

1.2.1 Micro to Meso: Data-driven Stochastic Reduced Model

Predicting the collective behavior of complex multiscale systems is often centered around

projecting the full-dimensional dynamics onto a set of resolved variables. However, an accurate

construction of such a reduced model remains a practical challenge for real applications such as

molecular modeling. While model reduction frameworks such as the Koopman operator (Koopman,

1931) and the Mori-Zwanzig projection formalism (Mori, 1965; Zwanzig, 1961) enable us to write

1

down the dynamic equations in terms of the resolved variables, the reduced model generally becomes

non-Markovian with a memory term that may further depend on the resolved variables; the direct

numerical evaluation involves solving the expensive full-dimensional orthogonal dynamics.

Consider the systems whose microscopic state is determined by the instantaneous positions

Z𝑞 and momenta Z𝑝 of the N atoms (3D system). Denote the collection of these variables by

Z(𝑡) = (Z𝑝, Z𝑞), which is a vector of 6𝑁 components. The Hamiltonian dynamics can be written

as:

𝑑Z(𝑡)
𝑑𝑡

= J

𝜕𝐻 (Z(𝑡))
𝜕z

,

Z(0) = z,

(1.1)

where 𝐻 is the Hamiltonian and J = (cid:169)
(cid:173)
(cid:173)
(cid:171)

0

I

−I 0

(cid:170)
(cid:174)
(cid:174)
(cid:172)

Let (q, p) ∈ R2𝑚 represent the resolved variables of a high-dimensional Hamiltonian system,

where q denotes the coarse-grained (CG) coordinates as a function of Z𝑞, and p denotes the CG

momenta. Following the Zwanzig’s formalism (Zwanzig, 2001; Hijón et al., 2010), the reduced

dynamics takes the form

(cid:164)q = M−1p,

(cid:164)p = −∇𝑈 (q) −

∫ 𝑡

0

K(q(𝜏), 𝑡 − 𝜏)v(𝜏)d𝜏 + R𝑡,

(1.2)

where M is the mass matrix, 𝑈 (q) is the free energy, v := (cid:164)q is the velocity, K(q, 𝑡) is the memory,

and R𝑡 is the noise whose covariance function is related to the memory following the second FDT

(Vroylandt and Monmarché, 2022). The construction of the free energy will be discussed in Chapter

2 and the construction of the memory will be discussed in Chapter 3.

1.2.2 Micro to Macro: A Deep Learning-Based Non-Newtonian Hydrodynamic Model

A long-standing problem in the modeling of non-Newtonian hydrodynamics of polymeric flows

is the availability of reliable and interpretable hydrodynamic models that faithfully encode the

underlying micro-scale polymer dynamics. The main complication arises from the long polymer

relaxation time, the complex molecular structure, and the heterogeneous interaction.

The empirical continuum hydrodynamic model of incompressible non-Newtonian fluids is given

2

as follows:

∇ · u = 0

𝜌 du
d𝑡

= −∇𝑝 + ∇ · (𝜏𝑠 + 𝜏𝑝) + fext

(1.3)

where u is the velocity, 𝜌 is the dendity, 𝑝 is the pressure, 𝜏𝑠 ∝ (∇u + ∇u𝑇 ) is the solvent stress, fext
is the external force, 𝜏𝑝 is the polymer stress which is generally unknown. The construction of 𝜏𝑝

will be discussed in Chapter 4.

1.3 Dissertation Structure

The dissertation is organized as follows.

Chapter 2 constructs the free energy in CG models of both single- and two-component of

polymeric fluid systems based on the recently developed deep coarse-grained potential (DeePCG)

Zhang et al. (2018b) scheme, where each polymer molecule is modeled as a CG particle.

In

section 2.2, the free energy of the CG models is constructed by only using the training samples

of the instantaneous force under the thermal equilibrium state. In section 2.3, we show that the

constructed CG models can accurately reproduce both the probability density function of the void

formation in bulk and the spectrum of the capillary wave across the fluid interface. More importantly,

the CG models accurately predict the volume-to-area scaling transition for the apolar solvation

energy, illustrating the effectiveness to probe the meso-scale collective behaviors encoded with

molecular-level fidelity.

Chapter 3 focuses on constructing the memory kernel. We present a data-driven method to learn

stochastic reduced models of complex systems that retain a state-dependent memory beyond the

standard generalized Langevin equation with a homogeneous kernel. In section 3.2, we show that

the constructed model naturally encodes the heterogeneous energy dissipation by jointly learning a

set of state features and the non-Markovian coupling among the features. Section 3.3 demonstrate

the limitation of the standard GLE and the essential role of the broadly overlooked state-dependency

nature in predicting molecule kinetics related to conformation relaxation and transition.

Chapter 4 focuses on a micro-to-macro model named DeePN2. In section 4.2 presents that

3

the model retains a multi-scaled nature by mapping the polymer configurations into a set of

symmetry-preserving macro-scale features. The extended constitutive laws for these macro-scale

features can be directly learned from the kinetics of their micro-scale counterparts. Section 4.3

shows that DeePN2 can faithfully capture the broadly overlooked viscoelastic differences arising

from the specific molecular structural mechanics without human intervention.

Finally, Chapter 5 summarizes the discussed modeling and discusses the potential future work.

4

CHAPTER 2

COARSE-GRAINED MOLECULAR DYNAMICS MODELING

In this chapter, we focus on how to construct the conservative force. Conservative forces are essential

in constructing CG models, as they describe the interactions between CG particles. Accurate

conservative forces are critical for obtaining accurate memory and noise terms in CG models.

Therefore, it is crucial to have a good approximation of the conservative forces.

In Ge et al. (2023), we construct the CG models of both single- and two-component of polymeric

fluid systems based on the recently developed deep coarse-grained potential (DeePCG) (Zhang

et al., 2018b) scheme, where each polymer molecule is modeled as a CG particle. By only using the

training samples of the instantaneous force under the thermal equilibrium state, the constructed CG

models can accurately reproduce both the probability density function of the void formation in bulk

and the spectrum of the capillary wave across the fluid interface. More importantly, the CG models

accurately predict the volume-to-area scaling transition for the apolar solvation energy, illustrating

the effectiveness to probe the meso-scale collective behaviors encoded with molecular-level fidelity.

2.1

Introduction

Molecular dynamics (MD) simulations provide a promising avenue to establish the atomistic-

level understanding of many complex systems relevant to biological and materials science. Despite

the overwhelming success during the past decades, a remaining bottleneck roots in the limitation of

the achievable spatio-temporal scales; the gap between the micro-scale atomistic motions and many

meso-scale emerging phenomena remains large. One important problem is the nano-scale interfacial

fluids, which play a crucial role in the hydration and the assembly of the biomolecules and functional

nano-materials (Chandler, 2005; Berne et al., 2009). However, it is well-known that such fluid

systems generally exhibit complex and multifaceted nature on different scales. On the small scale

(i.e., the fluid molecule correlation length), the solvation energy is determined by the molecular

reorganization and scales with the volume of the void space. On the large scale, the solvation energy

is determined by the free energy for maintaining a fluid-void interface and scales with the surface

area. The scale-dependent behavior indicates an cross-over regime of the entropy-enthalpy transition.

5

While theoretical understandings (Lum et al., 1999; Rein ten Wolde et al., 2001; Hummer et al.,

1996, 1998) of this ubiquitous phenomenon have been developed, computational modeling often

relies on full micro-scale MD simulations to retain the multifaceted properties, which, however,

remain too expensive to achieve the resolved scale for applications such as nano-scale assembly.

To accelerate the full MD simulations, many coarse-grained (CG) models have been developed.

By modeling the dynamics in terms of a set of CG variables with reduced dimensionality, the

coarse-grained molecular dynamics (CGMD) simulations, in principle, enable us to probe the

collective behaviors on a broader scale. However, in practice, the construction of truly reliable

CG models can be highly non-trivial, especially for the meso-scale interfacial fluids. There are

two major challenges. The first challenge arises from the many-body nature of CG interactions.

Specifically, the equilibrium density distribution of the CG model needs to match the marginal

density distribution of the CG variables of the full model. Due to the unresolved atomistic degrees

of freedom, the CG potential generally encodes the many-body interactions even if the full MD

force field is governed by two-body interactions (Noid et al., 2008). Existing approaches often rely

on various physical intuitions as well as empirical approximations (Izvekov and Voth, 2005; Noid,

2013; Lei et al., 2010; Hijón et al., 2010; Rudd and Broughton, 1998; Pagonabarraga and Frenkel,

2001; Nielsen et al., 2004; Shinoda et al., 2008; Molinero and Moore, 2009; Larini et al., 2010; Das

and Andersen, 2012; Dinpajooh and Guenza, 2017; Sanyal and Shell, 2016; Moore et al., 2016) that

reproduce certain target thermodynamic quantities and/or structural distributions. For example, the

pairwise additive decomposition based on direct ensemble averaging (Lei et al., 2010; Hijón et al.,

2010) can recover the thermodynamic pressure but often fail to recover the pair distribution function.

Conversely, the Monte Carlo and Boltzmann inverse approaches (Lyubartsev and Laaksonen, 1995;

Soper, 1996; Reith et al., 2003) can reproduce the pairwise distribution function, which, however,

lead to the biased predictions of the equation of state. Several studies account for the many-body

effects by introducing the configuration-independent volume potential (Das and Andersen, 2010;

Dunn and Noid, 2015, 2016) and the local density (Allen and Rutledge, 2008, 2009; Izvekov et al.,

2010; Moore et al., 2016; Sanyal and Shell, 2016; Shahidi et al., 2020) into the pairwise interactions.

6

On the other hand, the accuracy of the high-order structural correlations as well as the direct

applications to interfacial systems remains under-explored.

Besides the many-body effect, the fluid molecules also exhibit heterogeneous density at the

interfacial vicinity. What further complicates the problem is the fact that the interfacial fluid

density distribution is scale-dependent. On the small scale, the molecular reorganization generally

leads to a wet interface with larger density than the bulk value. On the large scale, the fluid-void

phase separation generally leads to a dry interface with lower density. The crossover implies

complex molecular correlations near the interface. To capture this multi-faceted property, the

constructed CG potential needs to properly embody the local particle distribution other than the

homogeneous bulk distribution. Conventional structural-based CG potential functions generally

show limitations to incorporate such information. Similar to the many-body dissipative particle

dynamics (Pagonabarraga and Frenkel, 2001), recent studies employed the local density (Wagner

et al., 2017; DeLyser and Noid, 2017; Sanyal and Shell, 2018; Jin and Voth, 2018; DeLyser and

Noid, 2019, 2020; Berressem et al., 2021) as well as the density gradient (DeLyser and Noid, 2022)

as the auxiliary field variables to construct the CG potential functions. While the CG models

show significant improvement to reproduce the interfacial density profile, the scale-dependent

interfacial energy and fluctuations have not been systematically investigated. In Ref. (Lei et al.,

2015), interfacial energy is integrated into the continuum fluctuation hydrodynamic equation

(Landau and Lifshitz, 1987) from the top-down perspective. Fluid particles essentially represent the

Lagrangian discretization points based on the smoothed dissipative particle hydrodynamics (Serrano

and Español, 2001) instead of the CG molecules; the meso-scale fluid structural properties can not

be retained. Currently, the construction of reliable bottom-up CGMD models that faithfully encode

the multifaceted molecular interactions remains largely open.

In this work, we aim to address the above challenges by constructing CG models of meso-

scale interfacial fluids based on the deep molecular dynamics (DeePMD) scheme (Zhang et al.,

2018a,c). DeePMD is initially developed for learning the many-body interactions from the ab initio

molecular dynamics, and has been applied to construct the deep coarse-grained (DeePCG) model

7

(Zhang et al., 2018b) of liquid water in bulk. Unlike the conventional forms of the inter-molecular

potential function, the DeePMD represents each particle as an agent and the relative positions of its

neighboring particles as the local environment. Rather than approximating the total potential of the

full system by an unified parametric function, the DeePMD directly maps the local environment of

each agent to the potential energy of that particle through a neural network that strictly preserves the

spatial symmetries and the particle permutation invariance. Accordingly, the construction does not

rely on the empirical decomposition (e.g., pairwise, three-body) of the high-dimensional particle

configuration space. This unique feature is particularly suited for modeling the many-body potential

of CGMD models, where the ensemble-averaged interaction between two CG particles further

depends on the other neighboring CG particles and can not be represented by a pairwise additive

function. Moreover, the heterogeneous particle density distribution across the fluid interface can

be naturally incorporated into the CG potential function as the local environment of each particle.

Accordingly, the constructed CG models can accurately model the multifaceted, scale-dependent

interfacial fluctuations and apolar solvation without additional human intervention.

We demonstrate the effectiveness of the CG models by considering both the single- and two-

component fluids in presence of thermal interfacial fluctuations. As discussed in Ref. (Chandler,

2005), the scale-dependent hydrophobic effects can be general for solvent molecules with attractive

interactions; polymeric liquids are therefore used as the benchmark problem. We compare the

numerical results from the full MD simulations and the CG description that represents each molecule

as a single particle located at the center of mass. By merely using training samples under equilibrium

thermal fluctuations, the constructed CG models accurately predict the high-order correlations, the

local compressibility and the interfacial capillary wave for both single- and two-component fluids.

In contrast, the empirical CG potential constructed based on the pairwise approximation shows

apparent deviations. Furthermore, we conduct the rare-event sampling simulations to estimate the

probability of the void formulation in bulk. The predictions of CG model show good agreement

with the full MD results. More importantly, the CG models accurately predict the volume-to-area

scaling transition for the solvation energy, and therefore, pave the way for modeling the nanoscale

8

assembly in aqueous environment.

Before wrapping up this section, we note that the present work focuses on the collective,

quasi-equilibrium properties determined by the conservative potential function of a set of extensive

CG variables; see Refs.

(John and Csányi, 2017; Chan et al., 2019) for relevant work. For

the conformational free energy of non-extensive CG variables, several machine-learning based

approaches (Stecher et al., 2014; Mones et al., 2016; Lemke and Peter, 2017; Galvelis and Sugita,

2017; Schneider et al., 2017; Zhang et al., 2018d; Zavadlav et al., 2018; Wang et al., 2019) have been

developed; see also a recent review (Noé et al., 2020) and the references therein. Furthermore, to

accurately predict the dynamic properties, memory and coherent noise terms (Mori, 1965; Zwanzig,

1973) arising from the unresolved variables need to be properly introduced into the CG model

(Lei et al., 2010; Hijón et al., 2010; Lei et al., 2016; Lei and Li, 2021), which are left to future

investigations.

2.2 Methods and Models

2.2.1 Full Model of the Polymeric Fluids

We consider the micro-scale models of the star polymer melt similar to Ref. (Hijón et al., 2010).

The full system consists of 𝑀 molecules with a total number of 𝑁 atoms. Each polymer molecule

consists of a “center” atom connected by 𝑁𝑎 arms with 𝑁𝑏 atoms per arm. The positions of the

atoms are denoted by q = [q1, q2, · · · , q𝑁 ], where q𝑖 represents the position of the 𝑖-th atom. The

potential function is governed by the pairwise and bond interactions, i.e.,

𝑉 (q) =

∑︁

𝑖≠ 𝑗

𝑉𝑝 (𝑞𝑖 𝑗 ) +

𝑉𝑏 (𝑙𝑘 ),

∑︁

𝑘

(2.1)

where 𝑉𝑝 is the pairwise interaction between both the intra- and inter-molecular atoms except the

bonded pairs. 𝑞𝑖 𝑗 = ∥q𝑖 − q 𝑗 ∥ is the distance between the 𝑖-th and 𝑗-th atoms. 𝑉𝑏 is the bond

interaction between the neighboring particles of each polymer arm and 𝑙𝑘 is the length of the 𝑘-th

bond. The bond potential 𝑉𝑏 is chosen to be the harmonic potential, i.e.,

𝑉𝑏 (𝑙) =

1
2

𝑘 𝑠 (𝑙 − 𝑙0)2,

9

(2.2)

where 𝑘 𝑠 and 𝑙0 represent the elastic coefficient and the equilibrium length 𝑙0, respectively. In this

study, all the physical quantities take the reduced unit. The atom mass is chosen to be unity.

We investigate three fluid systems with micro-scale potential governed by Eq. (2.1). In Sec.

2.3.1, we consider the polymeric fluids in bulk and examine if the CG models can retain the

many-body interactions and the local compressibility. In particular, we choose 𝑁𝑎 = 12, 𝑁𝑏 = 6,

𝜎 = 2.415, 𝜖 = 1.0, 𝑘 𝑠 = 1.714, 𝑙0 = 2.77 similar to Ref. (Hijón et al., 2010). 𝑉𝑝 takes the form of

the Lennard–Jones potential with cut-off 𝑟𝑐, i.e.,

𝑉p(𝑟) =


𝑉LJ(𝑟) − 𝑉LJ(𝑟𝑐), 𝑟 < 𝑟𝑐



0, 𝑟 ≥ 𝑟𝑐

𝑉LJ(𝑟) = 4𝜖

(cid:17) 12

(cid:20)(cid:16) 𝜎
𝑟

(cid:17) 6(cid:21)

,

−

(cid:16) 𝜎
𝑟

(2.3)

where 𝜖 = 1.0 is the dispersion energy and 𝜎 = 2.415 is the hardcore distance. Also we choose

𝑟𝑐 = 21/6𝜎 so that 𝑉𝑝 recovers the Weeks-Chandler-Andersen potential. The full system consists of

𝑁 = 2120 polymer molecules in a cubic domain 180 × 180 × 180 (in reduced unit) with periodic

boundary condition imposed along each direction. The Nosé-Hoover thermostat is employed to

conduct the canonical ensemble simulation with 𝑘 𝐵𝑇 = 3.96.

In Sec. 2.3.2, we consider the polymeric fluid in presence of fluid-void interface. Micro-scale

model parameters are similar to Sec. 2.3.1 except that 𝑟𝑐 = 2.5𝜎 and 𝑘 𝐵𝑇 = 1.7. Simulations are

conducted in a domain 180 × 180 × 200 with periodic boundary condition imposed along the 𝑥- and

𝑦-direction. At the equilibrium, the fluid shows a clear fluid-void interface near 𝑧 = 20 and 𝑧 = 180,

respectively.

In Sec. 2.3.3, we consider a two-component polymeric fluid. Micro-scale model of the polymer

molecule is similar to the single-component fluid system with 𝑁𝑎 = 15, 𝑁𝑏 = 12, 𝑘 𝑠 = 20.0, 𝑙0 = 1.5,

𝑘 𝐵𝑇 = 0.5. The full system consists of 3488 molecules in a domain 200 × 200 × 120 with periodic

boundary condition imposed along each direction. The pairwise interaction 𝑉𝑝 is chosen to be

quadratic, i.e.,

𝑉p(𝑟) =





𝑎
2𝑟𝑐

(𝑟 − 𝑟𝑐)2 , 𝑟 < 𝑟𝑐

.

0, 𝑟 ≥ 𝑟𝑐

10

(2.4)

Specifically, we consider two sets of the pairwise interaction: (I) 𝑎11 = 6.0, 𝑎12 = 3.0, 𝑎22 = 6.0,

𝑟𝑐 = 1.5 , where 𝑎12 represents the pairwise interaction between the component-1 and component-2

atoms. (II) 𝑎11 = 3.0, 𝑎12 = 60.0, 𝑎11 = 3.0, 𝑟 11

𝑐 = 1.5, 𝑟 12

𝑐 = 2.5, 𝑟 11

𝑐 = 1.5. The fluid shows a full

mixture and interfacial separated state for the two cases respectively.

2.2.2 Coarse-grained Models

For all of the three systems, we construct the CG models by representing each molecule as an

individual particle. The positions of the CG particles are denoted by Q = [Q1, Q2, · · · , Q𝑀], where

Q𝑖 = Q𝑖 (q) represents the center of mass (COM) of the 𝑖-th molecule. The conservative potential

𝑈 (Q) is determined by the marginal density function of Q with respect to the equilibrium density

function of the full model, i.e.,

∫

𝜌(Q) =

𝑒−𝑉 (q)/𝑘 𝐵𝑇

𝑀
(cid:214)

𝑖=1

𝑈 (Q) = −𝑘 𝐵𝑇 ln 𝜌(Q).

𝛿(Q𝑖 (q) − Q𝑖)dq/

∫

𝑒−𝑉 (q)/𝑘 𝐵𝑇 dq,

(2.5)

In DeePCG, a neural network ˜𝑈 (Q; 𝚯)is used to represent the CG potential 𝑈 (Q), where 𝚯

represents the neural network parameters. To keep the extensive property, the total energy is

decomposing into local contributions of the individual CG particles:

˜𝑈 (Q; 𝚯) =

𝑀
∑︁

𝑖=1

˜𝑈𝑛𝑛 (D( ˜Q𝑖); 𝚯),

(2.6)

where ˜𝑈𝑛𝑛 is the local potential of an individual particle,

˜Q𝑖 ∈ R𝑁𝑖×4 is the generalized co-

ordinates of the 𝑖-th particle.

It represents the local environment of the 𝑖-th particle relative

to its 𝑁𝑖 neighboring particles within cutoff 𝑅𝑐.

In particular, the 𝑗-th row is defined as

˜Q𝑖

𝑗 = (𝑠(𝑟 𝑗 ), 𝑠(𝑟 𝑗 )𝑥 𝑗 /𝑟 2

𝑗 , 𝑠(𝑟 𝑗 )𝑦 𝑗 /𝑟 2

𝑗 , 𝑠(𝑟 𝑗 )𝑧 𝑗 /𝑟 2

𝑗 ), where r 𝑗 = (𝑥 𝑗 , 𝑦 𝑗 , 𝑧 𝑗 ) denotes the relative

position between the 𝑖-th particle and its 𝑗-th local neighbor. 𝑠(𝑟) is a smooth differentiable function

that decays to 0 at 𝑟 = 𝑅𝑐, which ensures the force also smoothly decays to zero at the cut-off.

D ∈ 𝑅𝑀1×𝑀2 is the symmetry preserving features of each particle. The entry of 𝐷 can be written as:

𝐷 𝑗,𝑙 ( ˜Q𝑖) =

(cid:32) 𝑁𝑖∑︁

𝑘=1

𝑔1, 𝑗 (𝑠(𝑟𝑘 ); 𝚯) ˜Q𝑖
𝑘

𝑔2,𝑙 (𝑠(𝑟𝑘 ); 𝚯) ˜Q𝑖
𝑘

(cid:33)𝑇

,

(cid:33) (cid:32) 𝑁𝑖∑︁

𝑘=1

(2.7)

11

where (cid:8)𝑔1, 𝑗 (𝑠(𝑟); 𝚯)(cid:9) 𝑀1
𝑗=1 are neural networks mapping from the scalar 𝑟
to multiple features, and 𝑀1 and 𝑀2 are the number of customized features. 𝐷 𝑗,𝑙 preserves the

𝑗=1 and (cid:8)𝑔2,𝑙 (𝑠(𝑟); 𝚯)(cid:9) 𝑀2

translational and rotational invariance; the summation over index 𝑘 ensures the permutational

symmetry. In this study, 𝑠(𝑟) is chosen as

𝑠(𝑟) =

𝑟 ,
1

𝑟 ≤ 𝑅𝑐𝑠

(cid:16)

(cid:104) 1
2 cos

1
𝑟

𝜋 𝑟−𝑅𝑐𝑠
𝑅𝑐−𝑅𝑐𝑠

(cid:17)

(cid:105)

+ 1
2

, 𝑅𝑐𝑠 < 𝑟 ≤ 𝑅𝑐

0,

𝑟 > 𝑅𝑐

(2.8)






where 𝑅𝑐𝑠 = 0.97𝑅𝑐 is a smooth cut-off parameter.

In principle, ˜𝑈 (Q; 𝚯) can be trained by minimizing the difference of the predicted force terms

between the full micro-scale and the CG models, i.e., (cid:10)∥∇ ˜𝑈 (Q; 𝚯) − ∇𝑈 (Q) ∥2(cid:11)
represents the conditional expectation with respect to the constraints of Q, i.e., (cid:206)𝑀
𝑖=1

Q, where ⟨·⟩Q
𝛿(Q𝑖 (q) − Q𝑖).

However, the evaluation of the force term −∇𝑈 (Q) relies on the constraint sampling with respect

to 𝛿(Q (q) − Q), which can be computational expensive. On the other hand, we note that the

instantaneous force F (Q) follows F (Q) = −∇𝑈 (Q) + R (Q), where R (Q) is the zero-mean
fluctuation force. Therefore, we have (cid:10)∥∇ ˜𝑈 (Q; 𝚯) − ∇𝑈 (Q) ∥2(cid:11)
(cid:10)∥R (Q)∥2(cid:11)

Q, where the last term does not involve in the training. Accordingly, we can transform

Q = (cid:10)∥∇ ˜𝑈 (Q; 𝚯) + F (Q) ∥2(cid:11)

+

Q

the training by minimizing the empirical loss

𝐿 =

𝑆
∑︁

𝑀
∑︁

𝑖=1

𝑗=1

(cid:13)
(cid:13)
(cid:13)

∇ ˜𝑈 (Q(𝑖); 𝚯) + F𝑗 (Q(𝑖))

(cid:13)
2
(cid:13)
(cid:13)

,

(2.9)

where the superscript represents the index of 𝑆 configurations. For the three micro-scale models

specified in Sec. 2.2.1, we collect training samples from 50-, 200-, 250-long (in reduced unit)

trajectories from the full MD simulations. 5000 snapshots are used to train the CG potential function

for each case. The networks are trained by the Adam stochastic gradient descent method (Kingma

and Ba, 2015). In particular, we emphasize that all the training samples are collected from thermal

equilibrium states. As shown in Sec. 2.3, the constructed CG potentials naturally encode the

12

many-body and heterogeneous interfacial interactions, which enable us to accurately predict rare

events such as the probability of the void formation and scale-dependent apolar solvation energy.

2.3 Numerical Results

2.3.1 Bulk Fluids

Let us start with the CG model of fluids in bulk. Due to the constraint terms in Eq. (2.5), the

marginal probability density function 𝜌(Q) generally can not be represented in form of the simple

two-point correlation 𝜌(2) (Q𝑖, Q 𝑗 ). Accordingly, the CG potential function 𝑈 (Q) generally exhibits

the many-body nature and can not be exactly constructed in form of the pairwise interaction. This

limitation was verified in earlier studies on the CG modeling of polymeric fluids (Lei et al., 2010;

Hijón et al., 2010), where the CG interactions are constructed based on the pairwise decomposition,

i.e.,

𝑈 (Q) ≈

∑︁

𝑖≠ 𝑗

𝑈 (2) (𝑄𝑖 𝑗 )

d𝑈 (2) (𝑟)
d𝑟

= − (cid:10)F𝑖 𝑗 (Q𝑖 𝑗 ) · e𝑖 𝑗 (cid:11)

,

𝑄𝑖 𝑗 =𝑟

(2.10)

where e𝑖 𝑗 = Q𝑖 𝑗 /𝑄𝑖 𝑗 represents the unit vector between the 𝑖-th and 𝑗-th particle.

To examine the model accuracy, we simulate the CG models with 𝑈 (Q) constructed in form of

both Eq. (2.6) and Eq. (2.10). Fig. 2.1 shows the obtained radial distribution functions (RDFs).

Predictions from the full MD and the reduced model based on the DeePCG potential (2.6) show

good agreement. In contrast, the pairwise CG potential (2.10) yields pronounced over-estimations

of the peak value near 𝑟 = 16 due to the over-simplification of the many-body CG potential using

the two-body interaction; see also Refs. (Lei et al., 2010; Hijón et al., 2010).

The many-body nature of 𝑈 (Q) is also manifested in the angular distribution functions (ADFs)

𝑝(𝜃), where 𝜃 is the angle determined by relative positions of three molecules.

𝑃(𝜃; 𝐴𝑟𝑐) =

(cid:42)

1
𝑊

∑︁

∑︁

∑︁

𝑖

𝑗≠𝑖

𝑘 > 𝑗

(cid:43)

𝛿(𝜃 − 𝜃 𝑗𝑖𝑘 )

(2.11)

where 𝜃 𝑗𝑖𝑘 is the angle between Q 𝑗𝑖 and Q𝑘𝑖, and 𝑊 is a normalization factor. The summation is

over all the triplet 𝑖, 𝑗, 𝑘, such that ∥Q𝑖 − Q 𝑗 ∥ ≤ 𝐴𝑟𝑐 and ∥Q𝑖 − Q𝑘 ∥ ≤ 𝐴𝑟𝑐. Fig. 2.2 shows the

13

ADFs within four different cut-off regimes. Similar to the RDF, predictions of the DeePCG model

agree well with the full MD model while the pairwise approximation yields apparent deviations.

Besides the equilibrium correlations, we further examine the fluid local compressibility. While

this property plays an important role in the nano-scale hydrophobicity, canonical solvation theories

generally refer to the fluids at the proximity of the vapor-liquid coexistent phase. Here we examine

this property of bulk fluids for the validation of the constructed many-body CG potential ˜𝑈 (Q); the

discussion of the apolar solvation energy is postponed to Sec. 2.3.2. Specifically, we examine the

rare event of the void formation in bulk. Following Ref. (Patel et al., 2010), we define the smoothed

molecule number within a probing spherical volume centered at Q𝑐 by

(cid:16)

ˆ𝑛

{Q𝑖}𝑀
𝑖=1

(cid:17)

=

(cid:18)

𝑀
∑︁

𝑖=1

1
2

1 + 2 tanh

(cid:18) 𝑅 − 𝑄𝑖
ℎ

(cid:19)(cid:19)

,

(2.12)

where 𝑅 is the radius of the probing sphere, 𝑄𝑖 = ∥Q𝑖 − Q𝑐 ∥ is the distance between the COM of

molecule 𝑖 (or equivalently, the CG particle) and the spherical center, and ℎ = 1.0 represents the

smooth length.

By Eq. (2.12), particle number ˆ𝑛 is differentiable with respect to the individual molecule position

Q𝑖. Similar to Ref. (Patel et al., 2010), we can probe the probability of the void formation by

establishing a replica of umbrella sampling by imposing the bias potential

𝑈bias( ˆ𝑛; 𝑛 𝑗 ) =

𝑘𝑛
2

( ˆ𝑛 − 𝑛 𝑗 )2,

(2.13)

where 𝑘𝑛 is the magnitude of the bias potential and 𝑛 𝑗 is the target value of the particle number inside

the domain, as shown in Fig. 2.3(a). We set 𝑘𝑛 = 21.9 and establish 40 independent simulations

with 𝑛 𝑗 evenly distributed between 0 and 7.5. For each replica, we collect 8 × 105 samples of ˆ𝑛 from

a 1600-long trajectory. By using the weighted histogram analysis method (Kumar et al., 1992b), we

can stitch the joint probability density 𝜌( ˆ𝑛, 𝑛 𝑗 ) to construct 𝜌( ˆ𝑛). Fig. 2.3(b) shows the probability

density 𝜌( ˆ𝑛) obtained from the full MD and the reduced model. The predictions of the DeePCG

model agree well with the full MD model over the full regime of ˆ𝑛.

Finally, we examine the normalized density fluctuation 𝛿𝑛/⟨𝑛⟩ within a spherical volume of
√︃(cid:10)( ˆ𝑛 − ⟨𝑛⟩)2(cid:11) is the standard

various sizes, where ⟨𝑛⟩ is the average particle number and 𝛿𝑛 =

14

Figure 2.1 Radial distribution function 𝑔(𝑟) of the molecule COM obtained from the full MD
simulation, the CG model using the pairwise force approximation by Eq. (2.10), and the DeePCG
model.

deviation. Specifically, we define the particle number by Eq. (2.12) with two different smooth length

ℎ = 1.0 and ℎ = 0.1, respectively. The latter case essentially represents each molecule as a simple

point and counts the particle number as integers, and therefore, yields larger density fluctuations.

As shown in Fig. 2.3(c), the full MD and CG model show good agreement for both cases, indicating

that the CG model can faithfully capture the high-order correlations and the local compressibility

beyond the continuum thermodynamic limit.

2.3.2 Single-component Interfacial Fluids

Besides the many-body interactions, another hallmark of interfacial fluids is the heterogeneous

molecular distribution across the fluid interface, which leads to scale-dependent interfacial inter-

actions and fluctuations. On the macro-scale level, the interfacial interactions can be generally

described by continuum models such as the Young-Laplace equation (Rowlinson and Widom, 2002);

the apolar solvation energy is proportional to the interfacial area and characterized by the surface

tensor. However, on the length scale comparable to the correlation length of the fluid molecules,

the interfacial energy often exhibits a cross-over regime representing the volume-dependent to

area-dependent scaling transition. Therefore, the meso-scale interfacial energy provides a crucial

metric to validate the accuracy of the CG model.

First, we examine the interfacial thermal fluctuations. With the micro-scale model specified in

Sec. 2.2.1, the fluid molecule interaction consists of both the short-range repulsion and long-range

15

010203040r00.511.52g(r)Full MDPairwise CGDeePCGFigure 2.2 Angular distribution function 𝑝(𝜃) of the molecule COM obtained from the full MD
simulation, the pairwise CG model and the DeePCG model with different cut-off regimes 𝐴𝑟𝑐.

attraction. Under the thermal equilibrium states, the fluid system exhibits the fluid-void interfaces

near 𝑧 = 20 and 𝑧 = 180. The periodic boundary condition is imposed along the 𝑥- and 𝑦-direction.

To quantify the molecule distribution near the interface at 𝑧 = 𝑧0, we define the smoothed density

field 𝜌𝑠 (R) by

𝜌𝑠 (R) =

𝑀
∑︁

𝑖=1

𝑊 (∥R − Q𝑖 ∥, ℎ)

(2.14)

on the 𝑁𝑥 × 𝑁𝑦 × 𝑁𝑧 lattice grids. Specifically, R(𝑖, 𝑗,𝑘) := (𝑥𝑖, 𝑦 𝑗 , 𝑧𝑘 ), where (𝑥𝑖, 𝑦 𝑗 ) = (𝑖, 𝑗) × 𝑑𝑙,
𝑑𝑙 = 𝐿/𝑁𝑥 and 𝑧𝑘 = 𝑧0 − ℎ + 𝑘 × 𝑑𝑧, 𝑑𝑧 = 2ℎ/𝑁𝑧. Q𝑖 represents the COMs of the neighboring
molecules for each grid point. 𝑊 (𝑟, ℎ) represents the quintic spline kernel function (Morris et al.,

1997) with finite support ℎ. In this study, we set ℎ = 30.0, 𝑑𝑙 = 1.8 and 𝑑𝑧 = 0.2.

The smoothed density field 𝜌𝑠 (R) enables us to define the instantaneous surface (IS) height

16

00.20.40.60.8100.20.40.60.81P()Arc=13Full MDPairwise CGDeePCG00.20.40.60.8100.20.40.60.81P()Arc=20Full MDPairwise CGDeePCG00.20.40.60.8100.10.20.30.40.50.6P()Arc=30Full MDPairwise CGDeePCG00.20.40.60.8100.10.20.30.40.50.6P()Arc=35Full MDPairwise CGDeePCG(a)

(c)

(b)

(d)

Figure 2.3 The density fluctuation and the molecule number distribution within a spherical probing
volume. (a) A sketch of the star polymer with the black atom as the center. Atoms in the same
arm have the same color. The transparent particle represents the coarse-grained molecule. (b)
A sketch of the instantaneous molecule position under bias potential (2.13). The iso-surface in
blue color represents the interface of the void space. (c) The probability density function of the
molecule number within a spherical volume of radius 𝑅 = 16.0. The vertical dashed line represents
the average molecule number under equilibrium. (d) The normalized density fluctuations within a
spherical volume of radius 𝑅 between 8.0 and 16.0. The particle number is defined by Eq. (2.12)
with the resolution length ℎ set to be 0.1 (solid lines) and 1.0 (dashed lines).

˜ℎ(𝑥, 𝑦) as the iso-surface of the fluid density (Willard and Chandler, 2010), i.e.,

𝜌𝑠 (𝑥, 𝑦, ˜ℎ(𝑥, 𝑦)) = 𝜌0/2,

(2.15)

where 𝜌0 is the bulk fluid density, as shown in Fig. 2.4(a). Accordingly, we can compute the IS

density distribution ˜𝜌(𝑧) along the 𝑧-direction, where the reference position is chosen to be ˜ℎ(𝑥, 𝑦)

for each grid point (𝑥, 𝑦). As shown in Fig. 2.4(c), ˜𝜌(𝑧) exhibits apparent oscillations across the

instantaneous surface. The peaks near 𝑧 = 6 and 𝑧 = 16 represent the first and the second layer of

the fluid molecule near the interface. Alternatively, we can compute the density distribution 𝜌(𝑧)

with respect to the plane at the average of the instantaneous height ⟨ ˜ℎ(𝑥, 𝑦)⟩, i.e., the Gibbs dividing

17

02468n-35-30-25-20-15-10-50log P(n)Full MDDeePCG0.010.020.030.040.05R-3/20.10.20.30.40.50.60.7n/<n>Full MD dh=1.0DeePCG dh=1.0Full MD dh=0.1DeePCG dh=0.1surface. Different from ˜𝜌(𝑧), 𝜌(𝑧) shows a smooth transition from 0 to the bulk value across the

interface. For both definitions, the predictions from the CG model agree well with the full MD

simulations. We emphasize that the learning of the DeePCG potential does not involve any human

intervention such as the definitions of the density field and the interface height. The consistent

predictions between the MD and CG models validate that constructed DeePCG potential ˜𝑈 (Q; 𝚯)

faithfully captures the intrinsic fluid structure near the interface.

(a)

(c)

(b)

(d)

Figure 2.4 The fluid density and the fluctuating interface of the single-component interfacial fluid
system. (a) The interface defined by Eq. (2.15) with molecules (red) and interface (green). (b)
A sketch of the instantaneous density field defined by Eq. (2.14). (c) The average density profile
across the Gibbs dividing surface (GDS) and the instantaneous interface (IS) defined by Eq. (2.15).
(d) The ensemble average of the capillary wave spectrum of the fluctuating interface. The solid line
in red represents the CWT fitting using Eq. (2.17) at the low wave number.

To further examine the interfacial fluctuations, we evaluate the Fourier spectrum of the

instantaneous height ˜ℎ(𝑥, 𝑦), i.e.,

ˆℎ(k) =

1
𝐿2

∫ 𝐿

∫ 𝐿

0

0

˜ℎ(𝑥, 𝑦)𝑒−𝑖𝑘 𝑥𝑥−𝑖𝑘 𝑦 𝑦d𝑥d𝑦,

(2.16)

where k = (𝑘𝑥, 𝑘 𝑦) is the 2D wave number. Fig. 2.4 shows the ensemble average of the spectrum
(cid:10)| ˆℎ(k)|2(cid:11). On the low wave number limit, the interfacial energy is governed by the surface tensor

18

00.10.20.30.40.50.60.70.80.91-10010203040z00.511.52Full MD ISDeepCG ISFull MD GDSDeepCG GDS10-1100|k|10-5100Full MDCWTDeePCGwith equi-partition distribution among the individual Fourier modes following the capillary wave

theory (CWT) (Buff et al., 1965; Evans, 1979), i.e.,

(cid:68)(cid:12)
2(cid:69)
(cid:12) ˆℎ(k)(cid:12)
(cid:12)

=

𝑘 𝐵𝑇 𝐿2
𝛾|k|2

,

(2.17)

where 𝛾 is the surface tension. At low wave number, (cid:10)| ˆℎ(k)|2(cid:11) obtained from numerical simulations

shows good agreement with the CWT theory. As the wave number increases, the spectrum gradually

deviates from the CWT prediction, indicating that there exists strong correlations between the height

fluctuations of neighboring sites on the molecular scales. Nevertheless, the predictions from the CG

model agree well with the MD results over the entire wave number regime. In particular, the good

agreement in the high wave number regime shows that the CG model can accurately capture the

local roughness of the interface, which is extremely sensitive to the molecule spatial correlations

and the many-body interactions.

Next, we examine the meso-scale, size-dependent apolar solvation energy. Similar to the bulk

system considered in Sec. 2.3.1, we examine the probability density function of the number of

molecule 𝑃( ˆ𝑛) within a spherical volume of radius 𝑅 = 25.0. As shown in Fig. 2.5(a), the predictions

from the full MD and the CG model agree well over the full regime of ˆ𝑛. In particular, at the

quasi-equilibrium regime, the interfacial energy is mainly determined by the fluid compressibility;

𝑃( ˆ𝑛) and ˆ𝑛 follow the quadratic relationship, i.e., 𝑃( ˆ𝑛) ∝ ( ˆ𝑛 − ⟨𝑛⟩)2/𝛿𝑛2. Since both ˆ𝑛 and 𝛿𝑛2

scale with the volume, the free energy −𝑘 𝐵𝑇 ln 𝑃( ˆ𝑛) scale with the volume near ⟨𝑛⟩. In contrast,

𝑃( ˆ𝑛) deviates from the quadratic relationship as ˆ𝑛 decreases and yields a larger value of 𝑃(0).

The fat tail arises from the formation of a clear void-fluid interface. In particular, on the scale

beyond the correlation length of fluid molecules, the local molecular reorganization is insufficient to

accommodate the phase separation. Accordingly, the interfacial energy scales with the surface area

of the void space.

The multi-faceted nature of the interface energy can be further examined by computing the

apolar solvation free energy Δ𝐺 = −𝑘 𝐵𝑇 ln 𝑃(0) for the different sizes of the void space. By the

theory of Pratt and his co-worker (Hummer et al., 1996), for the small void space, Δ𝐺 is governed

19

by the molecule number fluctuations with the Gaussian distribution, i.e.,

Δ𝐺 ≈

1
2

𝑘 𝐵𝑇 ˆ𝑛2/𝛿𝑛2 +

1
2

𝑘 𝐵𝑇 ln 2𝜋𝛿𝑛2,

(2.18)

where ˆ𝑛2/𝛿𝑛2 scales with the space volume 4/3𝜋𝑅3. On the large scale, Δ𝐺 is determined by the

macro-scale surface tensor 𝛾, i.e., Δ𝐺 ≈ 4𝜋𝑅2𝛾.

(a)

(b)

Figure 2.5 (a) The probability density function of the molecule number within a spherical volume
of radius 𝑅 = 25.0. The red line represents the quadratic fitting; the deviation near 𝑛 = 0 arises
from the formation of a clear fluid-void interface, where free energy approximately scales with the
area of the interface. (b) Normalized solvation free energy Δ𝐺 (𝑅)/4𝜋𝑅2 obtained from the thermal
integration sampling by Eq. (2.19). The transition from the volume- to area-scaling occurs between
𝑅 = 15 and 25. The two symbols represent the predictions from the probability of the void space
−𝑘 𝐵𝑇 ln 𝑃(0) for 𝑅 = 25 in (a). The dashed horizontal line represents the macro-scale limit with
the surface tensor 𝛾 obtained from the fluctuating interface using CWT (Eq. (2.17)) presented in
Fig. 2.4.

To quantify the cross-over regime, we conduct the thermal integration sampling of Δ𝐺 (𝑅) with

𝑅 between 0 and 34. The integration force dΔ ˜𝐺

d𝑅 is estimated by imposing the biased potential, i.e.,

dΔ𝐺 (𝑅)
d𝑅

=

(cid:42) 𝑀
∑︁

𝑖=1

∇Q𝑖𝑈𝑏𝑖𝑎𝑠 ·

Q𝑖 − Q𝑐
∥Q𝑖 − Q𝑐 ∥

(cid:43)

,

(2.19)

where 𝑈bias( ˆ𝑛; 0) is defined by Eq. (2.13) with 𝑘𝑛 = 29.20 and ℎ = 0.4. Fig. 2.5(b) shows the

obtained solvation energy Δ𝐺 (𝑅) normalized by the surface area. The predictions of the CG and the

full MD models show good agreement. In particular, at small value of 𝑅, Δ𝐺 (𝑅)/4𝜋𝑅2 grows with

𝑅 and implies the volume-scaling regime. The transition from the volume- to the area-scaling occurs

between 𝑅 = 15 and 𝑅 = 25. For 𝑅 > 30, Δ𝐺 (𝑅)/4𝜋𝑅2 approaches the value of the macro-scale

surface tensor 𝛾 estimated from the interfacial fluctuations by the CWT theory (2.17) shown in Fig.

2.4.

20

010203040n-150-100-500log P(n)Full MDQuadratic FitDeePCG51015202530R00.0050.010.0150.020.0250.03G(R)/(4R2)Full MDDeePCG(a)

(b)

Figure 2.6 The average equilibrium fluid density with a distance 𝑟 + 𝑅, where 𝑅 is the radius of the
spherical void space with 𝑅 = 10 (left) and 𝑅 = 30 (right).

The scale-dependent interfacial energy is also manifested in the solvent density distribution

near the vicinity of the void space. Fig. 2.6 shows the normalized radial distribution function

𝑔(𝑟 + 𝑅) adjacent to the interface. For 𝑅 = 10, solvation is governed by the local compressibility

and molecule re-organization, leading to the high fluid density adjacent to the interface. For 𝑅 = 30,

solvation leads to the clear fluid-void interface and fluid density is closer to the bulk value. The CG

model accurately captures the transition and agrees well with the full MD results for both cases.

2.3.3 Two-component Fluids

We first consider a two-component fluid system that takes the parameter set (I) specified in

Sec. 2.2.1. Therefore, the full MD system can maintain a full mixture state. The reduced model is

represented by the CG particles of two different types. The equilibrium state reaches a full mixture

state as well. Fig. 2.7 shows the radial distribution functions of the COM of the molecules. Due to

the “hydrophilic” interactions between type-1 and -2 molecules, the pair distribution between type

1-2 shows more a pronounced peak at 𝑅 = 11.5 as compared with the distribution between type

1-1 at 𝑅 = 12.5. Similar to Sec. 2.3.1, we compute the angular distribution functions among the

molecules of both types. For all of the correlation functions, the CG and full MD models show

good agreement.

Next, we consider the parameter set (II) specified in Sec. 2.2.1. Due to the “hydrophobic”

interaction between the two molecule types, the system develops into an immiscible state with a

clear interface between the two components, as shown in Fig. 2.8(a). To examine the heterogeneous

21

051015202530r0.511.522.533.5g(R+r)R=10Full MDDeePCG051015202530r0.511.522.533.5g(R+r)R=30Full MDDeePCG(a)

(b)

Figure 2.7 (a) Radial distribution function 𝑔(𝑟) of the two-component, miscible polymer fluid system
among the molecule COM of type 1-1, type 1-2. (b) Angular distribution function 𝑃(𝜃) of the same
system among the molecule COM of type 1.

fluid particle distribution, we analyze the radial distribution functions of the fluid particle on the 𝑥-𝑦

plane at different regimes. Fig. 2.8(b) shows the planar RDFs sampled at 𝑧 = 60 (interface) and

𝑧 = 30 (bulk). In particular, the planar RDF near the interfacial regime shows more pronounced

peaks and structural oscillations compared with the RDF in the bulk regime. For both cases, the

predictions from the CG model show good agreement with the full MD simulations.

To further quantify the fluid density across the interface, we define the density field 𝜌𝑠 (R) by

Eq. (2.14) on the lattice grids across the average interface of the two components (i.e., GDS) and

the instantaneous height ˜ℎ(𝑥, 𝑦) as the iso-surface of the fluid density of a single component (i.e.,

IS). For this system, we set ℎ = 40, 𝑑𝑙 = 2.0 and 𝑑𝑧 = 1.0. Fig. 2.8(c) shows the density profiles

˜𝜌(𝑧) and 𝜌(𝑧) across the interface based on the definition of IS and GDS, respectively. Similar to

the single-component fluid system, ˜𝜌(𝑧) shows pronounced oscillations that represent the intrinsic

multi-layer fluid structure across the interface. In contrast, 𝜌(𝑧) shows a smooth transition across the

interface due to the ensemble-averaged definition of the interface plane. The consistent predictions

between the MD and CG models validate the accuracy of the constructed DeePCG potential.

Finally, we examine the thermal fluctuations across the interface. Fig. 2.8(d) shows the ensemble
average of the Fourier spectrum density (cid:10)| ˆℎ(k)|2(cid:11) of the instantaneous height ˜ℎ(𝑥, 𝑦) defined by
Eq. (2.16). Similar to the single-component interfacial fluid system, (cid:10)| ˆℎ(k)|2(cid:11) agrees well with the

CWT theory at the low wave number and deviates from the 1/|k|2 scaling at high wave number

22

010203040r00.511.52g(r)Full MDFull MDDeePCG, type 1-1DeePCG, type 1-200.511.522.5300.10.20.30.40.50.60.7P()Arc=18Full MDFull MDFull MDDeePCG, 1-1-1DeePCG, 1-1-2DeePCG, 2-1-2(a)

(c)

(b)

(d)

Figure 2.8 The fluid density and the fluctuating interface of the two-component, immiscible fluid
system. (a) The interface defined by Eq. (2.16) with type-1 (blue) and type-2 molecules (red). (b)
Radial distribution function 𝑔(𝑟) of type-2 molecules on the 𝑥-𝑦 plane near the bulk (𝑧 = 30) and
the interface (𝑧 = 60). (c) The average density profile across the Gibbs dividing surface and the
instantaneous surface defined by Eq. (2.15). (d) The capillary wave spectrum of the fluctuating
interface.

due to the local spatial correlations between the molecules. The predictions from CG and full MD

models show good agreement over the full regime.

2.4 Summary

In this study, we constructed coarse-grained models of meso-scale interfacial polymeric fluids

based on the DeePCG scheme (Zhang et al., 2018b). In particular, the constructed CG potential can

accurately encode the many-body interactions arising from the unresolved atomistic interactions, as

well as the heterogeneous molecule distributions near the interface. This unique feature ensures that

the constructed CG models can retain the consistent invariant distribution with the full MD model

and faithfully capture the multi-facted, scale-dependent interfacial energy without additional human

intervention. The training process only requires the MD samples of the instantaneous force field

without further ad hoc assumptions and approximations of the CG potential functions.

23

010203040r00.511.522.5g(r)Full MD BulkFull MD InterfaceDeePCG BulkDeePCG Interface-5051015202530z00.511.522.5Full MD ISDeePCG ISFull MD GDSDeePCG GDS10-1100|k|10-1010-5100Full MDDeePCGWhile we focus on the polymeric fluids in this study, the present CG models can be generalized

for complex fluids and soft matter systems where the many-body and heterogeneous effects are

often pronounced. In particular, the constructed CG potential functions accurately reproduce the

pairwise and high-order correlation functions while the empirical approximations show limitations.

Moreover, the accurate predictions of the local compressibilty and the full-range spectrum of the

interfacial fluctuations demonstrate the validity of the CG models to probe the collective behaviors

across the molecular and continuum scales. More importantly, the CG models successfully predict

the probability of the void formation as a rare event and the transition of the volume- to area-scaling

of solvation energy. The accurate predictions on such properties show the promise of the present

models to study the challenging problems relevant to nanoscale assembly processes (Miller et al.,

2007), where the full MD simulations often show limitation to achieve the resolved spatio-temporal

scale.

Finally, we note that the present study focuses on the quasi-equilibrium properties of the reduced

model. The zero-rate shear viscosity predicted by the DeePCG model is 55.56% less than the value of

the full MD model. The predictive modeling of the dynamic properties further relies on the accurate

construction of the memory and fluctuation terms that represent the unresolved energy-dissipation

processes (Hijón et al., 2010; Lei et al., 2016; Lei and Li, 2021; She et al., 2023). Also, it is worth

exploring the construction of CG potential function with certain generalization abilities that account

for the different temperature (Zhang et al., 2020) and model resolution (Empereur-mot et al., 2022).

We will pursue these problems in future studies.

24

CHAPTER 3

DATA-DRIVEN LEARNING OF THE STATE-DEPENDENT MEMORY KERNEL

In this chapter, we focus on how to construct the memory kernel. In Ge et al. (2024), we present

a data-driven method to learn stochastic reduced models of complex systems that retain a state-

dependent memory beyond the standard generalized Langevin equation (GLE) with a homogeneous

kernel. The constructed model naturally encodes the heterogeneous energy dissipation by jointly

learning a set of state features and the non-Markovian coupling among the features. Numerical results

demonstrate the limitation of the standard GLE and the essential role of the broadly overlooked

state-dependency nature in predicting molecule kinetics related to conformation relaxation and

transition.

3.1

Introduction

Predicting the collective behavior of complex multiscale systems is often centered around

projecting the full-dimensional dynamics onto a set of resolved variables. However, an accurate

construction of such a reduced model remains a practical challenge for real applications such as

molecular modeling. While model reduction frameworks such as the Koopman operator (Koopman,

1931) and the Mori-Zwanzig projection formalism (Mori, 1965; Zwanzig, 1961) enable us to write

down the dynamic equations in terms of the resolved variables, the reduced model generally becomes

non-Markovian with a memory term that may further depend on the resolved variables; the direct

numerical evaluation involves solving the expensive full-dimensional orthogonal dynamics. In

practice, one common approximation is to ignore such state-dependency; the reduced model is

simplified as the standard generalized Langevin equation (GLE) (Zwanzig, 2001) with a memory

kernel that only depends on time. Several approaches (Lange and Grubmüller, 2006; Darve et al.,

2009; Ceriotti et al., 2009; Baczewski and Bond, 2013; Davtyan et al., 2015; Lei et al., 2016; Russo

et al., 2019; Jung et al., 2017; Lee et al., 2019; Ma et al., 2019; Wang et al., 2020; Zhu and Venturi,

2020; Klippenstein and van der Vegt, 2021; Vroylandt et al., 2022; She et al., 2023; Xie et al., 2022)

have been developed to construct the memory kernel such that certain dynamic properties (e.g., the

two-point correlations) can be properly reproduced. Despite its broad application, the validity of

25

the standard GLE for real multiscale systems remains less understood (Hänggi, 1997; Klippenstein

et al., 2021).

Intuitively, the above model reduction problem is somewhat analogous to hiking on a mountain

where the landscape map and the path roughness represent the free energy and the memory term,

respectively. In general, we should not expect homogeneous path roughness at the different locations

(e.g., the valleys and the ridges), which, conversely, needs to be inferred from the hiking records.

Indeed, studies based on full molecular dynamics (MD) simulations (Posch et al., 1984; Straub et al.,

1987, 1990; Plotkin and Wolynes, 1998; Luo et al., 2006; Best and Hummer, 2006, 2010; Hinczewski

et al., 2010; Satija et al., 2017; Morrone et al., 2012; Daldrop et al., 2017) and sophisticated projection

operator construction (Deutch and Oppenheim, 1971; Zwanzig, 1973, 1992; Berezhkovskii and

Szabo, 2011; Glatzel and Schilling, 2022; Vroylandt, 2022; Vroylandt and Monmarché, 2022; Ayaz

et al., 2022a; Jung and Jung, 2023) show that the extracted memory term can exhibit a pronounced

state-dependent nature, where the implications for the collective behaviors remain under-explored.

For extensive MD systems, a recent study (Lyu and Lei, 2023) on reduced modeling of polymer

melt shows that the heterogeneous inter-molecular energy dissipation (i.e., the memory) can be

crucial for transport on the hydrodynamic scale. However, for canonical non-extensive problems

such as biomolecule systems, a quantitative understanding of the state-dependent memory effect

on the reduced dynamics remains an open problem. Several recent works (Lei et al., 2016; Lee

et al., 2019; Satija and Makarov, 2019; Grogan et al., 2020; Singh et al., 2021; Ayaz et al., 2021;

Vroylandt et al., 2022; Dalton et al., 2023) model the non-Markovian effect for transition dynamics

based on the standard GLE. While elegant semi-analytical studies (Straub et al., 1988; Singh et al.,

1990; Carmeli and Nitzan, 1983; Tarjus and Kivelson, 1991; Krishnan et al., 1992; Voth, 1992;

Straus et al., 1993; Haynes et al., 1993, 1994; Cossio et al., 2015) on idealized 1D double-well

potential provide theoretical insights into the state-dependent nature, quantitative modeling that

retains the reduced dynamics consistent with the full MD model, including collective properties such

as transition and conformation relaxation, relies on accurate construction and efficient simulation of

a reduced model beyond the standard GLE.

26

This work presents a data-driven approach for learning a new stochastic reduced model that

retains a state-dependent memory for non-extensive systems. Instead of dealing with the orthogonal

dynamics (Darve et al., 2009; Vroylandt and Monmarché, 2022; Lyu and Lei, 2023), the training

only relies on the trajectory samples and does not directly solve the Mori-Zwanzig projection

formalism. The main idea is to seek a generalized representation of the memory as the composition

of a set of state-dependent features, which encodes the coupling between the resolved and unresolved

variables and will be learned using three-point correlation functions. Efficient training is achieved

by constructing the encoders using a set of sparse bases, whose correlations can be efficiently

pre-computed. The time-dependent component is directly learned in the Fourier space which enables

the efficient evaluation of the convolution term via the FFT and meanwhile ensures non-negative

energy dissipation (i.e., model stability). To simulate the model, coherent noise can be introduced

that strictly satisfies the second fluctuation-dissipation theorem (FDT) and retains a consistent

invariant distribution. The present model, with a new memory form, essentially reveals a caveat in

model reduction of multiscale systems and provides a reliable approach for simulating the stochastic

reduced dynamics beyond empirical models. It enables us to probe open problems such as the

effect of state-dependent memory on molecular kinetics. Numerical results show that the broadly

overlooked state-dependency can play a crucial role. In particular, the standard GLE is insufficient

to capture the collective properties such as conformation relaxation and transition rate distribution,

which, fortunately, can be reproduced by the present model.

3.2 Model Derivation

Let (q, p) ∈ R2𝑚 represent the resolved variables of a high-dimensional Hamiltonian system,

where q denotes the coarse-grained (CG) coordinates as a function of the position variables of the

full model, and p denotes the CG momenta. Following the Zwanzig’s formalism (Zwanzig, 2001;

Hijón et al., 2010), the reduced dynamics takes the form

(cid:164)q = M−1p,

(cid:164)p = −∇𝑈 (q) −

∫ 𝑡

0

K(q(𝜏), 𝑡 − 𝜏)v(𝜏)d𝜏 + R𝑡,

(3.1)

27

where M is the mass matrix, 𝑈 (q) is the free energy, v := (cid:164)q is the velocity, K(q, 𝑡) is the memory,

and R𝑡 is the noise whose covariance function is related to the memory following the second FDT

(Vroylandt and Monmarché, 2022). Before proceeding to the construction of K(q, 𝑡), we note that

the rigorous form based on Zwanzig’s formalism depends on both q and p. Here we focus on the

state-dependence on q and assume it is independent of p (Hijón et al., 2010). Furthermore, M

generally depends on q; the current choice of q leads to a constant mass matrix [see Refs. (Lee

et al., 2019; Ayaz et al., 2022a) and Section 3.5.1]. Also, the construction of the free energy 𝑈 (q)

can be nontrivial; several canonical methods based on enhanced sampling (Torrie and Valleau,

1977; Kumar et al., 1992a; Darve and Pohorille, 2001; Laio and Parrinello, 2002) and temperature

acceleration (Rosso et al., 2002; Maragliano and Vanden-Eijnden, 2006; Abrams and Tuckerman,

2008; Maragliano and Vanden-Eijnden, 2008) have been developed to facilitate the phase space

exploration. We assume the phase space can be effectively explored and 𝑈 (q) is known a priori.

Instead of rigorously constructing K(q, 𝑡) from the full model, we ask the question of which

forms of K can generate a memory effect. One common approach is to embed the memory in a

larger Markovian dynamics with a set of auxiliary variables. An essential observation is that the

memory term can be generally written as

K(q(𝜏), 𝑡 − 𝜏) ≈ C+ ◦ exp (cid:0)(𝑡 − 𝜏)Laux

(cid:1) ◦ C−,

(3.2)

where Laux is the Liouville operator corresponding to the auxiliary dynamics and C± are channels

representing the coupling of the resolved and auxiliary variables. As a special case, if the coupling

and the auxiliary dynamics take a linear form, the embedded memory recovers the standard GLE

kernel, i.e., K(q, 𝑡) = K(𝑡) (e.g., see Refs. (Lei and Li, 2021; She et al., 2023)). Therefore, to

construct the reduced model beyond the standard GLE, the coupling channels need to properly retain

certain kinds of state-dependency nature. This motivates us to represent C± by seeking a set of
state-dependent features 𝜙(q) = [𝜙1(q), · · · , 𝜙𝑛 (q)], where 𝜙 : R𝑚 → R𝑛×𝑚 essentially encode the

nonlinear coupling between the resolved and unresolved variables and the detailed form will be

specified later. exp (𝑡Laux) induces the non-Markovian interactions among the features with a time
lag 𝑡 characterized by a kernel function, i.e., C+ ◦exp ((𝑡 − 𝜏)Laux) ◦ C− = 𝜙(q(𝑡))𝑇 Θ(𝑡 −𝜏)𝜙(q(𝜏)),

28

where Θ : R+ → R𝑛×𝑛 and component Θ𝑖 𝑗 (𝑡 − 𝜏) represents the dissipation between features 𝜙𝑖 (q(𝑡))

and 𝜙 𝑗 (q(𝜏)). In the remainder of this work, we use 𝜙𝑡 to denote 𝜙(q(𝑡)).

With the above observation, we propose the following form to model the reduced dynamics

(3.1), i.e.,

(cid:164)q = M−1p,

(cid:164)p = −∇𝑈 (q) −

∫ 𝑡

0

𝜙𝑇
𝑡 Θ(𝑡 − 𝜏)𝜙𝜏v(𝜏)d𝜏 + R𝑡,

(3.3)

where encoders {𝜙𝑖 (q)}𝑛

𝑖=1 and kernel Θ(𝑡) need to be determined. As a special case, at the
Markovian limit Θ(𝑡) ∝ 𝛿(𝑡), Eq. (3.3) recovers the Langevin dynamics and the quadratic form

𝜙𝑇 𝜙 ensures positive energy dissipation. Also, by choosing Θ(𝑡) to be diagonal with individual

components corresponding to certain frequency modes, Eq. (3.3) reduces to the heat bath model

(Zwanzig, 1973) with a nonlinear coupling of bath coordinates. On the other hand, the present

model enables an adaptive choice of the number of spatial features and a more general form of Θ(𝑡)

with the off-diagonal components capturing the non-Markovian coupling among the features, which

turns out to be crucial for reproducing the collective dynamics.

3.2.1 Coherent Noise and Invariant Density of the Reduced Model

We emphasize that Eq. (3.3) should not be viewed as a direct approximation of Zwanzig’s

projection formalism. Rather, it serves as a reduced model that faithfully retains the state-dependent
memory effect. To construct the model, we represent encoders {𝜙𝑖 (q)}𝑛

𝑖=1 and kernel Θ(𝑡) in form of

𝜙𝑖 (q) = H𝑇

𝑖 𝜓(q),
𝑁 𝜔∑︁

Θ(𝑡) = e−𝛼𝑡

ˆΘ𝑘 cos(𝜔𝑘𝑡),

(3.4)

𝑘=0

where 𝜓(q) = (cid:2)𝜓1(q), · · · , 𝜓𝑁𝑏 (q)(cid:3) is a set of sparse bases and H = (cid:2)H𝑇
coefficients, 𝜔𝑘 = 2𝜋
𝑇𝑐
not be viewed as the bases to approximate Θ(𝑡) (e.g., (cid:8)e−𝛼𝑖𝑡 cos(𝛽𝑖𝑡), e−𝛼𝑖𝑡 sin(𝛽𝑖𝑡)(cid:9) 𝑁 𝛼

(cid:3) are trainable
𝑘 and 𝑇𝑐 is the time domain cut-off of the kernel. We note that e−𝛼𝑡 should

𝑖=1; see Refs.
(Lei et al., 2016; Lee et al., 2019)). Rather, Θ(𝑡) is mainly characterized by the Fourier series

, · · · , H𝑇
𝑛

1

expansion on [0, 𝑇], and the exponential term e−𝛼𝑡 is essentially a regularization term to eliminate

29

the periodicity while maintaining the semi-positive definiteness condition. Θ(𝑡) needs to preserve
𝑘 , where Γ𝑘 ∈ R𝑛×𝑛 is a

positive semi-definiteness. Hence, we represent Fourier modes ˆΘ𝑘 = Γ𝑘 Γ𝑇

low-triangular matrix to be determined along with 𝛼 ≥ 0.

For the fluctuation term R𝑡, we represent it as a noise in the form of R𝑡 = 𝜙𝑇

𝑡 (cid:101)R(𝑡), where (cid:101)R(𝑡)
is a Gaussian random process whose covariance function determined by Θ(𝑡), i.e., ⟨(cid:101)R(𝑡)(cid:101)R(𝜏)𝑇 ⟩ =
𝑘 𝐵𝑇Θ(𝑡 − 𝜏). This choice avoids dealing with the orthogonal dynamics to calculate the fluctuation

term. Furthermore, we can show that this choice enables the reduced model to retain a consistent

invariant density function.
Proposition 3.2.1. For reduced model (3.3) with Θ(𝑡) = e−𝛼𝑡 (cid:205)𝑁 𝜔
𝑘=0
fluctuation term R𝑡 = 𝜙𝑇

𝑡 (cid:101)R(𝑡), where (cid:101)R(𝑡) is a Gaussian random process satisfying

ˆΘ𝑘 cos(𝜔𝑘𝑡), by choosing the

⟨(cid:101)R(𝑡)(cid:101)R(𝜏)𝑇 ⟩ = 𝑘 𝐵𝑇Θ(𝑡 − 𝜏),

the reduced model has an invariant distribution

𝜌eq(q, p) ∝ exp (cid:8)− (cid:2)𝑈 (q) + p𝑇 M−1p/2(cid:3) /𝑘 𝐵𝑇 (cid:9) .

Proof. Let us introduce auxiliary variables

z𝑘,1 = −

z𝑘,2 = −

∫ 𝑡

0
∫ 𝑡

0

e−𝛼(𝑡−𝜏)Γ𝑘 cos(𝜔𝑘 (𝑡 − 𝜏))𝜙𝜏v(𝜏)d𝜏 + R𝑘,1(𝑡),

e−𝛼(𝑡−𝜏)Γ𝑘 sin(𝜔𝑘 (𝑡 − 𝜏))𝜙𝜏v(𝜏)d𝜏 + R𝑘,2(𝑡),

where Γ𝑇

𝑘 Γ𝑘 = ˆΘ𝑘 and R𝑘,1(𝑡) is a Gaussian random process satisfying

(3.5)

(3.6)

(3.7)

(cid:10)R 𝑗,1(𝑡)R𝑘,1(𝜏)𝑇 (cid:11) = 𝑘 𝐵𝑇 𝛿 𝑗 𝑘 e−𝛼(𝑡−𝜏) cos(𝜔𝑘 (𝑡 − 𝜏)),

(3.8)

where 𝛿 𝑗 𝑘 is the Kronecker delta. Accordingly, the second equation of Eq. (3.3) can be written as

(cid:164)p = −∇𝑈 (q) + 𝜙(q)𝑇 ∑︁

Γ𝑇
𝑘 z𝑘,1,

𝑘

(3.9)

and R 𝑗,2(𝑡) will be specified later.

30

Let z𝑘 = (cid:2)z𝑘,1, z𝑘,2

(cid:3) and R𝑘 = (cid:2)R𝑘,1, R𝑘,2

(cid:3), we can rewrite Eq. (3.7) by

z𝑘 = −

∫ 𝑡

0

cos(𝜔𝑘 (𝑡 − 𝜏))𝐼

e−𝛼(𝑡−𝜏) (cid:169)
(cid:173)
(cid:173)
− sin(𝜔𝑘 (𝑡 − 𝜏))𝐼
(cid:171)
−𝛼𝐼 𝜔𝑘 𝐼

∫ 𝑡

exp

= −

(cid:170)
(cid:174)
(cid:174)
(cid:172)
By taking the time derivative of Eq. (3.10) with respect to 𝑡, we have

d𝜏 + R𝑘 (𝑡).

−𝜔𝑘 𝐼 −𝛼𝐼

(𝑡 − 𝜏)

(cid:169)
(cid:173)
(cid:173)
(cid:171)

(cid:170)
(cid:174)
(cid:174)
(cid:172)

0

0

Γ𝑘 𝜙𝜏v(𝜏)



(cid:169)

(cid:173)

(cid:173)


(cid:171)










sin(𝜔𝑘 (𝑡 − 𝜏))𝐼

cos(𝜔𝑘 (𝑡 − 𝜏))𝐼

Γ𝑘 𝜙𝜏v(𝜏)

0

(cid:170)
(cid:174)
(cid:174)
(cid:172)

(cid:170)
(cid:174)
(cid:174)
(cid:172)

(cid:169)
(cid:173)
(cid:173)
(cid:171)

d𝜏 + R𝑘 (𝑡)

(3.10)

dz𝑘
d𝑡

= (cid:169)
(cid:173)
(cid:173)
(cid:171)
(cid:124)

−𝛼𝐼 𝜔𝑘 𝐼

−𝜔𝑘 𝐼 −𝛼𝐼
(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)
(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)(cid:32)

(cid:123)(cid:122)
≜J

(cid:170)
(cid:174)
(cid:174)
(cid:172)
(cid:125)

z𝑘 − (cid:169)
(cid:173)
(cid:173)
(cid:171)

Γ𝑘 𝜙𝜏v(𝑡)

0

+

dR𝑘
d𝑡

(cid:170)
(cid:174)
(cid:174)
(cid:172)

− JR𝑘 (𝑡).

(3.11)

Furthermore, we note that R𝑘 (𝑡) can be modeled as a generalized Ornstein–Uhlenbeck process and

dR𝑘

d𝑡 − JR𝑘 (𝑡) can be represented by

dR𝑘
d𝑡

− JR𝑘 (𝑡) = Λ𝑘 (cid:164)W𝑘,𝑡,

(3.12)

where (cid:164)W𝑘,𝑡 is the standard white noise and Λ𝑘 Λ𝑇
of R𝑘 (𝑡) = (cid:2)R𝑘,1, R𝑘,2

(cid:3) is given by

𝑘 = −𝑘 𝐵𝑇 (J + J𝑇 ). With this choice, the covariance

(cid:10)R𝑘 (𝑡)R𝑘 (𝜏)𝑇 (cid:11) = 𝑘 𝐵𝑇e−𝛼(𝑡−𝜏) (cid:169)
(cid:173)
(cid:173)
(cid:171)

cos(𝜔𝑘 (𝑡 − 𝜏))𝐼

sin(𝜔𝑘 (𝑡 − 𝜏))𝐼

− sin(𝜔𝑘 (𝑡 − 𝜏))𝐼

cos(𝜔𝑘 (𝑡 − 𝜏))𝐼

(cid:170)
(cid:174)
(cid:174)
(cid:172)

such that Eq. (3.8) remains valid. Using Eqs. (3.7)(3.9)(3.11), we can write the reduced model

(3.3) in the form of

d
d𝑡

q

p

· · ·

z𝑘,1

z𝑘,2

· · ·

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

=

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

0

−𝐼

0

𝐼

0

· · ·

0 −Γ𝑘 𝜙(q)

0

0

0

· · ·

· · ·

0

· · · 𝜙(q)𝑇 Γ𝑇
𝑘

· · ·

· · ·

· · ·

· · ·

· · ·

−𝛼𝐼

𝜔𝑘 𝐼

· · ·

0

0

· · ·

−𝜔𝑘 𝐼

−𝛼𝐼

· · ·

≜ K∇𝐹 (q, p, · · · , z𝑘,1, z𝑘,2, · · · ) + Λ (cid:164)W𝑡,

∇𝑈 (q)

v

· · ·

z𝑘,1

z𝑘,2

· · ·

· · ·

· · ·

· · ·

· · ·

· · ·

· · ·

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

+

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

0

0

· · ·

Λ𝑘 (cid:164)W𝑘,𝑡

· · ·

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

(3.13)

31

where K is the first matrix of the right-hand-side of Eq.

(3.13), 𝐹 (q, p, · · · , z𝑘,1, z𝑘,2, · · · ) =

𝑈 (q) + 1

2p𝑇 M−1p + 1

2

(cid:16)

(cid:205)𝑁 𝜔
𝑘=1

𝑘,1z𝑘,1 + z𝑇
z𝑇

𝑘,2z𝑘,2

(cid:17)

is the total free energy of the extended system, and

Λ = diag(0, 0, · · · , Λ𝑘 , · · · ). Using (3.12), it is easy to show ΛΛ𝑇 = −𝑘 𝐵𝑇 (K + K𝑇 ). Therefore,

the gradient system (3.13) (i.e., the reduced model (3.3)) has the invariant density function

𝜌eq(q, p, z) = exp[−𝐹 (q, p, z)/𝑘 𝐵𝑇].

3.2.2 Training Method of the Reduced Model

To learn the reduced model (3.3), we need to choose appropriate metrics such that the state-

dependent non-Markovian nature can be manifested. While auto-correlations such as 𝑐𝑣𝑣 (𝑡) =
(cid:10)v(𝑡)v(0)𝑇 (cid:11) merely characterize the overall memory effect, a crucial observation is that the

correlation conditional with different initial state q0 further depends on the local energy dissipation

and therefore naturally encode the signatures of the heterogeneous memory effect. Accordingly, we

right-multiply the second equation of (3.3) by v(0) and take the conditional expectation on q0 = q∗,

i.e.,

∫ 𝑡

0
∫ 𝑡

0
∫ 𝑡

g(𝑡; q∗) =

=

=

(cid:10)𝜙𝑇

𝑡 Θ𝑡−𝜏𝜙𝜏v𝜏v𝑇

0 |q0 = q∗(cid:11) d𝜏

(cid:10)Tr (cid:2)Θ𝑡−𝜏H𝜓𝜏v𝜏v𝑇
0

𝑡 H𝑇 (cid:3) |q0 = q∗(cid:11) d𝜏
𝜓𝑇

Tr (cid:2)Θ𝑡−𝜏HC𝜓,𝜓 (𝑡, 𝜏; q∗)H𝑇 (cid:3) d𝜏,

0
where g(𝑡; q∗) := (cid:10)[ (cid:164)p𝑡 + ∇𝑈 (q𝑡)]v𝑇
point correlation characterizing the coupling among the bases. Since 𝜓(q) is sparse, 𝜓𝜏𝜓𝑇

0 |q0 = q∗(cid:11) and C𝜓,𝜓 (𝑡, 𝜏; q∗) := (cid:10)𝜓𝜏v𝜏v𝑇

𝑡 |q0 = q∗(cid:11) is a three-
𝜓𝑇

𝑡 can

0

be evaluated with 𝑂 (1) complexity and hence C𝜓,𝜓 (𝑡, 𝜏; q∗) can be efficiently pre-computed.

Accordingly, we can train the reduced model in terms of coefficients H for encoders 𝜙(q) as well as
matrices {Γ𝑘 }𝑁 𝜔

𝑘=1 and 𝛼 for kernel Θ(𝑡) by minimizing the empirical loss

𝑁𝑞
∑︁

𝑁𝑡
∑︁

𝐿 =

(cid:13)
(cid:13)(cid:101)g(𝑡𝑘 ; q(𝑙)) − g(𝑡𝑘 ; q(𝑙))
(cid:13)

(cid:13)
2
(cid:13)
(cid:13)

,

(3.14)

𝑘=1

(cid:101)g(𝑡𝑘 ; q(𝑙)) =

𝑙=1
𝑘
∑︁

𝑗=1

Tr (cid:2)Θ(𝑡𝑘 − 𝑡 𝑗 )HC𝜓,𝜓 (𝑡𝑘 , 𝑡 𝑗 ; q(𝑙))H𝑇 (cid:3) 𝛿𝑡,

32

where (cid:8)q(𝑙)(cid:9) 𝑁𝑞
𝑙=1 represent configuration samples within the phase space. For systems with pronounced
free energy barriers, q(𝑙) can be collected along with free energy construction (Maragliano and

Vanden-Eijnden, 2008), whereas the conditional correlations for each q(𝑙) need to be sampled

from unbiased equilibrium trajectories. (cid:101)g(·) represents the prediction by the reduced model which
depends on the trainable variables and the pre-computed correlation C𝜓,𝜓. 𝛿𝑡 is the time step.

Besides the conditional correlation functions, we can also introduce the loss function with

respect to the overall correlation function, i.e.,

𝑁𝑡
∑︁

𝑘=1
𝑘
∑︁

∥(cid:101)g2(𝑡𝑘 ) − g2(𝑡𝑘 ) ∥2

(cid:104)

Θ(𝑡𝑘 − 𝑡 𝑗 )HC𝜓,𝜓 (𝑡𝑘 , 𝑡 𝑗 )H𝑇 (cid:105)

𝛿𝑡,

Tr

𝐿2 =

(cid:101)g2(𝑡𝑘 ) =

where g2(𝑡) = (cid:10)[ (cid:164)p𝑡 + ∇𝑈 (q𝑡)]v𝑇
particular, if there is scale separation between 𝑐𝑣𝑣 (𝑡) and 𝑐𝑞𝑞 (𝑡) (e.g., 𝑐𝑣𝑣 (𝑡) decays much faster than

𝑗=1
(cid:11) is the overall correlation and C𝜓,𝜓 (𝑡, 𝜏) = (cid:10)𝜓𝜏v𝜏v𝑇
0

𝜓𝑇
𝑡

(cid:11).

In

0

𝑐𝑞𝑞 (𝑡); see Fig. 3.4 and Fig. 3.5), we may approximate C𝜓,𝜓 (𝑡, 𝜏) by two-point correlations, i.e.,
C𝜓,𝜓 (𝑡, 𝜏) ≈ (cid:10)𝜓𝜏 ⊗ 𝜓𝑇
𝑡

(cid:11).

(cid:11) : (cid:10)v𝜏v𝑇
0

Efficient training is achieved by using the following numerical methods to evaluate (cid:101)g and (cid:101)g2.
𝑡 can be efficiently pre-computed with 𝑂 (1) complexity by using the sparse

Specifically, 𝜓𝜏𝜓𝑇

piecewise linear basis functions. Furthermore, we can use the low-rank representation (e.g., based on

the singular value decomposition) of C𝜓,𝜓 and C𝜓,𝜓 to accelerate the matrix production HC𝜓,𝜓H𝑇 .

In addition, the convolution on index 𝑗 can be efficiently evaluated by the Fast Fourier Transform

algorithm (Cooley and Tukey, 1965).

For the present study, 𝜓 is chosen as the uniform piecewise linear basis function defined on

[2.8, 4.1] with 𝑁𝑏 = 66, 𝑇𝑐 = 200 and 𝑁𝜔 = 2000. We use 2 × 106 short trajectories extracted

from the full MD simulation (see Sec. 3.3.1 for details) where each one consists of 300 time-series

samples to sample the correlation for q∗ at the saddle point. We use 2 × 106 ∼ 8 × 106 short

trajectories for other individual points.

While 𝐿2 alone is insufficient to characterize the emergence of the state-dependent memory, it

serves as a necessary condition and can facilitate the learning of the reduced model. In practice,

33

we can use both loss functions to train the reduced model with 𝑁𝑞 = 65, 𝑁𝑡 = 300 for 𝐿, and

𝑁𝑡 = 30000 for 𝐿2. Specifically, the training is conducted by the Adam (Kingma and Ba, 2015)

optimization method in three stages with 2000, 6000, and 6000 steps respectively. For the first

stage, we only use 𝐿2 to train the model with a constant learning rate of 0.04. For each step of
the following two stages, 16 initial states (i.e., 𝑞 (𝑙)) are randomly selected as one training batch to

evaluate the total loss 𝐿𝑡 = 𝐿 + 𝐿2. For both stages, the initial learning rate is 1 × 10−2 and the

exponential decay rate is 0.9 per 150 steps. In practice, the number of 𝑁𝑏 and 𝑁𝑞 can be further

reduced. As shown in the GitHub repository, with 𝑁𝑏 = 8, 𝑁𝑞 = 26, the constructed reduced model

can accurately recover the MD results.

3.2.3 Simulation of the Reduced Model

We simulate the reduced model (3.3) on 𝑡 ∈ [0, 𝑇] by generating the noise term R𝑡 = 𝜙𝑇
𝑡 (cid:101)R(𝑡),
where (cid:101)R : R+ → R𝑛 is a Gaussian random process, using the constructed Fourier modes rather

than Markovian embedding (e.g., see Ref. (Ayaz et al., 2022b)). Specifically, we have proved
that by choosing ⟨(cid:101)R(𝑡)(cid:101)R(𝜏)𝑇 ⟩ = 𝑘 𝐵𝑇Θ(𝑡 − 𝜏), the reduced model retains a consistent equilibrium
2p𝑇 M−1p(cid:3) (cid:9) (Prop. 3.2.1). Accordingly, we can generate
density, i.e., 𝜌eq(q, p) ∝ exp (cid:8)−𝛽 (cid:2)𝑈 (q) + 1
{(cid:101)R(𝑡𝑖)}𝑁

𝑖=0 similar to Refs. (Berkowitz et al., 1983; Ogorodnikov and Prigarin, 1996) by

(cid:101)R(𝑡𝑖) = 𝛽−1/2

2𝑁
∑︁

𝑘=0

(cid:101)Θ1/2

𝑘

[cos(𝜔𝑘𝑡𝑖)𝜉𝑘 + sin(𝜔𝑘𝑡𝑖)𝜂𝑘 ] ,

(3.15)

where (cid:101)Θ𝑘 are the Fourier (essentially cosine) modes of Θ (|𝑡|) on [−𝑇, 𝑇] (Berkowitz et al., 1983;
Ogorodnikov and Prigarin, 1996); 𝜉𝑘 and 𝜂𝑘 are independent Gaussian random vectors. Specifically,

for large simulation time 𝑇, the Fourier modes (cid:101)Θ𝑘 is given by

(cid:101)Θ𝑘 =

=

≈

𝑁 𝜔∑︁

∫ 𝑇

−𝑇

∫ 𝑇

𝑗=1
𝑁 𝜔∑︁

e−𝛼|𝑡| ˆΘ 𝑗 cos(𝜔 𝑗 𝑡) cos(𝜔𝑘𝑡)d𝑡

e−𝛼𝑡 ˆΘ 𝑗 (cid:0)cos (cid:0)(𝜔 𝑗 − 𝜔𝑘 )𝑡(cid:1) + cos (cid:0)𝜔 𝑗 + 𝜔𝑘 )𝑡(cid:1)(cid:1)

(3.16)

𝑗=1
𝑁 𝜔∑︁

(cid:32)

𝑗=1

0

𝛼 ˆΘ 𝑗
𝛼2 + (𝜔 𝑗 − 𝜔𝑘 )2

+

𝛼 ˆΘ 𝑗
𝛼2 + (𝜔 𝑗 + 𝜔𝑘 )2

(cid:33)

.

34

Therefore (cid:101)R(𝑡) can be generated using the Fast Fourier Transform algorithm (Cooley and Tukey,
1965) using 𝑂 (𝑁 log 𝑁) complexity. Also, the convolution term ∫ 𝑡
𝜙𝑇
𝑡 Θ(𝑡 − 𝜏)𝜙𝜏v(𝜏)d𝜏 in Eq.

0

(3.3) can be computed using the fast convolution method developed in Ref. (Schädle et al., 2006)

with 𝑂 (𝑁 log 𝑁) complexity.

3.3 Numerical Results

The present model enables us to simulate the reduced dynamics beyond the standard GLE and

systematically investigate open problems like the state-dependent memory effect on the collective

dynamics of complex systems such as molecule kinetics.

3.3.1 Full Atomistic Model

In this work, we consider the full micro-scale model of benzyl bromide (see Fig. 3.1 for a

sketch of the molecule structure) in an aqueous environment. The general AMBER (Wang et al.,

2004) force field is used for the benzyl bromide molecule and the partial charges of molecule atoms

were set by the restrained electrostatic potential (RESP) approach (Bayly et al., 1993). The rigid

TIP3P water model (Jorgensen et al., 1983) is used for the water molecules and the bond lengths and

angles were held constant through the SHAKE algorithm (Ryckaert et al., 1977; Miyamoto and

Kollman Peter, 2004). Long-range electrostatic interactions were calculated using a Particle Mesh

Ewald summation with a relative error set to be 10−4.

The full system consists of one benzyl bromide molecule and 2400 water molecules with the

periodic boundary condition imposed along each direction. The isothermal-isobaric thermostat

(Martyna et al., 1994) is used to equilibrate the system for 16 ns at 298K and 1 bar using a time step

of 1 fs. Following the equilibration, the box size is scaled to be near 41.5 × 41.5 × 41.5 ˚A3. The

simulation was run for a production period of 1.5 𝜇s in a canonical ensemble with a Nosé-Hoover

thermostat (Nosé, 1984; Hoover, 1985). The numerical results of this work are presented in ˚A for

length, picosecond for time, and gram per mole for mass.

The resolved variable 𝑞 is defined as the distance between the bromine atom and the ipso-carbon

atom. The free energy is obtained from the probability density function 𝜌(𝑞) (see the inset plot of

Fig. 3.3(a)), i.e., 𝑈 (𝑞) = −𝑘 𝐵𝑇 ln 𝜌(𝑞), where 𝜌(𝑞) is directly obtained from the full MD samples

35

Figure 3.1 Left: A sketch of the molecule benzyl bromide. The resolved variable is defined as
the distance between the bromine atom and the ipso-carbon atom. Right: The free energy of the
resolved variable 𝑞. The error bar represent the 95% confidence interval.

using the kernel density estimation. To verify the accuracy of the constructed 𝑈 (𝑞), we calculate

the expectation of 𝑞∇𝑈 (𝑞) on the sample. The numerical result gives 0.996𝑘 𝐵𝑇 and is close to the
theoretical prediction ⟨𝑞∇𝑈 (𝑞)⟩ = ∫ 𝑞∇𝑈 (𝑞)e−𝑈 (𝑞)/𝑘 𝐵𝑇 d𝑞 ≡ 𝑘 𝐵𝑇.

3.3.2 Limitation of the Standard GLE

Let us start with the standard GLE by setting features 𝜙(q) ≡ I in Eq. (3.3), which capture the

dynamics on the resolved scale considered in Refs. (Ayaz et al., 2021; Dalton et al., 2023). We right-
multiply q(0) to Eq. (3.3) and compute the correlation functions, i.e., ℎ(𝑡) = ∫ 𝑡
where ℎ(𝑡) = (cid:10)[ (cid:164)p𝑡 + ∇𝑈 (q𝑡)]q𝑇
0

Θ(𝑡 − 𝜏)𝑐𝑣𝑞 (𝜏)d𝜏,
(cid:11). The standard GLE kernel Θ(𝑡) (i.e., K(𝑡) in Eq. (3.1)) can

0

be obtained using the Fourier transform of the integral equation. If the reduced dynamics (3.1)

can be simplified as the standard GLE, then 𝑐𝑣𝑞 (𝑡) should be accurately reproduced. Fig. 3.2

shows the prediction of 𝑐𝑣𝑞 (𝑡) from the standard GLE and the full MD model. The apparent

deviations imply non-negligible state-dependency. To further probe this effect, we compute

ℎ′′(𝑡; q∗) = 𝑔′(𝑡; q∗) conditional with different initial states q∗. Unlike a unified short-time

correlation (i.e., 𝑔′(0; 𝑞∗) = −𝑘 𝐵𝑇Θ(0)/𝑚) predicted by the standard GLE, the large dispersion

reveals the heterogeneous nature of the energy dissipation process.

36

3.153.63.901020Figure 3.2 Correlation functions predicted by the standard GLE and the full MD: (a) Overall 𝑐𝑣𝑞 (𝑡)
and (b) −𝑔′(𝑡; q∗) conditional with q∗ representing various initial states (gray lines), including two
local minima and the saddle point (see inset of Fig. 3.3(a)). The large dispersion implies the
limitation of the standard GLE, which predicts a single curve in short time.

3.3.3 GLE with State-dependent Memory

To capture the state-dependent memory, we train the present model (3.3) with a different number

of features. Fig. 3.3(a-b) shows the obtained encoder 𝜙(·) using one feature and Θ(𝑡) is scaled with

Θ(0) = 1. 𝜙 exhibits apparent deviation from a uniform distribution. In particular, it shows a peak

value near the saddle point 𝑞 = 3.65, implying a larger effective friction near the regime. This result

supports a similar assumption in semi-analytical studies (Straus et al., 1993) on improving Kramers’

theory (Kramers, 1940). Also, it explains the short-time dispersion shown in Fig. 3.2, where

𝑔′(𝑡; q∗) at the saddle point is significantly larger than the local minima. Fig. 3.3(c-d) shows the

obtained encoders {𝜙𝑖 (·)}𝑛

𝑖=1 with 𝑛 = 4 features and the diagonal components of Θ(𝑡). Compared
with the case of 𝑛 = 1, the larger variation of 𝜙𝑖 enables a better representation of the state-dependent

memory.

Next, we examine the conditional correlations 𝑐𝑣𝑞 (𝑡; q∗) and 𝑐𝑣𝑣 (𝑡; q∗). As shown in Fig. 3.4,

for both local minima and the saddle point, the predictions of the present model using four features

agree well with the MD results. In contrast, the predictions of the standard GLE show apparent

deviations for q∗ at the saddle point. Also, the present model using four features with a diagonal Θ(𝑡)

(see Section 3.5.3) shows improved short-time predictions but remains insufficient for long-time

37

Figure 3.3 The state features 𝜙 and diagonal components of the matrix-valued kernel Θ(𝑡) for the
present model with state-dependent memory (SD-GLE) trained using (a-b) one feature and (c-d)
four features. Inset plots: (a) probability density function (PDF) of 𝑞, where 𝜙(𝑞) near the saddle
point shows a peak; (b) Fourier modes of Θ(𝑡).

correlations. This indicates the complex global variation of the memory term,which can not be

represented by a simple state-dependent re-scaling of a kernel function; the non-Markovian coupling

among multiple features is crucial to capture the heterogeneous energy dissipation over the full

space.

Finally, we examine the collective behavior related to molecule kinetics. Fig. 3.5(a) shows

the position correlation 𝑐𝑞𝑞 (𝑡) characterizing the molecule conformation relaxation. Compared

with the MD results, the standard GLE shows a significant underestimation of the relaxation time.

This discrepancy is possibly due to the larger effective friction near the saddle point (see Fig.

3.3(a)), which essentially dampens the transition between the two local minima. The standard

GLE overlooks such state-dependency and therefore yields a faster relaxation. This limitation is

38

Figure 3.4 Overall and conditional correlation functions predicted by the full MD and various
reduced models for two local minima and the saddle point: (a-b) 𝑐𝑣𝑞 and (c-d) 𝑐𝑣𝑣. Shaded regimes
represent the 95% confidence interval; same for Fig. 3.5.

consistently reflected in the distribution of the transition time. For this system, the free energy barrier

is approximately 3.5𝑘 𝐵𝑇 (see Fig. 3.1 right); the transition time is obtained from the simulation

trajectories of the MD and various reduced models. As shown in Fig. 3.5(b), the standard GLE

predicts a larger probability for the short transition time, indicating a smaller overall friction than the

local (i.e., saddle point) value. Fortunately, the heterogeneous non-Markovianity can be faithfully

retained in the present model. In particular, the constructed model using one feature yields a better

prediction than the standard GLE. As we increase to four features, the predictions recover the MD

results.

3.4 Summary

In summary, to plan an optimal hiking trail on a mountain, a landscape map is generally

insufficient; the local path roughness needs to be properly considered. Similarly, to predict the

39

Figure 3.5 Collective molecule behaviors predicted by the full MD and the various reduced models:
(a) overall conformation relaxation and (b) distribution of the transition time between the two local
minima.

reduced dynamics of a multi-scale system, the state-dependent memory may need to be modeled

to account for the heterogeneous energy dissipation arising from the unresolved dynamics, which,

however, has been broadly overlooked. While the crucial role of the non-Markovian effect that

complements the conservative free energy has been gradually recognized, the formulation of the

memory term remains largely empirical (e.g., the standard GLE). The current work focuses on this

caveat and presents a data-driven approach to learning such a stochastic reduced model beyond

the standard GLE, where the complex state-dependent memory can be naturally encoded in the

non-Markovian interactions among a set of features in terms of the resolved variables. The training

does not rely on the explicit knowledge of the full model and only utilizes the trajectory samples,

where the three-point correlations can be efficiently pre-computed. Numerical results of a molecule

system demonstrate the crucial role of the state-dependent non-Markovianity on collective behavior,

where the standard GLE shows limitations due to the over-simplified assumption of a homogeneous

memory kernel. In contrast, the present model accurately predicts the molecule kinetics including

the transition time distribution, and provides a reliable approach to simulate stochastic reduced

dynamics of multiscale problems that faithfully retains the collective behaviors and rare event

properties (E and Vanden-Eijnden, 2010) beyond empirical models.

40

3.5 Other Details

3.5.1 Mass Matrix of the Reduced Model

Generally the mass matrix should depend on the resolved variables for the general cases, and we

refer to Refs. (Ayaz et al., 2022a; Lee et al., 2019) for further discussions and the reduced dynamics

with position-dependent mass. However, in this study we focus on the effect of the state-dependent

non-Markovian memory on the collective behavior of complex systems. Therefore, we choose the

coarse-grained resolved variables such that the corresponding mass matrix is a constant.

Specifically, we define 𝑞 = ∥Q1 − Q2∥, where Q1 and Q2 are the atom coordinates of the full
(cid:164)Q12/𝑞 and its

model (see Fig. 3.1 and Sec. 3.3.1 for details). Accordingly, we have (cid:164)𝑞 = Q𝑇
12

covariance follows

Q𝑇
12

(cid:164)Q12

(cid:164)Q𝑇

12Q12

(cid:29)

Tr (cid:2)(Q12Q𝑇

(cid:29)

12)(cid:3)
(cid:164)Q𝑇

12) ( (cid:164)Q12
(cid:29) (cid:16)

Tr(Q12Q𝑇

12)

𝑀 −1

1 + 𝑀 −1

2

(cid:17)

𝑘 𝐵𝑇

(3.17)

(cid:28) 1
𝑞2
(cid:28) 1
𝑞2
(cid:28) 1
𝑞2

⟨ (cid:164)𝑞 (cid:164)𝑞⟩ =

=

=

=

(cid:16)

𝑀 −1

1 + 𝑀 −1

2

(cid:17)

𝑘 𝐵𝑇,

where 𝑀1 and 𝑀2 represent the mass of two atoms and we have used the fact that the distribution of
Q12 and (cid:164)Q12 are independent. Therefore, the mass matrix of 𝑞 is a constant 𝑀 ≡ 𝑀1𝑀2/(𝑀1 + 𝑀2).

3.5.2 Limitations of the Standard GLE near the Local Minima

Fig. 3.6 shows the predictions of the conditional correlations 𝑐𝑞𝑣 (𝑡, 𝑞∗) and 𝑐𝑣𝑣 (𝑡, 𝑞∗) for 𝑞∗

representing the two local minima. Similar to the results of the saddle point (see Fig. 3.5(b)),

the predictions of the standard GLE show apparent deviations from the full MD results due to the

ignorance of the state-dependent memory nature. In contrast, the predictions of the present model

with four features can accurately recover the MD predictions.

3.5.3 Other Forms of the State-dependent Memory Term

For comparison, we also consider other forms of the reduced model. In particular, we retain the

encoders 𝜙 in Eq. (3.3) but set Θ(𝑡) to be diagonal, i.e., we ignore the non-Markovian coupling

among the different state features. The reduced model is trained using four features. Fig. 3.7 shows

41

Figure 3.6 The conditional correlation functions 𝑐𝑣𝑞 (𝑡, 𝑞∗) and 𝑐𝑣𝑣 (𝑡, 𝑞∗) for the two local minima
predicted by the full MD, the standard GLE, and the present model (SD-GLE) constructed using
one and four spatio-features. Left: 𝑞∗ = 3.07; Right: 𝑞∗ = 3.87. The predictions by the standard
GLE show apparent discrepancies with the full MD results. Shaded regimes represent the 95%
confidence interval.

the conditional correlation 𝑐𝑣𝑣 (𝑡; 𝑞∗) obtained from the full MD and different reduced models. The

prediction of the constructed model (labeled by “SD-GLE-Diag”) shows apparent deviations from

the full MD result with incremental improvement over the stand GLE. The large discrepancy reveals

the complex state-dependent nature; the non-Markovian effect can be neither approximated by ansatz

like 𝛾(𝑞)𝜃 (𝑡) as a simple generalization/re-scaling of the initial value at 𝑡 = 0, nor represented by

the coupling with the independent bath variables. Instead, the non-Markovian coupling among

the various state-features retained in the present model plays a crucial for accurately modeling

the heterogeneous energy dissipation arising from the unresolved intramolecular interactions and

reproducing the collective dynamics.

Furthermore, we note that Ref. (Vroylandt and Monmarché, 2022) develops an efficient approach

42

0123t-0.2-0.100.10.20.30.4MDGLESD-GLE(n=1)SD-GLE(n=4)0123t-0.2-0.100.10.20.30.4MDGLESD-GLE(n=1)SD-GLE(n=4)0123t-0.02-0.015-0.01-0.00500.0050.01MDGLESD-GLE(n=1)SD-GLE(n=4)0123t-0.04-0.0200.020.040.06MDGLESD-GLE(n=1)SD-GLE(n=4)Figure 3.7 The conditional correlation functions 𝑐𝑣𝑣 (𝑡, 𝑞∗) for the saddle point predicted by the full
MD, the standard GLE, the reduced models constructed using four state-features with diagonal Θ(𝑡)
(SD-GLE-Diag) and full Θ(𝑡) (SD-GLE). The large discrepancy between the SD-GLE-Diag model
and the full MD results implies the complexity of the state-dependency of the memory term, which
can not be well represented by the coupling of independent bath variables. The non-Markovian
interactions among the state-features are essential to capture the heterogeneous energy dissipation
process.

to compute the memory function based on the finite-rank approximation of the Zwanzig’s projection

formalism. The method can be used to efficiently extract the state-dependent memory and probe the

physics insights from the trajectory samples of the full MD model. The present work focuses on

training a reduced model with heterogeneous memory that enables generating coherent noise and

conducting stochastic simulations.

Specifically, the memory term takes the form 𝐾 (𝑞(𝜏), 𝑡 − 𝜏) (with a simple change of variable

𝑠 = 𝑡 − 𝜏 following the notation) in Ref. (Vroylandt and Monmarché, 2022) and ˜𝐾 (𝑞(𝑡), 𝑞(𝜏), 𝑡 − 𝜏) =

𝜙(𝑞(𝑡))𝑇 Θ(𝑡 − 𝜏)𝜙(𝑞(𝜏)) in the present work. In particular, by setting 𝑞(𝑡) = 𝑞(𝜏), the memory

term in two forms should have similar prediction. Following this argument, we use the method in Ref.

(Vroylandt and Monmarché, 2022) to calculate 𝐾 (𝑞(𝜏), 𝑡 −𝜏) and compare it with ˜𝐾 (𝑞(𝑡), 𝑞(𝜏), 𝑡 −𝜏)

of the present model with 𝑞(𝑡) = 𝑞(𝜏) = 𝑞∗. Fig. 3.8 shows the obtained memory functions for 𝑞∗

taking the saddle point and two local minima. The prediction by the two approaches show good

agreement.

43

0123t-0.200.20.4MDGLESD-GLE-Diag(n=4)SD-GLE(n=4)Figure 3.8 The conservative force and memory kernel obtained from Ref. (Vroylandt and Monmarché,
2022) (labeled as Volterra Basis) and the present model. The memory term takes the form
𝐾 (𝑞(𝜏), 𝑡 − 𝜏) and ˜𝐾 (𝑞(𝑡), 𝑞(𝜏), 𝑡 − 𝜏) = 𝜙(𝑞(𝑡))𝑇 Θ(𝑡 − 𝜏)𝜙(𝑞(𝜏)), respectively. Specifically, we
set 𝑞(𝑡) = 𝑞(𝜏) = 𝑞∗; the predictions of the two models show good agreement for the saddle point
and the two local minima.

3.5.4 Generalization of the Present Reduced Model Formulation

So far, we have constructed the reduced model (3.3) by assuming the matrix-valued kernel Θ(𝑡)

is symmetry. In fact, this form can be generalized by introducing an anti-symmetry part, i.e.,

Θ(𝑡) = e−𝛼𝑡

𝑁 𝜔∑︁

𝑘=0

(Γ𝑇

𝑘,1Γ𝑘,1 + Γ𝑇

𝑘,2Γ𝑘,2) cos(𝜔𝑘𝑡) + (Γ𝑇

𝑘,1Γ𝑘,2 − Γ𝑇

𝑘,2Γ𝑘,1) sin(𝜔𝑘𝑡),

(3.18)

where Γ𝑘,1 and Γ𝑘,2 are lower-triangular matrices representing the Fourier modes of Θ(𝑡). The form
is general non-symmetric except for 𝑡 = 0 and satisfies Θ(−𝑡) = Θ(𝑡)𝑇 .

Similar to the symmetry form, we can model the fluctuation term R𝑡 as a noise in the form of R𝑡 =
𝑡 (cid:101)R(𝑡), where (cid:101)R(𝑡) is a Gaussian random process satisfying ⟨(cid:101)R(𝑡)(cid:101)R(𝜏)𝑇 ⟩ = 𝑘 𝐵𝑇e−𝛼(𝑡−𝜏)Θ(𝑡 − 𝜏).
𝜙𝑇

Similar to Prop. 3.2.1, we can show that this choice retains a consistent invariant density function.

44

33.54-20-10010SD-GLE(n=4)VolterraBasis0123time-1000100200300400SD-GLE(n=4)VolterraBasis0123time-1000100200300400500600SD-GLE(n=4)VolterraBasis0123time-200-1000100200300400SD-GLE(n=4)VolterraBasisIn practice, we can generate the noise term (cid:101)R(𝑡) on [0, 𝑇] by

(cid:101)R(𝑡) = 𝛽−1/2

2𝑁
∑︁

(cid:104)

𝑘=0
𝑘,1(cid:101)Θ𝑇
𝑄1,𝑘 = (cid:101)Θ𝑘,2(cid:101)Θ−1
𝑘,2

𝑘,1 cos(𝜔𝑘𝑡)𝜉𝑘 + sin(𝜔𝑘𝑡) (𝑄1/2
(cid:101)Θ1/2

1,𝑘 𝜉𝑘 + 𝑄1/2

2,𝑘 𝜂𝑘 )

(cid:105)

,

𝑄2,𝑘 = (cid:101)Θ𝑘,1 − (cid:101)Θ𝑘,2(cid:101)Θ−1

𝑘,1(cid:101)Θ𝑇
𝑘,2

,

where 𝛽−1 = 𝑘 𝐵𝑇, (cid:101)Θ𝑘,1, (cid:101)Θ𝑘,2 are the Fourier cosine and sine modes on [−𝑇, 𝑇] with Θ(−𝑡) = Θ(𝑡)𝑇 ,
𝜉𝑘 and 𝜂𝑘 are independent Gaussian random vectors, and 𝑁 is the total number of simulation step.

Here (cid:101)R(𝑡) can still be generated using the Fast Fourier Transform algorithm (Cooley and Tukey,
1965) using 𝑂 (𝑁 log 𝑁) complexity. We will investigate this generalized formulation for model

reduction in future studies.

45

CHAPTER 4

DEEP LEARNING-BASED NON-NEWTONIAN FLUID MODEL

In this chapter, we focus on a micro-to-macro model named DeePN2 (Fang et al., 2022). A long

standing problem in the modeling of non-Newtonian hydrodynamics of polymeric flows is the

availability of reliable and interpretable hydrodynamic models that faithfully encode the underlying

micro-scale polymer dynamics. The main complication arises from the long polymer relaxation time,

the complex molecular structure and heterogeneous interaction. DeePN2, a deep learning-based non-

Newtonian hydrodynamic model, has been proposed and has shown some success in systematically

passing the micro-scale structural mechanics information to the macro-scale hydrodynamics for

suspensions with simple polymer conformation and bond potential. The model retains a multi-scaled

nature by mapping the polymer configurations into a set of symmetry-preserving macro-scale

features. The extended constitutive laws for these macro-scale features can be directly learned from

the kinetics of their micro-scale counterparts. In this paper, we develop DeePN2 using more complex

micro-structural models. We show that DeePN2 can faithfully capture the broadly overlooked

viscoelastic differences arising from the specific molecular structural mechanics without human

intervention.

4.1

Introduction

Accurate modeling of non-Newtonian hydrodynamics plays a central role in the modeling of the

transport, diffusion, and synthesis processes in many scientific and engineering applications. Unlike

simple fluids, non-Newtonian fluids may exhibit enormously complex flow behavior as a result

of the micro-scale polymer dynamics. In particular, the polymer relaxation time often becomes

comparable to the hydrodynamic time scale. As a result, the macro-scale fluid evolution can not be

uniquely determined by the instantaneous flow field and the memory effect is generally important.

To close the hydrodynamic equations, existing models are primarily based on the following two

approaches. The first approach relies on empirical constitutive models (Larson, 1988; Owens and

Phillips, 2002). Notable examples include the Hookean model (Oldroyd and Wilson, 1950; Lin

et al., 2005), the FENE-P model (Peterlin, 1966; Bird et al., 1980), the Giesekus model (Giesekus,

46

1982), and the Phan-Thien and Tanner models (Thien and Tanner, 1977). Despite their popularity,

the accuracy of these models is almost always in doubt. The second approach resorts to various

sophisticated micro-macro coupling algorithms, e.g., by directly solving the Fokker-Planck equation,

or sampling the polymer configuration via micro-scale simulations (Laso and Öttinger, 1993; Hulsen

et al., 1997; Ren and E, 2005). While the effects of the polymer interaction can be carried over

to the macro-scale model, the computational cost can be exceedingly large due to the retaining of

the micro-scale description. Methods based on asymptotic analysis (Warner, 1972a) or the direct

fitting of the strain-stress relationship (Zhao et al., 2018) are limited to simple flows such as the

steady flow. Several semi-analytical approaches have been proposed (Grosso et al., 2000; Feng et al.,

1998; Wang, 1997; Forest et al., 2003; Lielens et al., 1999; Yu et al., 2005; Hyon et al., 2008) using

moment closure to approximate the micro-scale polymer configuration probability density function

(PDF) and to derive the constitutive equations for the FENE dumbbell solution (Lielens et al., 1999;

Yu et al., 2005; Hyon et al., 2008). However, these approaches are all based on restricted ansatz for

the PDF and therefore are not reliable for more general flow regimes.

To construct truly reliable and interpretable hydrodynamic models with molecular-level fidelity,

it is essential to be able to efficiently code the information from the micro-scale interaction into the

macro-scale transport equations. Ideally, the construction should meet the following requirements:

• be interpretable;

• be reliable – it should be accurate for all kinds of practical situations that one might encounter;

• respect physical constraints, including symmetries and conservation laws;

• be numerically robust and efficient.

As a first step towards constructing models that meet these requirements, we developed a machine

learning-based approach (Lei et al., 2020), “deep learning-based non-Newtonian hydrodynamic

model” or DeePN2, that learns the non-Newtonian hydrodynamic model from the underlying

micro-scale description of the dumbbell solution. Rather than approximating the closure with

standard moments, DeePN2 finds a set of encoders, i.e., a set of macro-scale features that best

represent the micro-scale dumbbell structure. It also finds accurate closed-form equation for these

47

macro-scale features. The constructed model retains a clear physical interpretation and accurately

captures the nonlinear viscoelastic responses, where the conventional Hookean and FENE-P models

show limitations.

Beyond dumbbell suspensions, one major challenge towards constructing truly reliable hydrody-

namic models arises from the heterogeneous polymer micro-structural mechanics. In this work, we

aim to fill the gap by developing the generalized DeePN2 model for multi-bead polymer molecules

with arbitrary structure and interaction. Firstly, with the proper design of the generalized micro-macro

encoders and the machine learning-based symmetry-preserving constitutive dynamics, we demon-

strate that the heterogeneous molecular structural-induced interaction can be systematically encoded

into the macro-scale hydrodynamics. Unlike moment closure approximations, the encoders are not

designed to recover the high-dimensional configuration PDF. Instead, they take an interpretable

form and are learned to probe the optimal approximation of the polymer stress and constitutive

dynamics. This essential difference enables DeePN2 to circumvent the high-dimensionality of the

polymer configuration PDF. Secondly, the explicit form of the micro-macro encoders enables us

to reliably learn the dynamics of the macro-scale features directly from the kinetic equations of

their micro-scale analog. In this sense, this learning framework retains a multi-scaled nature where

micro-scale interaction and physical constraints can be seamlessly inherited. Moreover, the learning

only requires instantaneous micro-scale samples. This unique property differs from the common

sophisticated data-driven approaches (Rudy et al., 2017; Schaeffer et al., 2018; Raissi et al., 2019b;

Qin et al., 2019; Han et al., 2019b; Seryo et al., 2020; Huang et al., 2021), where time-derivative

samples are often needed to learn the governing dynamics. This is particularly suited for multi-scale

fluid models where accurate time-derivative samples may not be readily accessible. We demonstrate

the power of the DeePN2 model for polymer molecules of three distinct shapes with training samples

collected from one-dimensional (1D) homogeneous shear flow. Numerical results show that the

broadly overlooked heterogeneous molecular structural mechanics plays an important role in the

rheology of non-Newtonian fluids, which, fortunately, can be faithfully encoded into DeePN2. The

constructed model successfully captures the hydrodynamics with different viscoelastic responses for

48

a variety of 1D and 2D flows when compared with the micro-scale simulation results. The present

work also paves the way towards constructing truly reliable non-Newtonian hydrodynamic models

for general 3D flows.

4.2 Methods

4.2.1 Micro-scale and continuum hydrodynamic models

Let us start with the micro-scale description of the semi-dilute polymer suspension. We assume

each molecule consists of 𝑁 particles with the position vector q = [q1; q2; · · · ; q𝑁 ], where q𝑖 ∈ R3

is the position of the 𝑖-th particle. The intramolecular potential energy 𝑉 (q) takes the form

𝑉 (q) =

𝑁𝑏∑︁

𝑗=1

𝑉𝑏 (cid:0)|q 𝑗1 − q 𝑗2 |(cid:1) , 𝑉𝑏 (𝑙) = −

(cid:34)

𝑙2
0 log

1 −

𝑘 𝑠
2

(cid:35)

,

𝑙2
𝑙2
0

(4.1)

where 𝑁𝑏 is the bond number and ( 𝑗1, 𝑗2) represents the indices of beads associated with the 𝑗-th

bond. Without loss of generality, the individual bond interaction 𝑉𝑏 takes the form of the FENE

potential (Warner, 1972b), where 𝑘 𝑠 is the spring constant and 𝑙0 is the maximum of the extension

length. It is worth mentioning that the polymer molecule is not restricted to the dumbbell shape.

Instead, it generally consists of multiple particles with arbitrary structure and bond connection.

Fig. 4.1 shows a sketch of the polymer molecules with three different structures. As we will show,

given the same form of the individual bond interaction 𝑉𝑏, the different polymer micro-structural

mechanics leads to distinct non-Newtonian hydrodynamics.

In principle, the viscoelastic response of the system is determined by the full micro-scale

Figure 4.1 A sketch of 7-bead polymer molecules with chain-, star- and net-shaped structures
(from left to right). The solid lines represent the FENE bond potential with the same interaction
parameters. The dashed lines of the net-shaped molecule represent the three additional side chains
connecting the polymer arms. While both the chain- and the star-shaped molecules are connected
with six bonds; the suspensions exhibit different hydrodynamics due to the different micro-structural
mechanics as shown below.

49

interaction. However, direct simulation for the full micro-scale interaction is often limited by the

prohibited computational cost. Continuum hydrodynamics models based on various empirical

constitutive models are often used, with the general form

∇ · u = 0,
𝜌 du
d𝑡

= −∇𝑝 + ∇ · (τs + τp) + fext,

(4.2)

where 𝜌, u and 𝑝 represent the fluid density, velocity and pressure field, respectively. fext is the
external body force and τs = 𝜂s(∇u + ∇u𝑇 ) is the solvent stress tensor with shear viscosity 𝜂𝑠.

τp is the polymer stress tensor whose detailed form is generally unknown. To construct τp, the

DeePN2 model seeks the approximation in terms of a set of macro-scale features c1, · · · , c𝑛, and

simultaneously, the constitutive dynamics of these features, i.e.,

τp = G(c1, · · · , c𝑛),

Dc𝑖
D𝑡

= H𝑖 (c1, · · · , c𝑛),

𝑖 = 1, · · · , 𝑛,

(4.3a)

(4.3b)

where G and H𝑖 represent the stress and constitutive models, respectively. D

D𝑡 denotes the objective

tensor derivative.

Eqs. (4.2) and (4.3) take the form similar to the conventional hydrodynamics. Instead of using

empirical approximation to close the equation, we aim to construct a model directly from the

micro-scale description (4.1) with the help of machine learning, such that the constructed model

can naturally encode the molecular-specific interaction beyond empirical approximations with clear

physical interpretation.

4.2.2 DeePN2 for arbitrary molecular structural mechanics

To learn Eq. (4.2) from the full model (4.1), one essential problem lies in how to seamlessly

pass the micro-scale interaction to the continuum model. To bridge the scales, we learn a set of
micro-to-macro encoders, denoted by {b𝑖 (q)}𝑛

𝑖=1, such that the continuum modeling terms (e.g., the

polymer stress τp) can be well approximated in terms of the corresponding macro-scale features
{c𝑖 (q)}𝑛
(cid:205) 𝑗 ⟨q 𝑗 ⊗ ∇q 𝑗𝑉 (q)⟩, c𝑖 = ⟨b𝑖 (q)⟩, 𝑛p is the polymer number

𝑖=1 via Eq. (4.3a), where τp := 𝑛p

density and ⟨·⟩ denotes the average with respect to the configuration PDF. In particular, the features

50

c𝑖 need to satisfy the proper invariant and symmetry conditions inherited from the encoders b𝑖 (·)

such that the constructed continuum model can strictly preserve frame-indifference condition:

(cid:101)τp = QτpQ𝑇 , G((cid:101)c1, · · · ,

(cid:101)c𝑛) = QG(c1, · · · , c𝑛)Q𝑇 ,

(4.4)

where the superscript(cid:101)· denotes the corresponding values under an arbitrary orthogonal transformation
by Q ∈ SO(3).

To construct the encoder b(·), we note that the micro-scale potential 𝑉 (q) is translational and

rotational invariant. Accordingly, let r∗(q) ∈ R3𝑁−6 (we consider the general case 𝑁 ≥ 3 here)

denote the translational-rotational-invariant configuration vector and r(q) ∈ R3𝑁−3 denote the

translational-invariant configuration vector consisting of 𝑁 − 1 linearly independent position vectors.

Since 𝑁𝑏 ≥ 𝑁 − 1 for all molecules, one straightforward choice is the first 𝑁 − 1 bond connection

vectors, i.e.,

r = [r1; r2; · · · ; r𝑁−1] ,
r 𝑗 = q 𝑗1 − q 𝑗2
r∗ = (cid:2)|r1| , |r2| , |r12| , |r3| , |r13| , |r23| , |r4| , |r24| , |r34| , · · · , |r𝑁−1| , (cid:12)

1 ≤ 𝑗 ≤ 𝑁 − 1,

,

(cid:12)r(𝑁−2)(𝑁−1)

(4.5)

(cid:3) ,

(cid:12)
(cid:12)

where r 𝑗 𝑘 := r 𝑗 − r𝑘 . We note that this form applies to general molecular structures; r determines

the molecular structure up to translations. Specifically, r∗ represents the 3𝑁 − 6 degrees of freedom

after eliminating translational and rotational degrees of freedom, and r suffices to fully determine

the translational invariant polymer configuration and strictly retains the rotational symmetry in

accordance with q, i.e.,

r 𝑗 (Qq) = Qr 𝑗 (q),

r∗(Qq) = r∗(q).

To preserve rotational symmetry, one straightforward approach is to represent b(·) in the linear
𝑗=1 . However, this choice yields the trivial macro-scale feature, i.e., (cid:10)r 𝑗 (cid:11) ≡ 0,

space spanned by (cid:8)r 𝑗 (cid:9) 𝑁−1

due to the rotational symmetry. Alternatively, we construct the following second-order tensor

c𝑖 = ⟨b𝑖 (r)⟩,

b𝑖 = f𝑖f𝑇
𝑖 ,

1 ≤ 𝑖 ≤ 𝑛,

f𝑖 = 𝑔𝑖 (r∗)

𝑁−1
∑︁

𝑗=1

𝑤𝑖 𝑗 r 𝑗 ,

51

(4.6)

where [𝑤𝑖 𝑗 ]1≤𝑖≤𝑛,1≤ 𝑗 ≤𝑁−1 are the weights and {𝑔𝑖 (·)}𝑛

𝑖=1 is a set of scalar functions that encodes the

polymer intramolecular interaction. Both terms will be learned from the micro-scale description

and represented by deep neural networks (DNNs). Rotational symmetries can be naturally inherited,
i.e., (cid:101)c = ⟨b((cid:101)r)⟩ ≡ QcQ𝑇 . Compared with the special form for dumbbell molecules in Ref. (Lei
et al., 2020), Eq. (4.6) provides a general form of c applicable to multi-bead molecules of arbitrary

structure since r and r∗ fully determine the 3𝑁 − 3 translational invariant polymer configuration. In

the remaining of the paper, we will abuse the notation and denote b(q) as b(r).

Besides the polymer stress model (4.3a), the remaining task to close Eq. (4.2) is the construction

of the constitutive dynamics (4.3b) of the macro-scale features {c𝑖}𝑛

𝑖=1. There are two issues to deal

with: the proper form of the objective time derivative of c𝑖 and the accurate estimation of their time

evolution. In the literature, the objective tensor derivative, denoted by Dc𝑖

D𝑡 , is often chosen to take

some heuristic forms (e.g. the convected (Oldroyd and Wilson, 1950) and corotational (Zaremba,

1903) forms). Moreover, the time-series samples collected from the micro-scale simulations are

generally super-imposed with pronounced sampling error; direct estimation of the time derivative as

was done in (Rudy et al., 2017; Raissi et al., 2019b; Seryo et al., 2020) will end with noisy data.

Fortunately, both challenges are addressed in DeePN2 using an explicit micro-macro correspondence.

The dynamics of c𝑖 can be derived from the its micro-scale correspondence b𝑖 (r) in the form of the

micro-scale configuration r, i.e.,

d
d𝑡 c𝑖 − κ :

(cid:42)𝑁−1
∑︁

𝑗=1

(cid:43)

r 𝑗 ⊗ ∇r 𝑗 ⊗ b𝑖

=

𝑘 𝐵𝑇
𝛾

(cid:42) 𝑁−1
∑︁

(cid:43)

𝐴 𝑗 𝑘 ∇r 𝑗 · ∇r𝑘 b𝑖

𝑗,𝑘=1
𝑁𝑏∑︁

(cid:42)𝑁−1
∑︁

−

1
𝛾

𝐴 𝑗 𝑘 ∇r𝑘𝑉 (r1, · · · , r𝑁𝑏) · ∇r 𝑗 b𝑖

,

(cid:43)

(4.7)

𝑗=1
where κ := ∇u𝑇 , 𝛾 is the friction coefficient and r 𝑗 is the connection vector as defined in Eq. (4.5)
for 𝑗 > 𝑁 − 1. We abuse the notation and denote 𝑉 (q) as 𝑉 (r1, · · · , r𝑁𝑏) = (cid:205)𝑁𝑏
𝑗=1
molecular structure and interaction are specified via A ∈ R𝑁𝑏×𝑁𝑏, which is defined by

𝑉𝑏 (𝑟 𝑗 ). The

𝑘=1

A = SS𝑇 ,

𝑆 𝑗 𝑘 =

+1,

𝑘 = 𝑗1,

−1, 𝑘 = 𝑗2,

0,

else






52

1 ≤ 𝑗 ≤ 𝑁𝑏,

1 ≤ 𝑘 ≤ 𝑁,

(4.8)

where 𝑗1 and 𝑗2 are the same notations as those in Eq. (4.1). We note that Eq. (4.7) only requires

the first (𝑁 − 1) rows of A since the polymer configuration can be fully determined by r1, · · · , r𝑁−1.

As a special case, if the molecule takes the chain shape, A recovers the standard Rouse matrix (Bird

et al., 1987; Rouse, 1953).

Eq. (4.7) defines the dynamics for the features {c𝑖}𝑛

𝑖=1, derived from their micro-scale correspon-

dences. In particular, given the proposed form of the encoder functions (4.6), we can show that the

two combined terms of the left-hand-side of Eq. (4.7) strictly preserve rotational symmetry (see

Section 4.2.3). This leads to an important observation that the two combined terms provide the

generalized form for the macro-scale objective tensor derivative Dc𝑖

D𝑡 . Unlike the heuristic choices in

empirical models, the new form retains a clear micro-scale physical interpretation. Furthermore, all

the modeling terms in the form of ⟨·⟩ can be directly evaluated using samples collected from the

micro-scale simulations under the corresponding flow condition. This enables us to avoid estimating

the time derivative values from the noise-prone time-series samples. Accordingly, the macro-scale

constitutive dynamics takes the form

dc𝑖
d𝑡

− κ : E𝑖 =

𝑘 𝐵𝑇
𝛾 H1,𝑖 (c1, · · · , c𝑛) −

1
𝛾 H2,𝑖 (c1, · · · , c𝑛),

(4.9)

where the individual terms will be represented by proper neural networks and parameterized by

matching their micro-scale correspondences, i.e.,

E𝑖 (c1, · · · , c𝑛) =

(cid:42)𝑁−1
∑︁

r 𝑗 ⊗ ∇r 𝑗 ⊗ b𝑖

(cid:43)

,

(cid:43)

𝑗=1
(cid:42) 𝑁−1
∑︁

H1,𝑖 (c1, · · · , c𝑛) =

H2,𝑖 (c1, · · · , c𝑛) =

𝐴 𝑗 𝑘 ∇r 𝑗 · ∇r𝑘 b𝑖

,

(4.10)

𝑗,𝑘=1
(cid:42)𝑁−1
∑︁

𝑁𝑏∑︁

𝑗=1

𝑘=1

𝐴 𝑗 𝑘 ∇r𝑘𝑉 (r1, · · · , r𝑁−1) · ∇r 𝑗 b𝑖

.

(cid:43)

4.2.3 Rotational frame-indifference of the constitutive dynamics for the multi-bead encoder

function

We consider a polymer molecule consisting of 𝑁 particles. Let r = [r1; r2; · · · ; r𝑁−1] denote the
𝑖=1 q𝑖/𝑁 (cid:3)

polymer configuration, so that there exists an invertible linear transformation between (cid:2)r; (cid:205)𝑁

53

and [q1; q2; · · · ; q𝑁 ], where q𝑖 is the position of the 𝑖-th particle. In fact, there are multiple choices

for r, including the one we have applied in Eq. (4.5), where r consists of (𝑁 − 1) edges of a spanning

tree in the bead-bond structure.

We consider a second-order tensor taking the general form

b = f (1) (r)f (2) (r)𝑇 ,

f (1) (r) =

𝑁−1
∑︁

𝑗=1

𝑔(1)
𝑗

(r∗)r 𝑗 ,

f (2) (r) =

𝑁−1
∑︁

𝑗=1

𝑔(2)
𝑗

(r∗)r 𝑗 ,

(4.11)

where r∗ is a translational-rotational-invariant vector and 𝑔(1) and 𝑔(2) are two scalar functions. We

note that the encoder in the form of Eq. (4.11) is more general than Eq. (4.6).

In this section and the next, we consider two frames: frame 1 is static inertial, and frame 2 is

rotating with respect to frame 1 with an time dependent orthogonal transformation Q(𝑡). Let ˜x, ˜v, ˜b

and x, v, b denote the positions, velocities, and second-order tensors in frame 1 and 2 respectively.

They have the following relations:

˜x = Qx,

˜v = Qv + (cid:164)Qx,

˜b = QbQ𝑇 .

The material derivatives in both frames are

d
d𝑡

(cid:12)
(cid:12)
(cid:12)
(cid:12)frame 1

:=

𝜕
𝜕𝑡

+ ˜v · ∇˜x,

d
d𝑡

(cid:12)
(cid:12)
(cid:12)
(cid:12)frame 2

:=

𝜕
𝜕𝑡

+ v · ∇x.

Proposition 4.2.1. With b defined by Eq. (4.11), we have

(4.12)

(4.13)

d
d𝑡 c − κ :

(cid:42)𝑁−1
∑︁

𝑗=1

(cid:43)

r 𝑗 ⊗ ∇r 𝑗 ⊗ b

=

𝑘 𝐵𝑇
𝛾

(cid:42) 𝑁−1
∑︁

(cid:43)

𝐴 𝑗 𝑘 ∇r 𝑗 · ∇r𝑘 b

𝑗,𝑘=1
𝑁𝑏∑︁

(cid:42)𝑁−1
∑︁

𝑗=1

𝑘=1

−

1
𝛾

(cid:43)

(4.14)

𝐴 𝑗 𝑘 ∇r𝑘𝑉p(r) · ∇r 𝑗 b

,

obeys rotational symmetry.
Proof. Let us choose the vector r∗ = (cid:2)|r1|, |r2|, |r12|, |r3|, |r13|, |r23|, · · · , |r𝑁−2,𝑁−1|(cid:3). Denote by
𝑖 the 𝑖-th element of r∗ and r∗
𝑟 ∗

𝑖 the corresponding the 3-dimensional vector, i.e., 𝑟 ∗

6 = |r23| and

r∗
6 = r23. Following Eq. (4.11), b consists of

b =

𝑁−1
∑︁

𝑗,𝑘=1

b 𝑗 𝑘 ,

b 𝑗 𝑘 = 𝑔(r∗)r 𝑗 r𝑇
𝑘 ,

(4.15)

54

where 𝑔(r∗) denotes 𝑔(1)
𝑗

(r∗)𝑔(2)
𝑘

(r∗) for simplicity. With this general form, we have

d
d𝑡

(cid:10) ˜b 𝑗 𝑘 (cid:11) (cid:12)

(cid:12)frame 1 = (cid:164)Q (cid:10)b 𝑗 𝑘 (cid:11) Q𝑇 + Q (cid:10)b 𝑗 𝑘 (cid:11) (cid:164)Q𝑇 + Q

d
d𝑡

(cid:10)b 𝑗 𝑘 (cid:11) (cid:12)

(cid:12)frame 2Q𝑇 .

(4.16)

Moreover, we note that

˜κ :

(cid:32)𝑁−1
∑︁

𝑖=1

(cid:33)

˜r𝑖 ⊗ ∇ ˜r𝑖 ⊗ ˜b 𝑗 𝑘

=

=

=

𝑁−1
∑︁

𝑖=1
𝑁−1
∑︁

𝑖=1
𝑁−1
∑︁

𝑖=1

(cid:104)(cid:16)

QκQ𝑇 + (cid:164)QQ𝑇 (cid:17)

(cid:105)

· Qr 𝑗

· Q ⊗ ∇r𝑖 ⊗

(cid:16)

Qb 𝑗 𝑘 Q𝑇 (cid:17)

(κ · r𝑖) · ∇r𝑖

(cid:16)

Qb 𝑗 𝑘 Q𝑇 (cid:17)

+ (Q𝑇 (cid:164)Qr𝑖) · ∇r𝑖

(cid:16)

Qb 𝑗 𝑘 Q𝑇 (cid:17)

Q(κ · r𝑖) · ∇r𝑖 b 𝑗 𝑘 Q𝑇 + Q

(cid:16)

Q𝑇 (cid:164)Qb 𝑗 𝑘 + b 𝑗 𝑘 (cid:164)Q𝑇 Q

(cid:17)

Q𝑇

+ Q

(cid:32)𝑁−1
∑︁

𝑖=1

𝑇 ( (cid:164)Q𝑇 Q)∇r𝑖 𝑔(r∗)

r𝑖

(cid:33)

r 𝑗𝑟𝑇

𝑘 Q𝑇

=

𝑁−1
∑︁

𝑖=1

Q(κ · r𝑖) · ∇r𝑖 b 𝑗 𝑘 Q𝑇 + (cid:164)Qb 𝑗 𝑘 Q𝑇 + Qb 𝑗 𝑘 (cid:164)Q𝑇 ,

(4.17)

where we have used r𝑖

𝑇 ( (cid:164)Q𝑇 Q)r𝑖 ≡ 0 since (cid:164)Q𝑇 Q is anti-symmetric. Eq. (4.16) and Eq. (4.17) shows

that the combination of the two terms on the left-hand-side of Eq. (4.14) rigorously preserve the

rotational symmetry, i.e.,

(cid:32)

d
d𝑡

(cid:10) ˜b(cid:11) − ˜κ :

𝑁−1
∑︁

𝑖=1

(cid:10) ˜r𝑖 ⊗ ∇ ˜r𝑖 ⊗ ˜b(cid:11)

(cid:33)(cid:12)
(cid:12)
(cid:12)
(cid:12)
(cid:12)frame 1

≡ Q

(cid:32)

d
d𝑡

⟨b⟩ − κ :

𝑁−1
∑︁

𝑖=1

(cid:10)r𝑖 ⊗ ∇r𝑖 ⊗ b(cid:11)

(cid:33)(cid:12)
(cid:12)
(cid:12)
(cid:12)
(cid:12)frame 2

Q𝑇 .

It is straightforward to prove rotational symmetry for the other terms in Eq (4.14).

4.2.4 Symmetry-preserving DNN models

To complete the DeePN2 model, we need to specify the DNN models. These DNN models

should also strictly preserve rotational symmetry. Different from the rotational-invariant scalar

stress model considered in Ref. (Zhou et al., 2021), the second-order tensors G, H1,𝑖, H2,𝑖 need to

satisfy the symmetry condition (4.4) and the fourth-order tensors E𝑖 need to retain the objectivity of

Dc𝑖
D𝑡 . However, there does not exist such a reference frame in which these symmetry constraints can

be satisfied by the macro-scale modeling terms.

55

To handle this problem, we consider the eigen-space of the feature c1 with a fixed form of the

encoder b1(·), e.g., by setting 𝑔1(·) = 𝑤1,: ≡ 1 and let other b𝑖 (·) involved in the training. Consider
the eigen-decomposition c1 = UΛU𝑇 has distinct eigenvalues, where U is the matrix whose columns

are the eigenvectors of c1. U is not unique due to the non-uniqueness of the eigenvectors. Without

loss of generality, we further assume that the first element of u1 to be positive. With the following
lemma, we show that the general form of U can be always written as U( 𝑗) := U𝑆( 𝑗) with 𝑗 = 1, · · · , 4,

where 𝑆( 𝑗) is given by

𝑆(1) =

+1

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

+1

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

+1

, 𝑆(2) =

+1

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

−1

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

+1

, 𝑆(3) =

+1

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

+1

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

−1

, 𝑆(4) =

+1

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

−1

.

(cid:170)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:174)
(cid:172)

−1

Lemma 4.2.2. For a symmetry matrix 𝑀 ∈ R3×3, let 𝑆𝑀 denote the set of matrices with the
transformation of 𝑆( 𝑗), i.e., 𝑆𝑀 := (cid:8)𝑆(1) 𝑀𝑆(1), · · · , 𝑆(4) 𝑀𝑆(4)(cid:9). For any 𝑀 ( 𝑗) := 𝑆( 𝑗) 𝑀𝑆( 𝑗) ∈ 𝑆𝑀,
𝑆(𝑘) 𝑀 ( 𝑗)𝑆(𝑘) ∈ 𝑆𝑀, 1 ≤ 𝑗, 𝑘 ≤ 4. Furthermore, 𝑆𝑀 can be constructed by 𝑀 ( 𝑗), i.e., 𝑆𝑀 ≡
(cid:8)𝑆(1) 𝑀 ( 𝑗)𝑆(1), · · · , 𝑆(4) 𝑀 ( 𝑗)𝑆(4)(cid:9).
Proof. By applying 𝑆( 𝑗) to 𝑀, it is easy to see that the diagonal part of 𝑀 ( 𝑗) remains the same.

Since 𝑀 ( 𝑗) is also symmetric, we only need to check the upper-triangular part, taking the four

possible operations

∗ + +
(cid:170)
(cid:174)
(cid:174)
∗ +
(cid:174)
(cid:174)
(cid:174)
(cid:172)

∗

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

∗ − +
(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

(cid:170)
(cid:174)
(cid:174)
∗ −
(cid:174)
(cid:174)
(cid:174)
(cid:172)

∗

∗ + −
(cid:170)
(cid:174)
(cid:174)
∗ −
(cid:174)
(cid:174)
(cid:174)
(cid:172)

∗

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

(cid:169)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:173)
(cid:171)

∗ − +

(cid:170)
(cid:174)
(cid:174)
∗ −
(cid:174)
(cid:174)
(cid:174)
(cid:172)

∗

,

where “+” represents that the element remains the same and “−” represents a sign change. We see

that number of “−” operations is either 0 or 2. Starting from any of the above choice for 𝑀 ( 𝑗), all of

the four operators yields either 0 or 2 “−” operations. Therefore, 𝑆(𝑘) 𝑀 ( 𝑗)𝑆(𝑘) ∈ 𝑆𝑀. Furthermore,

if the upper triangular part of 𝑀 has distinct absolute values, then ∀𝑀 ( 𝑗), 𝑆𝑘 𝑀 𝑗 𝑆𝑘 ≠ 𝑆𝑘 ′ 𝑀 𝑗 𝑆𝑘 ′

with

𝑘 ≠ 𝑘′, hence 𝑆𝑀 can be constructed by 𝑀 𝑗 . Otherwise, if some upper triangular entries of 𝑀 share

the same absolute value, we can draw the same conclusion accordingly.

56

Now we consider the matrix whose columns are the eigenvectors of ˜c1 = Qc1Q𝑇 , denoted by ˜U.
We can write ˜U = QU𝑆( 𝑗), where 𝑗 ∈ {1, 2, 3, 4}. Accordingly, the DNN input of c𝑖 takes the form

˜U𝑇 ˜c𝑖 ˜U =

(cid:16)

QU𝑆( 𝑗)(cid:17)𝑇

Qc𝑖Q𝑇 (cid:16)

QU𝑆( 𝑗)(cid:17)

= 𝑆( 𝑗)U𝑇 c𝑖U𝑆( 𝑗).

Let 𝑀 = U𝑇 c𝑖U, by using Lemma 4.2.2, it is easy to see that 𝑆U𝑇 c𝑖U can be constructed by taking
𝑗 = 1, · · · , 4.

Proposition 4.2.3. Let U be the matrix whose columns are the eigenvectors of c1. Let the DNN
input be ˆc( 𝑗)

𝑖 = 𝑆( 𝑗)U𝑇 c𝑖U𝑆( 𝑗). The following form of τp

G(c1, · · · , c𝑛) =

1
4

4
∑︁

𝑗=1

U( 𝑗) ˆG(ˆc( 𝑗)
1

, · · · , ˆc( 𝑗)

𝑛 )U( 𝑗)𝑇

, U( 𝑗) = U𝑆( 𝑗).

(4.18)

satisfies the rotational symmetry constraint (4.4).

Finally, to account for the swap of the eigenvectors when the eigenvalues cross over, we consider

the 6 permutations of the three eigenvalues of c1, i.e.,

G(c1, · · · , c𝑛) =

1
24

5
∑︁

4
∑︁

𝑘=0

𝑗=1

U( 𝑗,𝑘) ˆG(ˆc( 𝑗,𝑘)

1

, · · · , ˆc( 𝑗,𝑘)

𝑛

)U( 𝑗,𝑘)𝑇

,

(4.19)

where 𝑘 represents the rank of permutation (e.g., in lexicographical order) and U( 𝑗,𝑘) is a variation

of U( 𝑗) with corresponding column permutation.

During simulation, the eigenvalues of c1 may cross each other. To account for this, we consider

all the 6 permutations of the three eigenvalues, i.e.,

G(c1, · · · , c𝑛) =

1
24

5
∑︁

4
∑︁

𝑘=0

𝑗=1

U( 𝑗,𝑘) ˆG(ˆc( 𝑗,𝑘)

1

, · · · , ˆc( 𝑗,𝑘)

𝑛

)U( 𝑗,𝑘)𝑇

,

(4.20)

where 𝑘 represents the rank of permutation (e.g., in lexicographical order) and U( 𝑗,𝑘) is a variation

of U( 𝑗) with corresponding column permutation. Furthermore, to avoid the eigenvector degeneracy,

we set a threshold value 𝜖 for the eigenvalues. When two eigenvalues approach each other, e.g.,

|𝜆2 − 𝜆3| < 𝜖, we freeze all the eigenvectors until |𝜆2 − 𝜆3| ≥ 𝜖. In this work, we take 𝜖 = 10−3, and

we refer to Section 4.3.6 for detailed numerical studies.

Eq. (4.20) provides the rotation-symmetric form for the second-order stress tensor G, where ˆG

is represented by DNNs. The constitutive model terms H1,𝑖 and H2,𝑖 can be constructed in a similar

57

manner. Finally, we can show the fourth-order tensors {E𝑖}𝑛

𝑖=1 associated with the encoders (4.6)

can be constructed in the form

9
∑︁
κ : E𝑖 = κc𝑖 + c𝑖κ𝑇 + κ : (cid:169)
(cid:173)
(cid:171)

𝑗=1

,

1,𝑖 ⊗ E( 𝑗)
E( 𝑗)
2,𝑖 (cid:170)
(cid:174)
(cid:172)

(4.21)

where E( 𝑗)

1,𝑖 and E( 𝑗)

2,𝑖 are second-order tensors which respect the symmetry condition (4.4) and can
be constructed in the form of Eq. (4.20) (see Prop. 4.2.4). The constructed DeePN2 model takes the

form similar to the general hydrodynamic equations (4.2) and (4.3), where some of the model terms

are represented by DNNs in the form of Eqs. (4.20) and (4.21).
Proposition 4.2.4. The following ansatz of (cid:10)(cid:205)𝑁−1

𝑖=1 r𝑖 ⊗ ∇r𝑖 ⊗ b(cid:11) ensures that the dynamic of

evolution of c retains rotational invariance.

𝑁−1
∑︁

𝑖=1

(cid:10)r𝑖 ⊗ ∇r𝑖 ⊗ b(cid:11) =

𝑁−1
∑︁

(cid:68)

𝑔(1)
𝑗

(r∗)𝑔(2)
𝑘

(r∗) (r 𝑗 ⊗ ∇r 𝑗 + r𝑘 ⊗ ∇r𝑘 ) ⊗ r 𝑗 r𝑇
𝑘

(cid:69)

𝑗,𝑘=1

+

9
∑︁

𝑘=1

E(𝑘)
1

(c) ⊗ E(𝑘)
2

(c),

(4.22)

where c = (c1, · · · , c𝑛), ˜c = (˜c1, · · · , ˜c𝑛), and E1 and E2 satisfy

˜E1 := E1(˜c) = QE1(c)Q𝑇 ,

˜E2 := E2(˜c) = QE2(c)Q𝑇 .

(4.23)

Proof. Without loss of generality, we represent the fourth order tensor by the following two bases

F1(c) ⊗ F2(c) ⊗ F3(c) + F3(c) ⊗ (F2(c) ⊗ F1(c))𝑇{2,3} ,

F1(c), F3(c) ∈ R3, F2(c) ∈ R3×3,

E1(c) ⊗ E2(c),

E1(c), E2(c) ∈ R3×3,

(4.24)

where the super-script 𝑇{2,3} represents the transpose between the 2nd and 3rd indices; also F1, F2,

F3, E1 and E2 satisfy the symmetry conditions

F1(˜c) = QF1(c), F3(˜c) = QF3(c),

E1(˜c) = QE1(c)Q𝑇 , E2(˜c) = QE2(c)Q𝑇 , F2(˜c) = QF2(c)Q𝑇 .

For the term E1(c) ⊗ E2(c), we have

(4.25)

κ : E1(c) ⊗ E2(c) = Tr(κE1(c))E2(c)

(4.26)

58

and

˜κ : ˜E1 ⊗ ˜E2

(cid:12)
(cid:12)frame 1 =

(cid:16)

QκQ𝑇 + (cid:164)QQ𝑇 (cid:17)

:

(cid:16)

QE1(c)Q𝑇 ⊗ ˜E2

(cid:17)

= Tr(κE1(c)) ˜E2 + Tr( (cid:164)QQ𝑇 QE1(c)Q𝑇 ) ˜E2

= Tr(κE1(c)) ˜E2

(cid:16)

≡ Q

κ : E1(c) ⊗ E2(c)(cid:12)

(cid:12)frame 2

(cid:17)

Q𝑇 ,

(4.27)

where we have used Tr( (cid:164)QQ𝑇 ) ≡ 0.

For the term F1(c) ⊗ F2(c) ⊗ F3(c) + F3(c) ⊗ (F2(c) ⊗ F1(c))𝑇{2,3} , we have

κ : F1(c) ⊗ F2(c) ⊗ F3(c) = F2(c)𝑇 κF1(c)F3(c)𝑇

(4.28)

and

˜κ : ˜F1 ⊗ ˜F2 ⊗ ˜F3 = QF2(c)𝑇 κF1(c)F3(c)𝑇 Q𝑇 + QF2(c)𝑇 Q𝑇 (cid:164)QF1(c)F3(c)𝑇 Q𝑇 .

(4.29)

On the other hand, we note that

d ˜b
d𝑡

(cid:12)
(cid:12)frame 1 = (cid:164)QbQ𝑇 + Qb (cid:164)Q𝑇 + Q

db
d𝑡

(cid:12)
(cid:12)frame 2Q𝑇 .

(4.30)

To ensure the rotational symmetry of Db

D𝑡 , we have

F2 ≡ I, ∑︁

𝑖

F(𝑖)
1

⊗ I ⊗ F(𝑖)

3 =

𝑁−1
∑︁

(cid:68)

𝑗,𝑘=1

𝑔(1)
𝑗

(r∗)𝑔(2)
𝑘

(r∗)r 𝑗 ⊗ I ⊗ r𝑘

(cid:69)

.

(4.31)

Hence, we have

d
d𝑡 ˜c − ˜κ :
(cid:32)

≡ Q

(cid:32)

∑︁

𝑖

˜F(𝑖)
1

⊗ ˜F(𝑖)
2

⊗ ˜F(𝑖)
3

+ ˜F(𝑖)
3

⊗

(cid:16) ˜F(𝑖)

2

⊗ ˜F(𝑖)
1

(cid:17)𝑇{2,3}

d
d𝑡 c − κ :

(cid:32)

∑︁

𝑖

F(𝑖)
1

⊗ F(𝑖)
2

⊗ F(𝑖)
3

+ F(𝑖)
3

⊗

(cid:16)

F(𝑖)
2

⊗ F(𝑖)
1

(cid:33)(cid:12)
(cid:12)
(cid:12)
(cid:12)
(cid:12)frame 1
(cid:17)𝑇{2,3}

(cid:33)(cid:33)(cid:12)
(cid:12)
(cid:12)
(cid:12)
(cid:12)frame 2

(4.32)

Q𝑇 .

Furthermore, using Eq. (4.31), we obtain

∑︁

F(𝑖)
1

⊗ F(𝑖)
2

⊗ F(𝑖)
3

+ F(𝑖)
3

⊗

(cid:16)

F(𝑖)
2

⊗ F(𝑖)
1

(cid:17)𝑇{2,3}

𝑖

=

𝑁−1
∑︁

(cid:68)

𝑗,𝑘=1

𝑔(1)
𝑗

(r∗)𝑔(2)
𝑘

(r∗) (r 𝑗 ⊗ ∇r 𝑗 + r𝑘 ⊗ ∇r𝑘 ) ⊗ r 𝑗 r𝑘

𝑇 (cid:69)

.

(4.33)

59

Accordingly, the remaining part of (cid:205)𝑁−1
𝑖=1

(cid:10)r𝑖 ⊗ ∇r𝑖 ⊗ b(cid:11) is expanded by

(cid:42)𝑁−1
∑︁

𝑖=1

r𝑖 ⊗ ∇r𝑖

𝑁−1
∑︁

𝑗,𝑘=1

𝑔(1)
𝑗

(r∗)𝑔(2)
𝑘

(r∗) ⊗ r 𝑗 r𝑇
𝑘

(cid:43)

=

9
∑︁

𝑖=1

E(𝑖)
1

(c) ⊗ E(𝑖)
2

(c).

(4.34)

Combining Eq. (4.32), (4.33) and (4.34), we conclude that the decomposition

𝑁−1
∑︁

𝑖=1

(cid:10)r𝑖 ⊗ ∇r𝑖 ⊗ b(cid:11) =

𝑁−1
∑︁

(cid:68)

𝑗,𝑘=1

𝑔(1)
𝑗

(r∗)𝑔(2)
𝑘

(r∗) (r 𝑗 ⊗ ∇r 𝑗 + r𝑘 ⊗ ∇r𝑘 ) ⊗r 𝑗 r𝑇
𝑘

(cid:11)

+

9
∑︁

𝑘=1

E(𝑘)
1

(c) ⊗ E(𝑘)
2

(c)

ensures the objectivity of the time-derivative of c.

4.3 Numerical results

(4.35)

The present DeePN2 model is trained using micro-scale samples collected from the homogeneous

shear flow. We demonstrate the model accuracy and generalization ability by considering various

flows in comparison with the results of the micro-scale simulations for the suspensions with three

different polymer structural models as shown in Fig. 4.1. As we will see, the micro-scale structure

does play an important role in the viscoelastic response. We will use this to examine the DeePN2

model fidelity.

4.3.1 Micro-scale model of the polymer solutions

In the present study, we consider suspensions with three different polymer structures as shown

in Fig. 4.1. Each polymer molecule consists of 𝑁 = 7 beads connected with 𝑁𝑏 FENE bonds, i.e.,

𝑉 (q) =

𝑁𝑏∑︁

𝑗=1

𝑉𝑏 (cid:0)|q 𝑗1 − q 𝑗2 |(cid:1) , 𝑉𝑏 (𝑙) = −

(cid:34)

𝑙2
0 log

1 −

𝑘 𝑠
2

(cid:35)

,

𝑙2
𝑙2
0

(4.36)

where 𝑘 𝑠 represents the spring constant and 𝑙0 is the maximum of the extension length. The chain-

and star-shaped molecules have 𝑁𝑏 = 6 bonds with the same bond parameters 𝑘 𝑠 = 0.1 and 𝑙0 = 2.3

(in reduced unit). The net-shaped molecule is similar to the star-shaped molecule with the same

parameters for the first 6 bonds; 3 additional bonds connect the side chain particles with 𝑘 𝑠 = 0.1

and 𝑙0 = 3.7. The polymer number density of the three suspensions is 𝑛p = 0.3. The solvent is

modeled by the dissipative particle dynamics (DPD) (Hoogerbrugge and Koelman, 1992; Groot and

60

Warren, 1997) with number density 𝑛𝑠 = 4.0. The pairwise interaction between particle 𝑖 and 𝑗

takes the standard form

F𝑖 𝑗 = F𝐶

𝑖 𝑗 + F𝐷

𝑖 𝑗 + F𝑅
𝑖 𝑗 ,

F𝐶
𝑖 𝑗 =

F𝐷
𝑖 𝑗 =





−𝛾𝑤𝐷 (𝑟𝑖 𝑗 )(v𝑖 𝑗 · e𝑖 𝑗 )e𝑖 𝑗 ,

𝑟𝑖 𝑗 < 𝑟𝑐

0,

𝑟𝑖 𝑗 > 𝑟𝑐

,

F𝑅
𝑖 𝑗 =

𝑎(1.0 − 𝑟𝑖 𝑗 /𝑟𝑐)e𝑖 𝑗 ,

𝑟𝑖 𝑗 < 𝑟𝑐

,

0,

𝑟𝑖 𝑗 > 𝑟𝑐

𝜎𝑤 𝑅 (𝑟𝑖 𝑗 )𝜉𝑖 𝑗 e𝑖 𝑗 ,

𝑟𝑖 𝑗 < 𝑟𝑐

,

0,

𝑟𝑖 𝑗 > 𝑟𝑐








where r𝑖 𝑗 = r𝑖 − r 𝑗 , 𝑟𝑖 𝑗 = |r𝑖 𝑗 |, e𝑖 𝑗 = r𝑖 𝑗 /𝑟𝑖 𝑗 , and v𝑖 𝑗 = v𝑖 − v 𝑗 , 𝜉𝑖 𝑗 are independent identically

distributed (i.i.d.) Gaussian random variables with zero mean and unit variance. 𝛾 and 𝜎 are related

with the system temperature by the second fluctuation-dissipation theorem (Español and Warren,

1995) as 𝜎2 = 2𝛾𝑘 𝐵𝑇, where 𝑘 𝐵𝑇 is set to 0.25. The detailed parameters are given in Tab. 4.1.

Table 4.1 Parameters (in reduced unit) of the micro-scale model of the polymer solution (S-solvent,
P-polymer).

𝑎
S-S 4.0
S-P 0.0
P-P 4.0

𝛾
5.0
40.0
0.01

𝜎
1.58
4.47
0.071

𝑘
0.25
0.0
1.0

𝑟𝑐
1.0
1.0
0.7

4.3.2 Collecting training samples

Collecting training samples is one of the most important steps in the construction of DeePN2.

To obtain reliable models, we need to ensure that the training sample set is representative enough of

all the practical situations that the model is intended for. In the present study, we collect the training

samples in shear flow with shear rate (cid:164)𝛾 ∈ [0, 0.09]. Since the training of the DeePN2 model only

requires discrete polymer configurations rather than time-series samples, one convenient approach

is to consecutively increase the shear rate and collect the discrete configurations during the shear

extension and relaxation process, where the inclusion of the relaxation process can facilitate the

sampling of polymer configuration phase space due to the viscoelastic hysteresis effect. 32000

samples are collected where each sample consists of 5000 polymer configurations, which will be

employed to evaluate the constitutive dynamics terms ⟨·⟩. Due to the permutation symmetry of the

61

the particle label, the effective number of configurations per sample is 1 × 104 for the chain-shaped

molecule and 3 × 104 for the star- and net-shaped molecules.

4.3.3 Training procedure

The DeePN2 model is constructed via the training of the NN representations of the encoder

𝑗=1, stress model G, evolution dynamics (cid:8)H1, 𝑗 (cid:9)𝑛

mappings (cid:8)𝑔 𝑗 (r∗)(cid:9)𝑛
𝑗=1 and the 4th order
tensors (cid:8)E 𝑗 (cid:9)𝑛
𝑗=1 of the objective tensor derivatives. In this study, we choose 𝑛 = 3 encoders and
fix 𝑔1(r∗) ≡ 1. For the chain-shaped molecule, we set 𝑤1,𝑖 = 1 − 𝑖/𝑁, 1 ≤ 𝑖 ≤ 𝑁 − 1 and (cid:205)𝑖 𝑤1,𝑖r𝑖

𝑗=1, (cid:8)H2, 𝑗 (cid:9)𝑛

represents the orientation between the free-end particle and the center of mass. For the star- and

net-shaped molecules, we set 𝑤1,1 = 1 and 𝑤1,𝑖 = 0 for 𝑖 ≥ 2. All terms are represented by the

fully connected NN. The number of hidden layers are set to be (120, 120, 120), (300, 300, 300),

(400, 400, 400), (450, 450, 450), (560, 560, 560), respectively. The activation function is taken to
be the hyperbolic tangent. We emphasize that the mappings (cid:8)𝑔 𝑗 (r∗)(cid:9)𝑛
𝑗=1 and weights 𝑤 ∈ R𝑛×(𝑁−1)
involve in the training process for the joint learning of the encoders (cid:8)b 𝑗 (r)(cid:9)𝑛
𝑗=1 defined in Eq. (4.6)
and the macro-scale features (cid:8)c 𝑗 (cid:9)𝑛
𝑗=1, although they do not appear explicitly in the macro-scale

hydrodynamic equations.

The DNNs are trained by the Adam stochastic gradient descent method (Kingma and Ba, 2015)

for 20 epochs, using 5 samples per batch size. The initial learning rate is 2.8 × 10−4 and decay rate

is 0.75 per 20000 steps.

Similar to Ref. (Lei et al., 2020), the loss function is defined by

𝐿 = 𝜆𝐺 𝐿𝐺 + 𝜆𝐻1

𝐿𝐻1 + 𝜆𝐻2

𝐿𝐻2 + 𝜆E 𝐿E,

where 𝜆𝐺 = 0.2, 𝜆𝐻1 = 0.1, 𝜆𝐻2 = 0.6 and 𝜆E = 0.1 are hyperparameters. For each training batch
of 𝑚 training samples, 𝐿𝐺, 𝐿𝐻1, 𝐿𝐻2, 𝐿E of the system are given by

62

(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)

𝐿𝐺 =

𝐿𝐻1 =

𝐿𝐻2 =

𝑚
∑︁

𝑛
∑︁

𝑙=1

𝑖=1

𝑚
∑︁

𝑛
∑︁

𝑙=1

𝑖=1

𝑚
∑︁

𝑛
∑︁

𝑙=1

𝑖=1

G𝑖 (c(𝑙)) −

(cid:42) 𝑁𝑏∑︁

𝑘=1

r𝑘 ⊗ ∇r𝑘𝑉

2

(cid:43) (𝑙)(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)

H1,𝑖 (c(𝑙)) −

(cid:42) 𝑁−1
∑︁

𝑗,𝑘=1

𝐴 𝑗 𝑘 ∇r 𝑗 · ∇r𝑘 b𝑖

H2,𝑖 (c(𝑙)) −

(cid:42)𝑁−1
∑︁

𝑁𝑏∑︁

𝑗=1

𝑘=1

𝐴 𝑗 𝑘 ∇r𝑘𝑉 · ∇r 𝑗 b𝑖

2

(cid:43) (𝑙)(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:43) (𝑙)(cid:13)
2
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)

(4.37)

9
∑︁

𝑛
∑︁

𝑚
∑︁

𝐿E =

(cid:43) (𝑙)(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
(cid:13)
, · · · , c(𝑙)
where ∥ · ∥2 denotes the total sum of squares of the entries in the tensor, and c(𝑙) = (c(𝑙)
𝑛 ).
1

1,𝑖 (c(𝑙)) ⊗ E(𝑠)
E(𝑠)

2,𝑖 (c(𝑙)) −

𝑤𝑖 𝑗 𝑤𝑖 𝑗 ′r 𝑗 r 𝑗 ′

r𝑘 ⊗ ∇r𝑘 𝑔2

(cid:42)𝑁−1
∑︁

𝑁−1
∑︁

𝑗, 𝑗 ′=1

𝑖 ⊗

𝑘=1

𝑠=1

𝑙=1

𝑖=1

𝑇

2

,

4.3.4 Numerical Result of Reverse Poiseuille flow

First, we consider the reverse Poiseuille flow in a 60 × 100 × 60 domain (in reduced unit) with

the opposite body force fext = (0.016, 0, 0) applied to each half of the domain divided by the plane

𝑦 = 50 starting from 𝑡 = 0. At 𝑡 = 800, the external force is removed. The relaxation process of

the flow field is recorded until the total simulation time 𝑡 = 1600. For all the three systems, the

predictions from DeePN2 agree well with the micro-scale simulations results, as shown in Fig. 4.2.

In particular, the flow velocity fields of the three systems are nearly identical at the initial stage

𝑡 ∈ [0, 200], as the development of the flow field is dominated by the solvent and the near-equilibrium

responses of the polymer molecules in this regime. Starting from 𝑡 = 250, the velocity fields of

the three systems exhibit distinct evolution processes. The velocity of the chain-shaped molecule

suspension exhibits the largest oscillation and the longest development stage during 𝑡 ∈ [250, 800].

In contrast, the velocity of the star-shaped molecule suspension exhibits moderate oscillation and

shows an apparent increase during 𝑡 ∈ [400, 800], indicating that the polymer elastic energy reaches

a plateau earlier than the chain-shaped system. Moreover, the velocity of the net-shaped molecule

suspension exhibits the smallest oscillation, indicating that the three additional side-chains further

affect the rheological properties of the polymer suspension.

Such differences can also be studied by examining the polymer stress development. As shown

63

Figure 4.2 The velocity 𝑢𝑥 (left) and polymer stress τp (right) of the reverse Poiseuille flow (𝑦 = 6) of
the polymer suspensions of three different molecule structures shown in Fig. 4.1. τp is normalized
by polymer number density 𝑛p, i.e., it is the stress energy per polymer (the same for the remaining
figures). With the same FENE bond, the polymer suspensions exhibit different flow responses due
to the different molecule structural mechanics. The dark blue lines with rough oscillations denote
the micro-scale simulation results; the solid lines with symbols denote the DeePN2 predictions.

in Fig. 4.2, the value of τp𝑥𝑥 for the chain-shaped molecule suspension keeps increasing through
the development stage 𝑡 ∈ [0, 800] while for the star-shaped molecule, τp𝑥𝑥 shows only a moderate
increase. In contrast, the net-shaped molecule suspension reaches steady state at about 𝑡 = 400.

Moreover, the steady value of the shear stress τp𝑥𝑦 of the chain-shaped molecule is also larger than

the star-shaped and the net-shaped molecules, indicating the largest restored elastic energy. This

result is also consistent with the larger velocity oscillation from the minimal values to 0 during the

relaxation process with 𝑡 ∈ [800, 1000].

The different rheological properties of the three polymer suspensions can be understood as

follows. Although both the chain-shaped and star-shaped molecules have 6 identical FENE bonds,

the chain-shaped molecule is less symmetric than the star-shaped molecule. Accordingly, it shows

larger dispersion in the R18 configuration space, and hence, is more flexible than the star-shaped

molecule. The elastic response time of the chain-shaped molecule suspension is longer than that

of the star-shaped molecule suspension; larger elastic energy can be restored during the relaxation

stage. On the other hand, the net-shaped molecule is more rigid than the star-shaped molecule due

to the additional bond interaction.

Another important feature of non-Newtonian fluids is the hysteresis effect. Classical models such

64

050010001500 0.0500.050.10.15tτP xx,  τP xy0500100015000246810Figure 4.3 The evolution of the polymer stress τp and conformation tensor c1 obtained from the
reverse Poiseuille flow (𝑦 = 6) of the polymer suspensions. The clockwise loops represent the
development and relaxation processes. For the visualization, the conformation tensor component
𝑐1𝑥𝑥 is rescaled by the maximum value obtained from the micro-scale simulation.

as Hookean and FENE-P cannot capture such effects (Doyle et al., 1998; Lielens et al., 1998). Fig.

4.3 shows the evolution of the polymer stress and conformation tensor for the chain- and star-shaped

molecule suspensions. The clockwise loops show the hysteresis effects during the development and

relaxation processes; the non-unique stress values indicate that linear and mean field approximations

are insufficient in describing the viscoelastic response of the system. In contrast, these effects

are accurately captured with the DeePN2 model. Similar to Fig. 4.2, the chain-shaped molecule

suspension shows more pronounced hysteresis effect due to the larger dispersion in the configuration

space, reflected as the larger “loop area” than the results for star-shaped molecule suspension.

4.3.5 Numerical Result of Womersley flow

Next, we investigate the Womersley flow (Womersley, 1955) by applying the opposite oscillating

body force fext = (± 𝑓0 cos(2𝜋𝜔𝑡), 0, 0) to each half of the domain along the z-direction, where we

set 𝑓0 = 0.012 and 𝜔 = 1/3000. Fig. 4.4 shows the velocity development of the star- and net-shaped

molecule suspensions. Similar to the reverse Poiseuille flow, the net-shaped molecule suspension

shows less pronounced viscoelastic responses, reflected as the slower decay near 𝑡 ∈ [200, 400]

and the larger oscillation due to the less elastic energy storage. For comparison, we also show the

prediction from the conventional FENE-P model. The parameters are chosen to match the dynamics

of the orientation tensor (the vector between two free-end particles) near equilibrium. As expected,

65

c1 xxτP xx00.250.50.75102.557.510DeePN2   chainDeePN2   starMD   chainMD   starc1 xxτP xy00.250.50.7510123Figure 4.4 The oscillating Womersley flow of the star- and net-shaped molecule suspensions
predicted from the micro-scale simulation, DeePN2 and the FENE-P model. The FENE-P model
parameters are chosen to match the dynamics of the orientation tensor (the vector between two
free-end particles) near equilibrium. Left: the velocity evolution 𝑢𝑥 (𝑦, 𝑡) at 𝑦 = 6. Right: the
velocity profile 𝑢𝑥 (𝑦, 𝑡) at 𝑡 = 6450.

the FENE-P model shows limitations for predicting the flow responses of the two suspensions.

The distinct viscoelastic responses of the different suspensions can be further elucidated by

examining the elongation flow. We impose the traceless flow gradient ∇u = diag( (cid:164)𝜖, − (cid:164)𝜖, 0) where

the strain rate (cid:164)𝜖 is set to be 4 × 10−4. Fig. 4.5 shows the stress development of the chain- and

star-shaped molecule suspensions. The micro-scale simulations are imposed by the generalized

uniaxial extension flow boundary conditions (Nicholson and Rutledge, 2016; Murashima et al.,

2018). Compared with the shear flow, the elongation flow yields larger extension and longer

processes, as was shown in experimental studies (Smith et al., 1999); the steady state is achieved at

about 𝑡 = 2.5 × 103 and 𝑡 = 104 for the star- and chain-shaped molecule suspensions, respectively.

Moreover, the steady stress value 𝜏p𝑥𝑥 of the chain-shaped molecule suspension is much larger

than the value of the star-shaped molecule suspension. Such differences are also due to the larger

flexibility of the chain-shaped molecule, which produces a stronger extension under external flow.

DeePN2 successfully captures the different responses and shows good agreement with the micro-scale

simulations for both cases.

Finally, we consider the Taylor-Green vortex flow (Taylor, 1934; Thomases and Shelley, 2007)

in a 100 × 100 × 160 domain (in reduced unit) of the micro-scale simulation. The external force

66

tux02000400060008000 0.100.10.2DeePN2   starDeePN2   netFENE PMD   star/netyux0255075100 0.1 0.0500.050.1Figure 4.5 The elongation flow of the chain- and star-shaped molecule suspensions predicted from the
micro-scale simulation and DeePN2. With the same bond potential and strain rate, the chain-shaped
molecule suspension yields larger elongation stress. The lines with rough oscillations denote the
micro-scale simulation results; the solid lines with symbols denote the DeePN2 predictions.

fext = ( 𝑓𝑥, 𝑓𝑦, 0) is applied to the domain following

𝑓𝑥 (𝑥, 𝑦) = −2 𝑓0 sin

(cid:19)

(cid:18) 2𝜋𝑥
𝐿

cos

(cid:19)

(cid:18) 2𝜋𝑦
𝐿

,

𝑓𝑦 (𝑥, 𝑦) = 2 𝑓0 cos

(cid:19)

(cid:18) 2𝜋𝑥
𝐿

(cid:19)

(cid:18) 2𝜋𝑦
𝐿

,

sin

where 𝐿 = 100 and 𝑓0 = 6 × 10−3. Periodic boundary conditions are imposed along all of the

three directions. The force field imposes an elongation to the flow field along the x-direction and

a compression along the y-direction. The flow near the center (𝐿/2, 𝐿/2) resembles the planar

elongation flow. Four vortices appear at (𝐿/2 ± 𝐿/4, 𝐿/2 ± 𝐿/4). Figure. 4.6(a-b) shows the

steady-state velocity field. Compared with the star-shaped molecule suspension, the velocity field of

the chain-shaped molecule suspension shows larger deviation from the symmetric structure of the

Newtonian flow (i.e., ∝ [− sin (2𝜋𝑥/𝐿) cos (2𝜋𝑦/𝐿) , cos (2𝜋𝑥/𝐿) sin (2𝜋𝑦/𝐿)]) due to the larger

polymer stress across the flow regime. Furthermore, the two suspensions yield different velocity

magnitude, as shown in Fig. 4.6(c). Fig. 4.6(d) shows the velocity development at (75, 49). The

velocities of both suspensions achieve a similar maximum value near 𝑡 = 30 and decay along with

the polymer stress development. However, the star-shaped molecule suspension reaches the steady

state much earlier with a larger velocity than the chain-shaped molecule suspension.

Fig. 4.7 (a-b) shows the steady-state stress field for the two suspensions. We see that the

chain-shaped molecule suspension exhibits larger polymer stress variation along the elongation

and contraction directions, reflected in the larger loop area in Fig. 4.7(b). Such difference is also

67

tτP xx,  τP yy0100002000011.522.53DeePN2   chainDeePN2   starMD   chainMD   starFigure 4.6 The velocity field of the Taylor-Green vortex flow of the chain- and star-shaped molecule
suspensions predicted from the micro-scale simulations and DeePN2. (a-b) The 2D steady-state
velocity field of the chain- and star-shaped molecule suspensions from the micro-scale simulations.
The velocity field of the chain-shaped system yields more pronounced deviations from the symmetric
Newtonian flow due to the more pronounced polymer stress across the flow regime.
(c) The
steady-state 1D velocity profile 𝑢𝑥 (𝑥, 𝑦 = 49). The solid and dashed lines represent the predictions
from the micro-scale simulations and the DeePN2 model, respectively. (d) The time history of
𝑢𝑥 (𝑥 = 75, 𝑦 = 49).

consistent with the more pronounced asymmetric velocity field shown in Fig. 4.6(a-b). In addition,

we also examine the transient states where the flow undergoes intricate and heterogeneous process.

Fig. 4.7(c) shows the stress development at point (49, 35), where 𝜏p𝑥𝑥 and 𝜏p𝑦𝑦 cross over during
the evolution. During the initial stage, 𝜏p𝑦𝑦 increases along with the flow development towards
to the stagnation point. At 𝑡 > 150, 𝜏p𝑦𝑦 decreases due to the compression along the y-direction.
Meanwhile, 𝜏p𝑥𝑥 increases and achieves a steady state slightly larger than 𝜏p𝑦𝑦 for the star-shaped

solution. On the other hand, the chain-shaped solution ends up with a significantly larger value of

𝜏p𝑥𝑥 due to the larger molecule flexibility and further extension along the x-direction. The different
viscoelastic responses are also reflected in the stress development at point (49, 49). As shown in

68

xy025507510002550751000.020.0150.010.0050 0.005 0.01 0.015 0.02uxxy02550751000255075100xux0255075100 0.03 0.01500.0150.03chainstartux05001000150020000.030.060.091011021030.030.060.09Figure 4.7 The stress field of the Taylor-Green vortex flow of the chain- and star-shaped molecule
suspensions predicted from the micro-scale simulations and DeePN2. (a) The 2D steady-state
stress field of the chain-shaped molecule suspension from the micro-scale simulations. (b) The
1D steady-state stress profiles 𝜏p𝑥𝑥 (𝑥, 𝑦 = 49) and 𝜏p𝑥𝑥 (𝑥 = 49, 𝑦). The chain-shaped molecule
suspension yields larger stress variations (i.e., the “loop area”) along the flow domain. (c-d) The
stress evolution of 𝜏p𝑥𝑥 (𝑡) and 𝜏p𝑦𝑦 (𝑡) at the points (49, 35) and (49, 49), respectively. The dashed
and the solid lines denote the micro-scale simulations and the DeePN2 predictions, respectively.

Fig. 4.7(d), the chain-shaped solution exhibits longer evolution of 𝜏p𝑥𝑥 and larger steady value than

the star-shaped solution. DeePN2 successfully captures such micro-structure-induced rheological

differences and shows good agreement with the micro-scale simulation results.

4.3.6 Validation of the rotational-symmetry preserving NN representation

To validate the performance of the proposed DNN representation, we check the accuracy of the

modeling terms given a set of conformation tensors c1, · · · , c𝑛 under different unitary transformations.

Fig. 4.8 shows the relative error under each transformation. The DNN representation (4.18) yields

the same results under all the transformation. In contrast, the DNN without accounting for the four

transformations yields significant error due to the non-uniqueness of the eigenvectors of c1.

69

xy0255075100025507510010.59.58.57.56.55.54.53.52.51.5τPxxx, yτP xx51015202504812chainstartτP xx, τP yy101102103123τPxx   chain τPxx   starτPyy   chain τPyy   startτP xx,  τP yy101102103036912Figure 4.8 The relative 𝑙∞ error of the model prediction under randomly chosen orthogonal
transformations without (left) and with (right) accounting for the four eigen-space transformations
in Eq. (4.18).

In addition, we examine the 2D Taylor-Green vortex flow where the evolution of c1 becomes

degenerate at certain points. Fig. 4.9 shows the stress evolution at (45, 37). At 𝑡 = 1080, the

eigenvalues 𝜆2 and 𝜆3 cross over. Concurrently, the prediction of the polymer stress τp from the

model without considering the swap of u2 and u3 shows apparent deviations near the regime as

shown in Fig. 4.9. In contrast, the prediction from the model retaining the eigenvalue permutation

trained by Eq. (4.20) shows good agreement with the MD results.

Figure 4.9 Stress evolution of the Taylor-Green vortex flow at position (45, 37) of the chain-shaped
molecule suspension. Left: prediction without considering the swap of eigenvectors when the two
eigenvalues approaches near 𝑡 = 1255 as shown in the inset plot. Right: predictions from the model
retaining the eigenvalue permutation trained by Eq. (4.20). The dashed and the solid lines denote
the micro-scale simulations and the DeePN2 predictions, respectively.

70

rotation test24681010 310 210 1100101rotation test24681010 810 710 6GΕH1H2tτP xx, τP yy, τP zz01500300045001234tλ1, λ2, λ30.20.40.6tτP xx, τP yy, τP zz01500300045001234τPxxτPyyτPzz4.4 Discussion

We have developed a general machine-learning based model, DeePN2, for describing the non-

Newtonian hydrodynamics for polymer solutions with arbitrary molecular structure and interaction.

The constructed model retains a clear physical interpretation and faithfully encodes the micro-scale

structural information into the macro-scale hydrodynamics, where conventional models based on

empirical closures generally show limitations. In particular, for the chain- and star-shaped molecule

suspensions with the same bead number and bond interaction, DeePN2 successfully captures the

different viscoelastic responses arising from the different molecular structural symmetry (i.e., the

effective rigidity) in the configuration space without additional human intervention. Unlike the

direct evaluation or moment-closure representations of the configurational PDF, the present DeePN2

model directly learns a set of micro-to-macro mappings to probe the optimal approximations of the

constitutive dynamics in terms of the macro-scale features, and thereby circumventing the numerical

challenges due to the high-dimensionality of the polymer configuration space. This multi-scaled

nature enables us to learn the constitutive dynamics of the macro-scale features directly from the

kinetic equations of their micro-scale counterparts using only discrete rather than the time-derivative

samples commonly used in the machine learning-based models of complex dynamic problems.

One thing we have not investigated systematically is the generation of training samples. For

DeePN2 to be truly reliable, the training samples should be representative enough for all the practical

situations that one might encounter. However, due to the cost associated with generating such

training samples, we would also like the training set to be as small as possible. This calls for an

adaptive procedure for generating the training sample, such as the concurrent learning procedure

discussed in (E et al., 2021). The present DeePN2 models are trained with samples collected

from homogeneous shear flow. Even though the numerical predictions show good agreement with

micro-scale simulations for a variety of flows, one should not expect this to be generally the case.

Further work on sampling is needed to make sure that one can produce truly reliable DeePN2 models.

Furthermore, instead of the general form (4.6), a specific design of the encoders b(·) accounting for

the molecule symmetry and rigidity may facilitate the extraction of the macro-scale features c. In

71

addition, more accurate micro-scale kinetic models accounting for the heterogeneous hydrodynamic

interactions and non-Markovianity (Lei et al., 2016; Lei and Li, 2021) can be used to construct the

macro-scale constitutive dynamics. Finally, the adaptive choice of the number of features and the

enhanced sampling of the discrete micro-scale configurations may further improve the performance

of the DeePN2 model. We leave these issues for future work.

72

CHAPTER 5

CONCLUSION AND OUTLOOK

This dissertation presents advancements in multi-scale modeling by utilizing machine learning to

construct reliable and structure-preserved reduced models with interpretable micro-scale to macro

and meso-scale mapping. We explored two main directions in reduced modeling: (1) micro-to-meso

data-driven stochastic reduced model, and (2) micro-to-macro deep-learning-based non-Newtonian

hydrodynamic model.

The first part on micro-to-meso modeling developed a data-driven method to learn stochastic

reduced models of complex systems that retain a state-dependent memory beyond the standard

GLE with a homogeneous kernel. The main idea is to seek a generalized representation of the

state-dependent memory which satisfies the second fluctuation-dissipation theorem with a coherent

colored noise. To parameterize the representation, traditional two-point correlation functions are

not enough for extracting state-dependent information where we use the state-dependent three-point

correlation functions instead. Efficient training is achieved by constructing the state-dependent

encoders using a set of sparse bases, whose correlations can be efficiently precomputed. The

convolution part in simulation can be efficiently evaluated using the fast convolution algorithm.

Numerical results demonstrate the limitation of the standard GLE and the essential role of the

broadly overlooked state-dependency nature in predicting molecule kinetics related to conformation

relaxation and transition.

The second part on micro-to-macro modeling developed a deep learning-based non-Newtonian

hydrodynamic model, which focused on learning accurate non-Newtonian hydrodynamic models

from micro-scale polymer kinetics. The model aims to close the empirical constitutive hydrodynamics

form with micro-scale level fidelity. First, we establish a micro-macro correspondence via a set

of encoders for the micro-scale polymer configurations and their macro-scale counterparts, a set

of conformation tensors. Instead of directly approximating the momenta of distribution, we use

neural networks to construct the nonlinear encoders. Then the dynamics (including generalized

objective tensor derivative) of conformation tensors can be directly derived from the micro-scale

73

model by the Fokker-Planck equation, where the molecular structural mechanics are automatically

included. Here the dynamics are only related to instantaneous samples instead of time series.

Finally, we use the symmetry-preserving neural networks to parameterize the dynamics based on the

conformation tensors. By learning the model only based on reverse Poiseuille flow, we demonstrate

its accuracy and generalization ability by considering various flows in comparison with the results

of the micro-scale simulations.

Despite these advancements, several challenges remain open for future research.

5.1 High Dimensional Stochastic Reduced Model

Although our work (Ge et al., 2024) shows high promising results, it only works for 1D reduced

systems due to the curse of dimensionality. The main challenge arises from the state-dependent

features, which are directly related to the dimension. Traditional methods are limited for such

situations. However, neural networks provide a powerful tool to construct high-dimensional functions

efficiently. By carefully designing the training process and the formulation of the state-dependent

features, we can develop a practical high-dimensional stochastic reduced model. Furthermore, the

high-dimensional stochastic reduced model can also help build a more accurate DeePN2 model with

state-dependent memory accounted.

5.2 Variational-informed Structure-preserving Macro-scale Reduced Model

Structure-preserving is important in multi-scale modeling, especially for constructing stable

and reliable reduced models. Some existing methods such as deep learning-based coarse-grained

molecular dynamics Zhang et al. (2018b) follow some physical constraints. One way to impose

energy stability is by pre-building the constitutive dynamics with a generalized extendable energy

functional structure. One possible formulation is the GENERIC formalism Grmela and Öttinger

(1997) governed by the coupling of a reversible and an irreversible process:

dX
d𝑡

L

𝛿𝑆
𝛿X

= L

+ M

𝛿𝑆
𝛿X

≡ 0

𝛿𝐸
𝛿X

𝛿𝐸
𝛿X

= M

74

(5.1)

(5.2)

where X : Ω → R𝑑 is the field variables, 𝐸 [X] = ∫
the entropy. 𝐿 is the Poisson matrix satisfied L = −L𝑇 , M is the friction matrix satisfied M ≻ 0.

ˆ𝐸 (X)dΩ is the energy, 𝑆[X] = ∫

ˆ𝑆(X)dΩ is

Ω

Ω

The degeneracy condition Eq. (5.2) ensures the energy conservation d𝐸/d𝑡 ≡ 0 and the entropy

production d𝑆/d𝑡 ≥ 0 and therefore the free energy stability.

Our main idea is to seek a generalized extendable energy variational form of our DeePN2 model

where X = (𝜌, u, c1, · · · , c𝑛). 𝜌 is the density, u is the velocity, {c𝑖}𝑛

𝑖=1 is our conformation tensor.

By doing so, we can use existing numerical schemes such as the scalar auxiliary variable approach

Shen et al. (2018) and invariant energy quadratization method Yang (2016) to ensure energy stability.

75

BIBLIOGRAPHY

Abrams, J. B. and Tuckerman, M. E. (2008). Efficient and direct generation of multidimensional
free energy surfaces via adiabatic dynamics without coordinate transformations. The Journal of
Physical Chemistry B, 112(49):15742–15757.

Allen, E. C. and Rutledge, G. C. (2008). A novel algorithm for creating coarse-grained, density

dependent implicit solvent models. The Journal of Chemical Physics, 128(15):154115.

Allen, E. C. and Rutledge, G. C. (2009). Evaluating the transferability of coarse-grained, density-
dependent implicit solvent models to mixtures and chains. The Journal of Chemical Physics,
130(3):034904.

Ayaz, C., Scalfi, L., Dalton, B. A., and Netz, R. R. (2022a). Generalized langevin equation with
a non-linear potential of mean force and non-linear memory friction from a hybrid projection
scheme. Physical Review E, 105:054138.

Ayaz, C., Tepper, L., Brünig, F. N., Kappler, J., Daldrop, J. O., and Netz, R. R. (2021). Non-
markovian modeling of protein folding. Proceedings of the National Academy of Sciences,
118(31):e2023856118.

Ayaz, C., Tepper, L., and Netz, R. R. (2022b). Markovian embedding of generalized langevin
equations with a nonlinear friction kernel and configuration-dependent mass. Turkish Journal of
Physics, 46(6):194 – 205.

Baczewski, A. D. and Bond, S. D. (2013). Numerical integration of the extended variable generalized
Langevin equation with a positive Prony representable memory kernel. The Journal of chemical
physics, 139(4):044107.

Bayly, C. I., Cieplak, P., Cornell, W., and Kollman, P. A. (1993). A well-behaved electrostatic
potential based method using charge restraints for deriving atomic charges: the resp model. The
Journal of Physical Chemistry, 97(40):10269–10280.

Berezhkovskii, A. and Szabo, A. (2011). Time scale separation leads to position-dependent diffusion

along a slow coordinate. The Journal of Chemical Physics, 135(7):074108.

Berkowitz, M., Morgan, J., and McCammon, J. A. (1983). Generalized Langevin dynamics

simulations with arbitrary time-dependent memory kernels. J. Chem. Phys., 78:3256.

Berne, B. J., Weeks, J. D., and Zhou, R. (2009). Dewetting and hydrophobic interaction in physical

and biological systems. Annual Review of Physical Chemistry, 60(1):85–103.

Berressem, F., Scherer, C., Andrienko, D., and Nikoubashman, A. (2021). Ultra-coarse-graining of ho-
mopolymers in inhomogeneous systems. Journal of Physics: Condensed Matter, 33(25):254002.

76

Best, R. B. and Hummer, G. (2006). Diffusive model of protein folding dynamics with kramers

turnover in rate. Phys. Rev. Lett., 96:228104.

Best, R. B. and Hummer, G. (2010). Coordinate-dependent diffusion in protein folding. Proceedings

of the National Academy of Sciences, 107(3):1088–1093.

Bird, R. B., Curtiss, C. F., Armstrong, R. C., and Hassager, O. (1987). Dynamics of Polymeric

Liquids, Volume 2: Kinetic Theory, 2nd Edition. Wiley, 2nd edition.

Bird, R. B., Dotson, P. J., and Johnson, N. (1980). Polymer solution rheology based on a finitely
extensible bead—spring chain model. Journal of Non-Newtonian Fluid Mechanics, 7(2):213 –
235.

Buff, F. P., Lovett, R. A., and Stillinger, F. H. (1965). Interfacial density profile for fluids in the

critical region. Phys. Rev. Lett., 15:621–623.

Carmeli, B. and Nitzan, A. (1983). Theory of activated rate processes: Position dependent friction.

Chemical Physics Letters, 102(6):517–522.

Ceriotti, M., Bussi, G., and Parrinello, M. (2009). Langevin equation with colored noise for
constant-temperature molecular dynamics simulations. Physical review letters, 102(2):020601.

Chan, H., Cherukara, M. J., Narayanan, B., Loeffler, T. D., Benmore, C., Gray, S. K., and
Sankaranarayanan, S. K. R. S. (2019). Machine learning coarse grained models for water. Nature
Communications, 10(1):379.

Chandler, D. (2005).
437(7059):640–647.

Interfaces and the driving force of hydrophobic assembly. Nature,

Cooley, J. W. and Tukey, J. W. (1965). An algorithm for the machine calculation of complex fourier

series. Mathematics of Computation, 19(90):297–301.

Cossio, P., Hummer, G., and Szabo, A. (2015). On artifacts in single-molecule force spectroscopy.

Proceedings of the National Academy of Sciences, 112(46):14248–14253.

Daldrop, J. O., Kowalik, B. G., and Netz, R. R. (2017). External potential modifies friction of

molecular solutes in water. Physical Review X, 7(4):041065.

Dalton, B. A., Ayaz, C., Kiefer, H., Klimek, A., Tepper, L., and Netz, R. R. (2023). Fast protein
folding is governed by memory-dependent friction. Proceedings of the National Academy of
Sciences, 120(31):e2220068120.

Darve, E. and Pohorille, A. (2001). Calculating free energies using average force. The Journal of

Chemical Physics, 115(20):9169–9183.

77

Darve, E., Solomon, J., and Kia, A. (2009). Computing generalized Langevin equations and

generalized fokker-planck equations. Proc. Natl. Acad. Sci., 106(27):10884–10889.

Das, A. and Andersen, H. C. (2010). The multiscale coarse-graining method. v. isothermal-isobaric

ensemble. The Journal of Chemical Physics, 132(16):164106.

Das, A. and Andersen, H. C. (2012). The multiscale coarse-graining method. ix. a general method
for construction of three body coarse-grained force fields. The Journal of chemical physics,
136(19):194114.

Davtyan, A., Dama, J. F., Voth, G. A., and Andersen, H. C. (2015). Dynamic force matching: A
method for constructing dynamical coarse-grained models with realistic time dependence. J.
Chem. Phys., 142.

DeLyser, M. and Noid, W. G. (2020). Bottom-up coarse-grained models for external fields and

interfaces. The Journal of Chemical Physics, 153(22):24103.

DeLyser, M. R. and Noid, W. G. (2017). Extending pressure-matching to inhomogeneous systems

via local-density potentials. The Journal of chemical physics, 147(13):134111.

DeLyser, M. R. and Noid, W. G. (2019). Analysis of local density potentials. The Journal of

Chemical Physics, 151(22):224106.

DeLyser, M. R. and Noid, W. G. (2022). Coarse-grained models for local density gradients. The

Journal of Chemical Physics, 156(3):034106.

Deutch, J. M. and Oppenheim, I. (1971). Molecular theory of brownian motion for several particles.

The Journal of Chemical Physics, 54(8):3547–3555.

Dinpajooh, M. and Guenza, M. G. (2017). On the density dependence of the integral equation
coarse-graining effective potential. The Journal of Physical Chemistry B, 122(13):3426–3440.

Doyle, P. S., Shaqfeh, E. S., McKinley, G. H., and Spiegelberg, S. H. (1998). Relaxation of
dilute polymer solutions following extensional flow. Journal of Non-Newtonian Fluid Mechanics,
76(1):79–110.

Dunn, N. J. H. and Noid, W. G. (2015). Bottom-up coarse-grained models that accurately describe
the structure, pressure, and compressibility of molecular liquids. The Journal of Chemical Physics,
143(24):243148.

Dunn, N. J. H. and Noid, W. G. (2016). Bottom-up coarse-grained models with predictive accuracy
and transferability for both structural and thermodynamic properties of heptane-toluene mixtures.
The Journal of Chemical Physics, 144(20):204124.

E, W., Han, J., and Zhang, L. (2021). Machine-learning-assisted modeling. Physics Today,

78

74(7):36–41.

E, W., Lei, H., Xie, P., and Zhang, L. (2023). Machine learning-assisted multi-scale modeling.

Journal of Mathematical Physics, 64(7):071101.

E, W. and Vanden-Eijnden, E. (2010). Transition-path theory and path-finding algorithms for the

study of rare events. Annual Review of Physical Chemistry, 61(1):391–420.

Empereur-mot, C., Capelli, R., Perrone, M., Caruso, C., Doni, G., and Pavan, G. M. (2022).
Automatic multi-objective optimization of coarse-grained lipid force fields using swarmcg. The
Journal of Chemical Physics, 156(2):024801.

Español, P. and Warren, P. (1995). Statistical mechanics of dissipative particle dynamics. Europhysics

Letters, 30(4):191–196.

Evans, R. (1979). The nature of the liquid-vapour interface and other topics in the statistical

mechanics of non-uniform, classical fluids. Advances in Physics, 28(2):143–200.

Fang, L., Ge, P., Zhang, L., E, W., and Lei, H. (2022). DeePN2: A deep learning-based non-newtonian

hydrodynamic model. Journal of Machine Learning, 1(1):114–140.

Feng, J., Chaubal, C. V., and Leal, L. G. (1998). Closure approximations for the doi theory:
Which to use in simulating complex flows of liquid-crystalline polymers? Journal of Rheology,
42(5):1095–1119.

Forest, G. M., Zhou, R., and Wang, Q. (2003). Full-tensor alignment criteria for sheared nematic

polymers. Journal of Rheology, 47(1):105–127.

Galvelis, R. and Sugita, Y. (2017). Neural network and nearest neighbor algorithms for enhancing
sampling of molecular dynamics. Journal of Chemical Theory and Computation, 13(6):2489–
2500.

Ge, P., Zhang, L., and Lei, H. (2023). Machine learning assisted coarse-grained molecular dynamics
modeling of meso-scale interfacial fluids. The Journal of Chemical Physics, 158(6). 064104.

Ge, P., Zhang, Z., and Lei, H. (2024). Data-driven learning of the generalized langevin equation

with state-dependent memory. Phys. Rev. Lett., 133:077301.

Giesekus, H. (1982). A simple constitutive equation for polymer fluids based on the concept of
deformation-dependent tensorial mobility. Journal of Non-Newtonian Fluid Mechanics, 11(1):69
– 109.

Glatzel, F. and Schilling, T. (2022). The interplay between memory and potentials of mean force: A
discussion on the structure of equations of motion for coarse-grained observables. Europhysics
Letters, 136(3):36001.

79

Grmela, M. and Öttinger, H. C. (1997). Dynamics and thermodynamics of complex fluids. i.

development of a general formalism. Physical Review E, 56(6):6620.

Grogan, F., Lei, H., Li, X., and Baker, N. A. (2020). Data-driven molecular modeling with the

generalized Langevin equation. J. Comput. Phys., 418:109633–109641.

Groot, R. D. and Warren, P. B. (1997). Dissipative particle dynamics: Bridging the gap between

atomistic and mesoscopic simulation. Journal of Chemical Physics, 107(11):4423–4435.

Grosso, M., Maffettone, P., Halin, P., Keunings, R., and Legat, V. (2000). Flow of nematic polymers
in eccentric cylinder geometry: influence of closure approximations. Journal of Non-Newtonian
Fluid Mechanics, 94(2):119–134.

Han, J., Ma, C., Ma, Z., and E, W. (2019a). Uniformly accurate machine learning-based hydrodynamic
models for kinetic equations. Proceedings of the National Academy of Sciences, 116(44):21983–
21991.

Han, J., Ma, C., Ma, Z., and E, W. (2019b). Uniformly accurate machine learning-based hydrodynamic
models for kinetic equations. Proceedings of the National Academy of Sciences, 116(44):21983–
21991.

Hänggi, P. (1997). Generalized langevin equations: A useful tool for the perplexed modeller
of nonequilibrium fluctuations? In Schimansky-Geier, L. and Pöschel, T., editors, Stochastic
Dynamics, pages 15–22, Berlin, Heidelberg. Springer Berlin Heidelberg.

Haynes, G. R., Voth, G. A., and Pollak, E. (1993). A theory for the thermally activated rate constant

in systems with spatially dependent friction. Chemical Physics Letters, 207(4):309–316.

Haynes, G. R., Voth, G. A., and Pollak, E. (1994). A theory for the activated barrier crossing rate
constant in systems influenced by space and time dependent friction. The Journal of Chemical
Physics, 101(9):7811–7822.

Hijón, C., Español, P., Vanden-Eijnden, E., and Delgado-Buscalioni, R. (2010). Mori–Zwanzig

formalism as a practical computational tool. Faraday discuss., 144:301–322.

Hinczewski, M., von Hansen, Y., Dzubiella, J., and Netz, R. R. (2010). How the diffusivity profile
reduces the arbitrariness of protein folding free energies. The Journal of Chemical Physics,
132(24):245103.

Hoogerbrugge, P. J. and Koelman, J. M. V. A. (1992). Simulating microscopic hydrodynamic

phenomena with dissipative particle dynamics. Europhys. Lett., 19(3):155–160.

Hoover, W. G. (1985). Canonical dynamics: equilibrium phase-space distributions. Physical Review

A, 31(3):1695–1697.

80

Huang, J., Ma, Z., Zhou, Y., and Yong, W.-A. (2021). Learning thermodynamically stable and galilean
invariant partial differential equations for non-equilibrium flows. Journal of Non-Equilibrium
Thermodynamics, 46(4):355–370.

Hulsen, M., van Heel, A., and van den Brule, B. (1997). Simulation of viscoelastic flows using
brownian configuration fields. Journal of Non-Newtonian Fluid Mechanics, 70(1):79 – 101.

Hummer, G., Garde, S., García, A. E., Paulaitis, M. E., and Pratt, L. R. (1998). Hydrophobic effects

on a molecular scale. The Journal of Physical Chemistry B, 102(51):10469–10482.

Hummer, G., Garde, S., García, A. E., Pohorille, A., and Pratt, L. R. (1996). An information
theory model of hydrophobic interactions. Proceedings of the National Academy of Sciences,
93(17):8951–8955.

Hyon, Y., Du, Q., and Liu, C. (2008). An enhanced macroscopic closure approximation to the micro-
macro fene model for polymeric materials. Multiscale Modeling & Simulation, 7(2):978–1002.

Izvekov, S., Chung, P. W., and Rice, B. M. (2010). The multiscale coarse-graining method: Assessing
its accuracy and introducing density dependent coarse-grain potentials. The Journal of Chemical
Physics, 133(6):064109.

Izvekov, S. and Voth, G. A. (2005). A multiscale coarse-graining method for biomolecular systems.

The Journal of Physical Chemistry B, 109(7):2469–2473.

Jin, J. and Voth, G. A. (2018). Ultra-coarse-grained models allow for an accurate and transferable
treatment of interfacial systems. Journal of Chemical Theory and Computation, 14(4):2180–2197.

John, S. T. and Csányi, G. (2017). Many-body coarse-grained interactions using gaussian approxi-

mation potentials. The Journal of Physical Chemistry B, 121(48):10934–10949.

Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W., and Klein, M. L. (1983).
Comparison of simple potential functions for simulating liquid water. The Journal of Chemical
Physics, 79(2):926–935.

Jung, B. and Jung, G. (2023). Dynamic coarse-graining of linear and non-linear systems:

Mori–Zwanzig formalism and beyond. The Journal of Chemical Physics, 159(8):084110.

Jung, G., Hanke, M., and Schmid, F. (2017). Iterative reconstruction of memory kernels. Journal of

Chemical Theory and Computation, 13(6):2481–2488.

Kingma, D. and Ba, J. (2015). Adam: A method for stochastic optimization.

International

Conference on Learning Representations (ICLR).

Klippenstein, V., Tripathy, M., Jung, G., Schmid, F., and van der Vegt, N. F. (2021). Introduc-
ing memory in coarse-grained molecular simulations. The Journal of Physical Chemistry B,

81

125(19):4931–4954.

Klippenstein, V. and van der Vegt, N. F. A. (2021). Cross-correlation corrected friction in

(generalized) langevin models. The Journal of Chemical Physics, 154(19):191102.

Koopman, B. O. (1931). Hamiltonian systems and transformation in Hilbert space. Proceedings of

the National Academy of Sciences, 17(5):315–318.

Kramers, H. (1940). Brownian motion in a field of force and the diffusion model of chemical

reactions. Physica, 7(4):284–304.

Krishnan, R., Singh, S., and Robinson, G. W. (1992). Space-dependent friction in the theory
of activated rate processes: The hamiltonian approach. The Journal of Chemical Physics,
97(8):5516–5521.

Kumar, S., Rosenberg, J. M., Bouzida, D., Swendsen, R. H., and Kollman, P. A. (1992a). The
weighted histogram analysis method for free-energy calculations on biomolecules. i. the method.
Journal of Computational Chemistry, 13(8):1011–1021.

Kumar, S., Rosenberg, J. M., Bouzida, D., Swendsen, R. H., and Kollman, P. A. (1992b). The
weighted histogram analysis method for free-energy calculations on biomolecules. I. The method.
Journal of Computational Chemistry, 13(8):1011–1021.

Laio, A. and Parrinello, M. (2002). Escaping free-energy minima. Proceedings of the National

Academy of Sciences, 99(20):12562–12566.

Landau, L. D. and Lifshitz, E. M. (1987). Fluid Mechanics, Second Edition: Volume 6 (Course of
Theoretical Physics). Course of theoretical physics / by L. D. Landau and E. M. Lifshitz, Vol. 6.
Butterworth-Heinemann, 2 edition.

Lange, O. F. and Grubmüller, H. (2006). Collective Langevin dynamics of conformational motions

in proteins. J. Chem. Phys., 124:214903.

Larini, L., Lu, L., and Voth, G. A. (2010). The multiscale coarse-graining method. vi. implementation
of three-body coarse-grained potentials. The Journal of Chemical Physics, 132(16):164107.

Larson, R. G. (1988). Constitutive Equations for Polymer Melts and Solutions. Butterworth-

Heinemann Press.

Laso, M. and Öttinger, H. C. (1993). Calculation of viscoelastic flow using molecular models: the

connffessit approach. Journal of Non-Newtonian Fluid Mechanics, 47:1 – 20.

Lee, H. S., Ahn, S.-H., and Darve, E. F. (2019). The multi-dimensional generalized Langevin
equation for conformational motion of proteins. The Journal of Chemical Physics, 150(17).

82

Lei, H., Baker, N. A., and Li, X. (2016). Data-driven parameterization of the generalized Langevin

equation. Proc. Natl. Acad. Sci., 113(50):14183–14188.

Lei, H., Caswell, B., and Karniadakis, G. E. (2010). Direct construction of mesoscopic models from

microscopic simulations. Phys. Rev. E, 81:026704.

Lei, H. and Li, X. (2021). Petrov–Galerkin methods for the construction of non-Markovian dynamics

preserving nonlocal statistics. The Journal of Chemical Physics, 154(18):184108.

Lei, H., Mundy, C. J., Schenter, G. K., and Voulgarakis, N. K. (2015). Modeling nanoscale
hydrodynamics by smoothed dissipative particle dynamics. J. Chem. Phys., 142(19):194504.

Lei, H., Wu, L., and E, W. (2020). Machine learning based non-newtonian fluid model with

molecular fidelity. Physics Review E, 102:043309.

Lemke, T. and Peter, C. (2017). Neural network based prediction of conformational free energies
Journal of Chemical Theory and

- a new route toward coarse-grained simulation models.
Computation, 13(12):6213–6221.

Li, Z., Kovachki, N. B., Azizzadenesheli, K., liu, B., Bhattacharya, K., Stuart, A., and Anandkumar,
A. (2021). Fourier neural operator for parametric partial differential equations. In International
Conference on Learning Representations.

Lielens, G., Halin, P., Jaumain, I., Keunings, R., and Legat, V. (1998). New closure approximations
for the kinetic theory of finitely extensible dumbbells. Journal of Non-Newtonian Fluid Mechanics,
76(1):249–279.

Lielens, G., Keunings, R., and Legat, V. (1999). The fene-l and fene-ls closure approximations to
the kinetic theory of finitely extensible dumbbells. Journal of Non-Newtonian Fluid Mechanics,
87(2):179 – 196.

Lin, F.-H., Liu, C., and Zhang, P. (2005). On hydrodynamics of viscoelastic fluids. Communications

on Pure and Applied Mathematics, 58(11):1437–1471.

Lu, L., Jin, P., Pang, G., Zhang, Z., and Karniadakis, G. E. (2021). Learning nonlinear operators via
deeponet based on the universal approximation theorem of operators. Nature machine intelligence,
3(3):218–229.

Lum, K., Chandler, D., and Weeks, J. D. (1999). Hydrophobicity at small and large length scales.

The Journal of Physical Chemistry B, 103(22):4570–4577.

Luo, G., Andricioaei, I., Xie, X. S., and Karplus, M. (2006). Dynamic distance disorder in proteins

is caused by trapping. The Journal of Physical Chemistry B, 110(19):9363–9367.

Lyu, L. and Lei, H. (2023). Construction of coarse-grained molecular dynamics with many-body

83

non-markovian memory. Phys. Rev. Lett., 131:177301.

Lyubartsev, A. P. and Laaksonen, A. (1995). Calculation of effective interaction potentials from
radial distribution functions: A reverse monte carlo approach. Phys. Rev. E, 52:3730–3737.

Ma, L., Li, X., and Liu, C. (2019). Coarse-graining langevin dynamics using reduced-order

techniques. Journal of Computational Physics, 380:170–190.

Maragliano, L. and Vanden-Eijnden, E. (2006). A temperature accelerated method for sampling free
energy and determining reaction pathways in rare events simulations. Chemical Physics Letters,
426(1):168–175.

Maragliano, L. and Vanden-Eijnden, E. (2008). Single-sweep methods for free energy calculations.

The Journal of Chemical Physics, 128(18):184110.

Martyna, G. J., Tobias, D. J., and Klein, M. L. (1994). Constant pressure molecular dynamics

algorithms. The Journal of Chemical Physics, 101(5):4177–4189.

Miller, T. F., Vanden-Eijnden, E., and Chandler, D. (2007). Solvent coarse-graining and the string
method applied to the hydrophobic collapse of a hydrated chain. Proceedings of the National
Academy of Sciences, 104(37):14559–14564.

Miyamoto, S. and Kollman Peter, A. (2004). Settle: An analytical version of the shake and rattle

algorithm for rigid water models. Journal of Computational Chemistry, 13(8):952–962.

Molinero, V. and Moore, E. B. (2009). Water modeled as an intermediate element between carbon

and silicon. The Journal of Physical Chemistry B, 113(13):4008–4016.

Mones, L., Bernstein, N., and Csányi, G. (2016). Exploration, sampling, and reconstruction of free
energy surfaces with gaussian process regression. Journal of Chemical Theory and Computation,
12(10):5100–5110.

Moore, J. D., Barnes, B. C., Izvekov, S., Lísal, M., Sellers, M. S., Taylor, D. E., and Brennan, J. K.
(2016). A coarse-grain force field for rdx: Density dependent and energy conserving. The Journal
of Chemical Physics, 144(10):104501.

Mori, H. (1965). Transport, collective motion, and Brownian motion. Progress of Theoretical

Physics, 33(3):423–455.

Morris, J. P., Fox, P. J., and Zhu, Y. (1997). Modeling low reynolds number incompressible flows

using sph. J. Comput. Phys., 136(1):214–226.

Morrone, J. A., Li, J., and Berne, B. J. (2012). Interplay between hydrodynamics and the free
energy surface in the assembly of nanoscale hydrophobes. The Journal of Physical Chemistry B,
116(1):378–389.

84

Murashima, T., Hagita, K., and Kawakatsu, T. (2018). Elongational viscosity of weakly entangled
polymer melt via coarse-grained molecular dynamics simulation. Nihon Reoroji Gakkaishi,
46(5):207–220.

Nicholson, D. A. and Rutledge, G. C. (2016). Molecular simulation of flow-enhanced nucleation in
n-eicosane melts under steady shear and uniaxial extension. The Journal of Chemical Physics,
145(24):244903.

Nielsen, S. O., Lopez, C. F., Srinivas, G., and Klein, M. L. (2004). Coarse grain models and the
computer simulation of soft materials. Journal of Physics: Condensed Matter, 16(15):R481–R512.

Noé, F., Tkatchenko, A., Müller, K.-R., and Clementi, C. (2020). Machine learning for molecular

simulation. Annual Review of Physical Chemistry, 71(1):361–390.

Noid, W. G. (2013). Perspective: coarse-grained models for biomolecular systems. J. Chem. Phys.,

139(9):090901.

Noid, W. G., Chu, J.-W., Ayton, G. S., Krishna, V., Izvekov, S., Voth, G. A., Das, A., and Andersen,
H. C. (2008). The multiscale coarse-graining method. I. a rigorous bridge between atomistic and
coarse-grained models. J. Chem. Phys., 128(24):244114.

Nosé, S. (1984). A molecular dynamics method for simulations in the canonical ensemble. Molecular

Physics, 52(2):255–268.

Ogorodnikov, V. A. and Prigarin, S. M. (1996). Numerical Modelling of Random Processes and

Fields: Algorithms and Applications. De Gruyter, Berlin, Boston.

Oldroyd, J. G. and Wilson, A. H. (1950). On the formulation of rheological equations of state.
Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences,
200(1063):523–541.

Owens, R. G. and Phillips, T. N. (2002). Computational Rheology. Imperial College Press.

Pagonabarraga, I. and Frenkel, D. (2001). Dissipative particle dynamics for interacting systems.

The Journal of Chemical Physics, 115(11):5015–5026.

Patel, A. J., Varilly, P., and Chandler, D. (2010). Fluctuations of water near extended hydrophobic

and hydrophilic surfaces. The Journal of Physical Chemistry B, 114(4):1632–1637.

Peterlin, A. (1966). Hydrodynamics of macromolecules in a velocity field with longitudinal gradient.

Journal of Polymer Science Part B: Polymer Letters, 4(4):287–291.

Plotkin, S. S. and Wolynes, P. G. (1998). Non-Markovian configurational diffusion and reaction

coordinates for protein folding. Phys. Rev. Lett., 80:5015–5018.

85

Posch, H. A., Balucani, U., and Vallauri, R. (1984). On the relative dynamics of pairs of atoms in

simple liquids. Physica A, 123:516–534.

Qin, T., Wu, K., and Xiu, D. (2019). Data driven governing equations approximation using deep

neural networks. Journal of Computational Physics, 395:620–635.

Raissi, M., Perdikaris, P., and Karniadakis, G. (2019a). Physics-informed neural networks: A
deep learning framework for solving forward and inverse problems involving nonlinear partial
differential equations. Journal of Computational Physics, 378:686–707.

Raissi, M., Perdikaris, P., and Karniadakis, G. E. (2019b). Physics-informed neural networks: A
deep learning framework for solving forward and inverse problems involving nonlinear partial
differential equations. Journal of Computational Physics, 378:686–707.

Rein ten Wolde, P., Sun, S. X., and Chandler, D. (2001). Model of a fluid at small and large length

scales and the hydrophobic effect. Phys. Rev. E, 65:011201.

Reith, D., Pütz, M., and Müller-Plathe, F. (2003). Deriving effective mesoscale potentials from

atomistic simulations. Journal of Computational Chemistry, 24(13):1624–1636.

Ren, W. and E, W. (2005). Heterogeneous multiscale method for the modeling of complex fluids

and micro-fluidics. Journal of Computational Physics, 204(1):1 – 26.

Rosso, L., Mináry, P., Zhu, Z., and Tuckerman, M. E. (2002). On the use of the adiabatic molecular
dynamics technique in the calculation of free energy profiles. The Journal of Chemical Physics,
116(11):4389–4402.

Rouse, P. E. (1953). A theory of the linear viscoelastic properties of dilute solutions of coiling

polymers. The Journal of Chemical Physics, 21(7):1272–1280.

Rowlinson, J. J. S. and Widom, B. (2002). Molecular theory of capillarity, volume 8. Courier Dover

Publications.

Rudd, R. E. and Broughton, J. Q. (1998). Coarse-grained molecular dynamics and the atomic limit

of finite elements. Phys. Rev. B, 58:R5893–R5896.

Rudy, S. H., Brunton, S. L., Proctor, J. L., and Kutz, J. N. (2017). Data-driven discovery of partial

differential equations. Science Advances, 3(4).

Russo, A., Durán-Olivencia, M. A., Kevrekidis, I. G., and Kalliadasis, S. (2019). Deep learning as
closure for irreversible processes: A data-driven generalized Langevin equation. arXiv preprint
arXiv:1903.09562.

Ryckaert, J.-P., Ciccotti, G., and Berendsen, H. J. (1977). Numerical integration of the cartesian
equations of motion of a system with constraints: molecular dynamics of n-alkanes. Journal of

86

Computational Physics, 23(3):327–341.

Sanyal, T. and Shell, M. S. (2016). Coarse-grained models using local-density potentials optimized
with the relative entropy: Application to implicit solvation. The Journal of chemical physics,
145(3):034109.

Sanyal, T. and Shell, M. S. (2018). Transferable coarse-grained models of liquid-liquid equilibrium
using local density potentials optimized with the relative entropy. The Journal of Physical
Chemistry B, 122(21, SI):5678–5693.

Satija, R., Das, A., and Makarov, D. E. (2017). Transition path times reveal memory effects
and anomalous diffusion in the dynamics of protein folding. The Journal of Chemical Physics,
147(15):152707.

Satija, R. and Makarov, D. E. (2019). Generalized langevin equation as a model for barrier crossing

dynamics in biomolecular folding. The Journal of Physical Chemistry B, 123(4):802–810.

Schädle, A., López-Fernández, M., and Lubich, C. (2006). Fast and oblivious convolution quadrature.

SIAM Journal on Scientific Computing, 28(2):421–438.

Schaeffer, H., Tran, G., and Ward, R. (2018). Extracting sparse high-dimensional dynamics from

limited data. SIAM Journal on Applied Mathematics, 78(6):3279–3295.

Schneider, E., Dai, L., Topper, R. Q., Drechsel-Grau, C., and Tuckerman, M. E. (2017). Stochastic
neural network approach for learning high-dimensional free energy surfaces. Phys. Rev. Lett.,
119:150601.

Serrano, M. and Español, P. (2001). Thermodynamically consistent mesoscopic fluid particle model.

Phys. Rev. E, 64:046115.

Seryo, N., Sato, T., Molina, J. J., and Taniguchi, T. (2020). Learning the constitutive relation of

polymeric flows with memory. Phys. Rev. Research, 2:033107.

Shahidi, N., Chazirakis, A., Harmandaris, V., and Doxastakis, M. (2020). Coarse-graining of
polyisoprene melts using inverse monte carlo and local density potentials. The Journal of
Chemical Physics, 152(12):124902.

She, Z., Ge, P., and Lei, H. (2023). Data-driven construction of stochastic reduced dynamics encoded

with non-markovian features. The Journal of Chemical Physics, 158(3):034102.

Shen, J., Xu, J., and Yang, J. (2018). The scalar auxiliary variable (sav) approach for gradient flows.

Journal of Computational Physics, 353:407–416.

Shinoda, W., DeVane, R., and Klein, M. L. (2008). Coarse-grained molecular modeling of non-ionic

surfactant self-assembly. Soft Matter, 4(12):2454–2462.

87

Singh, D., Mondal, K., and Chaudhury, S. (2021). Effect of memory and inertial contribution
on transition-time distributions: Theory and simulations. The Journal of Physical Chemistry B,
125(17):4536–4545.

Singh, S., Krishnan, R., and Robinson, G. (1990). Theory of activated rate processes with

space-dependent friction. Chemical Physics Letters, 175(4):338–342.

Smith, D. E., Babcock, H. P., and Chu, S. (1999). Single-polymer dynamics in steady shear flow.

Science, 283(5408):1724–1727.

Soper, A. (1996). Empirical potential monte carlo simulation of fluid structure. Chemical Physics,

202(2):295–306.

Stecher, T., Bernstein, N., and Csányi, G. (2014). Free energy surface reconstruction from umbrella
samples using gaussian process regression. Journal of Chemical Theory and Computation,
10(9):4079–4097.

Straub, J. E., Berne, B. J., and Roux, B. (1990). Spatial dependence of time-dependent friction for

pair diffusion in a simple fluid. The Journal of Chemical Physics, 93(9):6804–6812.

Straub, J. E., Borkovec, M., and Berne, B. J. (1987). Calculation of dynamic friction on intramolecular

degrees of freedom. The Journal of Physical Chemistry, 91(19):4995–4998.

Straub, J. E., Borkovec, M., and Berne, B. J. (1988). Molecular dynamics study of an isomerizing

diatomic in a lennard-jones fluid. The Journal of Chemical Physics, 89(8):4833–4847.

Straus, J. B., Gomez Llorente, J. M., and Voth, G. A. (1993). Manifestations of spatially dependent
friction in classical activated rate processes. The Journal of Chemical Physics, 98(5):4082–4097.

Tarjus, G. and Kivelson, D. (1991). Solvent effect on activated rate processes: On the validity of the

gle approach. Chemical Physics, 152(1):153–167.

Taylor, G. I. (1934). The formation of emulsions in definable fields of flow. Proceedings of the
Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character,
146(858):501–523.

Thien, N. P. and Tanner, R. I. (1977). A new constitutive equation derived from network theory.

Journal of Non-Newtonian Fluid Mechanics, 2(4):353–365.

Thomases, B. and Shelley, M. (2007). Emergence of singular structures in oldroyd-b fluids. Physics

of Fluids, 19(10):103103.

Torrie, G. and Valleau, J. (1977). Nonphysical sampling distributions in monte carlo free-energy

estimation: Umbrella sampling. Journal of Computational Physics, 23(2):187–199.

88

Voth, G. A. (1992). A theory for treating spatially-dependent friction in classical activated rate

processes. The Journal of Chemical Physics, 97(8):5908–5910.

Vroylandt, H. (2022). On the derivation of the generalized langevin equation and the fluctuation-

dissipation theorem. Europhysics Letters, 140(6):62003.

Vroylandt, H., Goudenège, L., Monmarché, P., Pietrucci, F., and Rotenberg, B. (2022). Likelihood-
based non-markovian models from molecular dynamics. Proceedings of the National Academy
of Sciences, 119(13):e2117586119.

Vroylandt, H. and Monmarché, P. (2022). Position-dependent memory kernel in generalized Langevin
equations: Theory and numerical estimation. The Journal of Chemical Physics, 156(24):244105.

Wagner, J. W., Dannenhoffer-Lafage, T., Jin, J., and Voth, G. A. (2017). Extending the range
and physical accuracy of coarse-grained models: Order parameter dependent interactions. The
Journal of chemical physics, 147(4):044113.

Wang, J., Olsson, S., Wehmeyer, C., Pérez, A., Charron, N. E., de Fabritiis, G., Noé, F., and Clementi,
C. (2019). Machine learning of coarse-grained molecular dynamics force fields. ACS Central
Science, 5(5):755–767.

Wang, J., Wolf, R. M., Caldwell, J. W., Kollman, P. A., and Case, D. A. (2004). Development and
testing of a general amber force field. Journal of Computational Chemistry, 25(9):1157–1174.

Wang, Q. (1997). Comparative studies on closure approximations in flows of liquid crystal polymers:

I. elongational flows. Journal of Non-Newtonian Fluid Mechanics, 72(2):141 – 162.

Wang, S., Ma, Z., and Pan, W. (2020). Data-driven coarse-grained modeling of polymers in solution

with structural and dynamic properties conserved. Soft Matter, 16(36):8330–8344.

Warner, H. R. (1972a). Kinetic theory and rheology of dilute suspensions of finitely extendible

dumbbells. Industrial & Engineering Chemistry Fundamentals, 11(3):379–387.

Warner, H. R. (1972b). Kinetic theory and rheology of dilute suspensions of finitely extendible

dumbbells. Industrial & Engineering Chemistry Fundamentals, 11(3):379–387.

Willard, A. P. and Chandler, D. (2010). Instantaneous liquid interfaces. The Journal of Physical

Chemistry B, 114(5):1954–1958.

Womersley, J. R. (1955). Method for the calculation of velocity, rate of flow and viscous drag in
arteries when the pressure gradient is known. The Journal of Physiology, 127(3):553–563.

Xie, P., Car, R., and E, W. (2022). Ab Initio Generalized Langevin Equations. arXiv preprint

arXiv:2211.06558.

89

Yang, X. (2016). Linear, first and second-order, unconditionally energy stable numerical schemes for
the phase field model of homopolymer blends. Journal of Computational Physics, 327:294–316.

Yu, P., Du, Q., and Liu, C. (2005). From micro to macro dynamics via a new closure approximation

to the fene model of polymeric fluids. Multiscale Modeling & Simulation, 3(4):895–917.

Zaremba, S. (1903). Sur une forme perfectionee de la theorie de la relaxation. Bull. Int. Acad. Sci.

Cracovie, pages 594–614.

Zavadlav, J., Marrink, S. J., and Praprotnik, M. (2018). Multiscale simulation of protein hydration
using the swinger dynamical clustering algorithm. Journal of Chemical Theory and Computation,
14(3):1754–1761.

Zhang, L., Han, J., Wang, H., Car, R., and E, W. (2018a). Deep potential molecular dynamics: A

scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett., 120:143001.

Zhang, L., Han, J., Wang, H., Car, R., and E, W. (2018b). Deepcg: Constructing coarse-grained

models via deep neural networks. The Journal of Chemical Physics, 149(3):034101.

Zhang, L., Han, J., Wang, H., Saidi, W., Car, R., and E, W. (2018c). End-to-end symmetry preserving
inter-atomic potential energy model for finite and extended systems. In Bengio, S., Wallach, H.,
Larochelle, H., Grauman, K., Cesa-Bianchi, N., and Garnett, R., editors, Advances in Neural
Information Processing Systems 31, pages 4436–4446. Curran Associates, Inc.

Zhang, L., Wang, H., and E, W. (2018d). Reinforced dynamics for enhanced sampling in large

atomic and molecular systems. The Journal of Chemical Physics, 148(12):124113.

Zhang, Y., Gao, C., Liu, Q., Zhang, L., Wang, H., and Chen, M. (2020). Warm dense matter
simulation via electron temperature dependent deep potential molecular dynamics. Physics of
Plasmas, 27(12):122704.

Zhao, L., Li, Z., Caswell, B., Ouyang, J., and Karniadakis, G. E. (2018). Active learning of
constitutive relation from mesoscopic dynamics for macroscopic modeling of non-newtonian
flows. Journal of Computational Physics, 363:116 – 127.

Zhou, X.-H., Han, J., and Xiao, H. (2021). Frame-independent vector-cloud neural network for
nonlocal constitutive modeling on arbitrary grids. Computer Methods in Applied Mechanics and
Engineering.

Zhu, Y. and Venturi, D. (2020). Generalized langevin equations for systems with local interactions.

Journal of Statistical Physics.

Zwanzig, R. (1961). Statistical mechanics of irreversiblity. Lectures in Theoretical Physics,

3:106–141.

90

Zwanzig, R. (1973). Nonlinear generalized Langevin equations. J. Stat. Phys., 9:215 – 220.

Zwanzig, R. (1992). Diffusion past an entropy barrier. The Journal of Physical Chemistry,

96(10):3926–3930.

Zwanzig, R. (2001). Nonequilibrium Statistical Mechanics. Oxford University Press.

91