Machine Learning-based Stochastic Reduced Modeling of GLE and State-dependent-GLE
Predictive modeling of high-dimensional dynamical systems remains a central challenge in numerous scientific fields, including biology, materials science, and fluid mechanics. When clear scale separation is lacking, a reduced model must accurately capture the pronounced memory effects arising from unresolved variables, making non-Markovian modeling essential. In this thesis, we develop and analyze data-driven methods for constructing generalized Langevin equations (GLEs) and extended stochastic differential equations that faithfully encode non-Markovian behavior.

Building on the Mori–Zwanzig formalism, we first propose an approach to learn a set of non-Markovian features, auxiliary variables that incorporate the history of the resolved coordinates, so that the effective dynamics inherits long-time correlations. By matching the evolution of correlation functions in the extended variable space, our method systematically approximates the multi-dimensional GLE without requiring direct estimates of complicated memory kernels. We show that this approach yields stable, high-fidelity reduced models for molecular systems, enabling significantly lower-dimensional simulations that nonetheless reproduce key statistical and dynamical properties of the original system.

We then extend this framework to incorporate state-dependent memory kernels, facilitating enhanced sampling across diverse regions of phase space. We demonstrate that constructing heterogeneous memory kernels, which reflect local variations in the unresolved degrees of freedom, improves the model's accuracy and robustness, especially in systems exhibiting multiple metastable states.
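The extended-variable construction described above can be summarized in standard GLE notation (a sketch only; the symbols $M$, $U$, $\theta$, $\Lambda$, $\Gamma$, $\Sigma$ are generic conventions, not the thesis's own notation):

```latex
% Reduced dynamics of resolved coordinates (q, v) with memory kernel \theta:
M \dot{v}(t) = -\nabla U\bigl(q(t)\bigr)
  - \int_0^{t} \theta(t-s)\, v(s)\,\mathrm{d}s + R(t)

% Markovian embedding: auxiliary (non-Markovian feature) variables z
% carry the history of v, so the kernel never needs direct estimation:
\dot{q} = v, \qquad
M \dot{v} = -\nabla U(q) + \Lambda^{\mathsf T} z, \qquad
\dot{z} = -\Gamma z - \Lambda v + \Sigma\, \dot{W}(t)
```

Formally eliminating $z$ from the extended system recovers a kernel of the form $\theta(t) = \Lambda^{\mathsf T} e^{-\Gamma t} \Lambda$, which is how matching correlation dynamics in the extended space implicitly fits the memory.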
Through both numerical experiments and theoretical analysis, we highlight how these data-driven non-Markovian models outperform traditional Markovian or fixed-memory approaches.

To address complex, multi-modal distributions in high-dimensional data, we then change the latent distribution of a KRNet normalizing-flow architecture from a single Gaussian to a mixture of Gaussians (MoG). This richer latent representation not only improves the model's expressiveness and training stability but also facilitates the discovery of collective variables (CVs), as the multi-modal latent space reveals distinct modes corresponding to relevant metastable states or slow degrees of freedom. Through both numerical experiments and theoretical analysis, we show that integrating a MoG prior into KRNet yields superior density estimation, enhanced sampling of metastable basins, and a more interpretable set of learned CVs.

Altogether, this thesis provides a comprehensive methodology for deriving scalable, memory-embedded reduced dynamics augmented by advanced latent representations. Such models open new possibilities for multi-scale simulations by merging fine-grained molecular fidelity with tractable coarse-grained representations, while systematically leveraging multi-modal latent spaces to identify key low-dimensional features. Our results underscore the practical advantages of incorporating non-Markovian features and a mixture-based flow model in capturing the full complexity of real-world molecular and dynamical systems.
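As a toy illustration of the mixture-of-Gaussians latent prior, the sketch below evaluates the exact log-density of a one-dimensional affine flow with a bimodal MoG prior via the change-of-variables formula. All names and parameters here are hypothetical; KRNet itself is a much richer invertible map, but the density bookkeeping is the same.

```python
import numpy as np

def mog_log_prob(z, weights, means, stds):
    """Log-density of a 1D mixture of Gaussians evaluated at points z."""
    z = np.atleast_1d(z)[:, None]                       # shape (N, 1)
    comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi * stds**2)
        - 0.5 * ((z - means) / stds) ** 2
    )                                                   # shape (N, K)
    return np.logaddexp.reduce(comp, axis=1)            # log-sum-exp over K

def flow_log_prob(x, scale, shift, weights, means, stds):
    """Density of x under the affine flow x = scale * z + shift with an
    MoG latent prior, via the change-of-variables formula."""
    z = (x - shift) / scale                             # inverse map
    log_det = -np.log(np.abs(scale))                    # log |dz/dx|
    return mog_log_prob(z, weights, means, stds) + log_det

# Bimodal latent prior: two equal-weight Gaussians at +/- 2
w = np.array([0.5, 0.5])
mu = np.array([-2.0, 2.0])
sd = np.array([0.5, 0.5])
lp = flow_log_prob(np.array([0.0, 2.0]), scale=1.0, shift=0.0,
                   weights=w, means=mu, stds=sd)
```

The log-sum-exp reduction keeps the mixture evaluation numerically stable, and the same Jacobian correction applies unchanged when the affine map is replaced by a deep invertible network; only the latent log-density changes when swapping the single Gaussian for the MoG.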
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: She, Zhiyuan
- Thesis Advisors: Lei, Huan
- Committee Members: Xiao, Yimin; Liu, Di; Murillo, Michael
- Date Published: 2025
- Subjects: Mathematics
- Degree Level: Doctoral
- Language: English
- Pages: 76 pages
- Permalink: https://doi.org/doi:10.25335/ws7w-zn88