Bayesian Learning for High-Dimensional Nonlinear Dynamical Systems: Methodologies, Numerics and Applications to Fluid Flows

Lin, J., 2020. Bayesian Learning for High-Dimensional Nonlinear Dynamical Systems: Methodologies, Numerics and Applications to Fluid Flows. PhD thesis, Massachusetts Institute of Technology, Department of Mechanical Engineering, September 2020.

Rapidly growing computational power and the increasing capability of uncertainty quantification, statistical inference, and machine learning have opened up new opportunities for utilizing data to assist, identify, and refine physical models. In this thesis, we focus on Bayesian learning for a particular class of models: high-dimensional nonlinear dynamical systems, which are commonly used to predict a wide range of transient phenomena including fluid flows, heat transfer, biogeochemical dynamics, and other advection-diffusion-reaction-based transport processes. Even though such models often express the differential form of fundamental laws, they commonly contain uncertainty in their initial and boundary values, parameters, forcing, and even formulation. Learning such components from sparse observation data by principled Bayesian inference is very challenging due to the systems’ high dimensionality and nonlinearity.

To address this challenge, we systematically study the theoretical and algorithmic properties of a Bayesian learning methodology built upon previous efforts in our group. The study is organized around the three hierarchical components of Bayesian learning, and we develop new numerical schemes for each. The first component is uncertainty quantification for stochastic dynamical systems and fluid flows. We study dynamic low-rank approximations using the dynamically orthogonal (DO) equations, including their accuracy and computational costs, and develop new numerical schemes for re-orthonormalization, adaptive subspace augmentation, residual-driven closure, and stochastic Navier-Stokes integration. The second component is Bayesian data assimilation, where we study the properties of, and connections among, the different families of nonlinear and non-Gaussian filters. We derive an ensemble square-root filter based on minimal-correction second-moment matching that works especially well under the adverse conditions of small ensemble size, sparse observations, and chaotic dynamics. We also obtain a localization technique for filtering with high-dimensional systems that can be applied to nonlinear non-Gaussian inference with both brute-force Monte Carlo (MC) and reduced subspace modeling in a unified way. Furthermore, we develop a mutual-information-based adaptive sampling strategy for filtering to identify the most informative observations with respect to the state variables and/or parameters, utilizing the sub-modularity of mutual information due to the conditional independence of observation noise. The third component is active Bayesian model learning, where we have a discrete set of candidate dynamical models and we infer the model formulation that best explains the data using principled Bayesian learning. To predict the observations that are most useful for learning the model formulation, we further extend the above adaptive sampling strategy to identify the data that are expected to be most informative with respect to both state variables and the uncertain model identity.
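As a minimal illustration of inference over a discrete set of candidate models, the sketch below applies Bayes' rule to turn prior model probabilities and per-model data log-likelihoods into posterior model probabilities. It is not the thesis's algorithm: the function name `update_model_posterior` and the example numbers are hypothetical, and each model's log-likelihood is assumed to be supplied by some filter run under that model.

```python
import math


def update_model_posterior(prior, log_likelihoods):
    """Bayes' rule over a discrete set of candidate models.

    prior            -- prior probability of each candidate model
    log_likelihoods  -- log p(data | model) for each candidate model
    Assumes at least one model has a positive prior and a finite log-likelihood.
    """
    # Work in log space so that very small likelihoods do not underflow.
    log_post = [
        (math.log(p) if p > 0.0 else float("-inf")) + ll
        for p, ll in zip(prior, log_likelihoods)
    ]
    # Subtract the largest value before exponentiating (log-sum-exp trick).
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: two equally likely models; the data are
# three times more likely under the second model than under the first.
prior = [0.5, 0.5]
log_likelihoods = [0.0, math.log(3.0)]
posterior = update_model_posterior(prior, log_likelihoods)
print(posterior)
```

Applied sequentially, the posterior from one batch of observations would serve as the prior for the next, which is one simple way to picture how model identity can be learned as data arrive.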

To investigate and showcase the effectiveness and efficiency of our theoretical and numerical advances for uncertainty quantification, Bayesian data assimilation, and active Bayesian learning with stochastic nonlinear high-dimensional dynamical systems, we apply our dynamic data-driven reduced subspace approach to several dynamical systems and compare our results against those of brute-force MC and other existing methods. Specifically, we analyze our advances using several drastically different dynamical regimes modeled by the nonlinear Lorenz-96 ordinary differential equations as well as turbulent bottom gravity current dynamics modeled by the 2-D unsteady incompressible Reynolds-averaged Navier-Stokes (RANS) partial differential equations. We compare the accuracy, efficiency, and robustness of different methodologies and algorithms. With the Lorenz-96 system, we show how the performance differs under periodic, weakly chaotic, and strongly chaotic dynamics and under different observation layouts. With the bottom gravity current dynamics, we show how model parameters, domain geometries, initial fields, and boundary forcing formulations can be identified and how the Bayesian methodology performs when the candidate model space does not contain the true model. The results indicate that our active Bayesian learning framework can better infer the state variables and dynamical model identity with fewer observations than many alternative approaches in the literature.
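For readers unfamiliar with the Lorenz-96 test system mentioned above, the following sketch integrates its standard form, dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F with cyclic indices, using a classical fourth-order Runge-Kutta step. The function names, the 40-variable state, the forcing F = 8 (a commonly used chaotic setting), and the time step are assumptions chosen for illustration and are not taken from the thesis.

```python
def lorenz96_tendency(x, forcing):
    """Right-hand side of the Lorenz-96 equations with cyclic indexing."""
    n = len(x)
    return [
        (x[(i + 1) % n] - x[(i - 2) % n]) * x[(i - 1) % n] - x[i] + forcing
        for i in range(n)
    ]


def rk4_step(x, forcing, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency([xi + 0.5 * dt * k for xi, k in zip(x, k1)], forcing)
    k3 = lorenz96_tendency([xi + 0.5 * dt * k for xi, k in zip(x, k2)], forcing)
    k4 = lorenz96_tendency([xi + dt * k for xi, k in zip(x, k3)], forcing)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


# Illustrative settings (assumed, not from the thesis).
FORCING = 8.0
DT = 0.01

# Start from the uniform fixed point x_i = F and perturb one variable.
state = [FORCING] * 40
state[0] += 0.01
for _ in range(100):
    state = rk4_step(state, FORCING, DT)
```

Changing the forcing value changes the character of the dynamics, which is why a single system of this kind can serve as a test bed for several different dynamical regimes.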