Session: Methods for Uncertainty Quantification, Sensitivity Analysis, and Prediction
Paper Number: 158333
158333 - Two-Step Models for Data-Driven Closures in Sparsely Observed Dynamical Systems
Abstract:
Many scientific problems involve sparse, possibly noisy observations of a dynamical system based on partially known dynamics. Such problems are typically modeled using closure models based on intimate knowledge of the underlying process and the approximations made in the known dynamics. Inspired by recent progress in machine learning, data-driven closure models (DDCMs) seek to estimate the closure term using nonparametric function estimation based on neural networks or penalized regression models (e.g., LASSO). A limitation of these approaches is their lack of uncertainty quantification. In this work we propose a two-step approach to DDCMs that quantifies model uncertainty. First, we use any data assimilation (DA) method to estimate the closure term from sparse observations. Second, using the data-assimilated quantities, we apply a fully Bayesian dynamic discovery method similar to SINDy to learn the closure term. The DA stage may be omitted if dense observations of the system are available, in which case dynamic discovery is carried out directly on the observations. When observations are sparse in time, DA is necessary to obtain a dense time series of the state of the system.
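As a rough illustration of the second step in the dense-observation case, the sketch below recovers a closure term for a Lotka-Volterra system. It uses sequential thresholded least squares, the non-Bayesian regression associated with standard SINDy, as a stand-in for the fully Bayesian model described in the abstract, so it yields point estimates without uncertainty. The parameter values, the candidate library, the threshold, and the split into "known" linear terms and an unknown interaction closure are all assumptions made for the example.

```python
import numpy as np

def lotka_volterra(state, a=1.0, b=0.5, c=1.0, d=0.25):
    # Full "true" dynamics; the interaction terms play the role of the closure.
    x, y = state
    return np.array([a * x - b * x * y, -c * y + d * x * y])

def rk4_path(f, x0, dt, n_steps):
    # Classical fourth-order Runge-Kutta integration on a fixed grid.
    path = np.empty((n_steps + 1, len(x0)))
    path[0] = x0
    for i in range(n_steps):
        s = path[i]
        k1 = f(s)
        k2 = f(s + 0.5 * dt * k1)
        k3 = f(s + 0.5 * dt * k2)
        k4 = f(s + dt * k3)
        path[i + 1] = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return path

def library(X):
    # Hypothetical candidate library: polynomials up to degree 2.
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def stlsq(theta, target, threshold=0.05, n_iter=10):
    # Sequential thresholded least squares: fit, zero small terms, refit.
    coef = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        for j in range(target.shape[1]):
            keep = ~small[:, j]
            if keep.any():
                coef[keep, j] = np.linalg.lstsq(
                    theta[:, keep], target[:, j], rcond=None
                )[0]
    return coef

dt = 0.002
X = rk4_path(lotka_volterra, np.array([2.0, 1.0]), dt, 10000)
dXdt = np.gradient(X, dt, axis=0)

# Drop the endpoints, where np.gradient falls back to one-sided differences.
X, dXdt = X[1:-1], dXdt[1:-1]

# Assumed known part of the dynamics: linear growth and decay (a = c = 1).
known = np.column_stack([1.0 * X[:, 0], -1.0 * X[:, 1]])

# Regress only the unexplained residual (the closure) on the library.
coef = stlsq(library(X), dXdt - known)
```

Rows of `coef` follow the library order (1, x, y, x^2, xy, y^2) and columns correspond to the two state equations. With noise-free, dense data the only surviving terms should be the xy interactions; with noisy or sparse data, finite-difference derivatives degrade quickly, which is the situation the DA step and the latent-process formulation are meant to address.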
The DA step may be carried out using one method or an ensemble of methods. Ensembles of DA methods may be estimated independently and combined using Bayesian model averaging or stacking to improve predictive performance on the original process. The dynamic discovery step uses a hierarchical Bayesian model with a smooth latent process constructed via smoothing splines to learn the full dynamics of the problem or a restricted subset of parameters (the closure term). Importantly, this allows us to compute gradients with respect to the latent process, removing the need to approximate derivatives from the noisy observation layer of the model. As with SINDy, the method requires strong prior knowledge of the form of the dynamics in order to choose a nonlinear transformation of the state vector.
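The advantage of differentiating a smooth latent process rather than the noisy observations can be seen in a small deterministic analogue. The sketch below fits a cubic smoothing spline with SciPy's `UnivariateSpline` and compares its derivative with finite differences taken directly on the noisy data. This is not the hierarchical Bayesian model from the abstract: the test signal, the noise level, and the choice of smoothing factor `s = n * sigma**2` (which assumes the noise standard deviation is known) are assumptions for illustration only.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)

# Hypothetical noisy observations of a smooth signal.
n, sigma = 200, 0.05
t = np.linspace(0.0, 2.0 * np.pi, n)
y_obs = np.sin(t) + sigma * rng.standard_normal(n)

# Smooth "latent" curve: a cubic smoothing spline fitted to the observations.
spline = UnivariateSpline(t, y_obs, k=3, s=n * sigma**2)

# Derivative of the latent curve versus finite differences on the raw data.
dy_spline = spline.derivative()(t)
dy_fd = np.gradient(y_obs, t)

dy_true = np.cos(t)
rmse_spline = np.sqrt(np.mean((dy_spline - dy_true) ** 2))
rmse_fd = np.sqrt(np.mean((dy_fd - dy_true) ** 2))
print(f"RMSE spline derivative: {rmse_spline:.3f}")
print(f"RMSE finite difference: {rmse_fd:.3f}")
```

Finite differences divide the observation noise by the time step, so their error grows as the sampling grid is refined, whereas the spline derivative is computed analytically from the fitted curve.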
The proposed method is demonstrated on benchmarks with relevance to chaotic systems and ecological models, including the Lorenz-63 chaotic attractor and the Lotka-Volterra predator-prey model. We assess the method's sensitivity to the level of sparsity in the observations and to the quality of the DA method used. We also compare the proposed method to other DDCMs, including the popular SINDy method, Gaussian process regression, and neural network-based approaches. These results demonstrate the potential of this method for learning closure terms in many problems where the underlying dynamics are driven by an ODE.
Because the latent process is represented with splines, the method does not readily generalize to systems governed by PDEs or to other spatial and spatiotemporal problems. Further work is needed to extend it to these domains in a computationally scalable manner.
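For readers who want to set up a comparable sparsity experiment, the sketch below generates a Lorenz-63 trajectory and subsamples it at several observation strides with additive Gaussian noise. The system parameters are the classical values (sigma = 10, rho = 28, beta = 8/3); the initial condition, time step, strides, and noise level are assumptions and are not taken from the presentation.

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # Classical Lorenz-63 vector field.
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_path(f, x0, dt, n_steps):
    # Classical fourth-order Runge-Kutta integration on a fixed grid.
    path = np.empty((n_steps + 1, len(x0)))
    path[0] = x0
    for i in range(n_steps):
        s = path[i]
        k1 = f(s)
        k2 = f(s + 0.5 * dt * k1)
        k3 = f(s + 0.5 * dt * k2)
        k4 = f(s + dt * k3)
        path[i + 1] = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return path

def sparse_noisy_obs(path, stride, noise_sd, rng):
    # Keep every `stride`-th state and add independent Gaussian noise.
    idx = np.arange(0, path.shape[0], stride)
    noise = noise_sd * rng.standard_normal((idx.size, path.shape[1]))
    return idx, path[idx] + noise

rng = np.random.default_rng(1)
dt = 0.01
path = rk4_path(lorenz63, np.array([1.0, 1.0, 1.0]), dt, 2000)

# Hypothetical sparsity levels: dense, every 10th, and every 50th step.
obs_by_stride = {
    stride: sparse_noisy_obs(path, stride, noise_sd=1.0, rng=rng)
    for stride in (1, 10, 50)
}
```

Each entry of `obs_by_stride` holds the observed time indices and the corresponding noisy states, which could then be passed to a DA method of choice to reconstruct a dense state estimate.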
Our contributions are the following:
* A two-step method for data-driven closure models that includes error
propagation in each step
* Restricting the dynamic discovery step to a subset of parameters, allowing for
closure modeling as a special case of the full dynamics
* Ensemble methods for data assimilation and dynamic discovery which potentially
improve the predictive performance of the model on the original process
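One simple way to combine independently fitted DA estimates is to weight each by its predictive performance on held-out observations. The sketch below computes weights proportional to the exponentiated held-out Gaussian log predictive density, in the spirit of pseudo-BMA weighting. It is neither full Bayesian model averaging nor stacking as used in the work, and the two "DA estimates", the noise level, and the function names are hypothetical.

```python
import numpy as np

def gaussian_log_density(y, mean, sd):
    # Pointwise log density of y under N(mean, sd^2).
    return -0.5 * np.log(2.0 * np.pi * sd**2) - 0.5 * ((y - mean) / sd) ** 2

def predictive_weights(y_holdout, means, sds):
    # Weights proportional to exp(total held-out log predictive density).
    scores = np.array([
        gaussian_log_density(y_holdout, m, s).sum() for m, s in zip(means, sds)
    ])
    scores -= scores.max()  # stabilize the exponentials
    w = np.exp(scores)
    return w / w.sum()

rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 50)
truth = np.sin(t)
y_holdout = truth + 0.1 * rng.standard_normal(t.size)

# Two hypothetical DA estimates: one nearly unbiased, one with a larger bias.
means = [truth + 0.02, truth + 0.3]
sds = [0.1, 0.1]

weights = predictive_weights(y_holdout, means, sds)
combined = sum(w * m for w, m in zip(weights, means))
```

Weights of this kind tend to concentrate on a single model as the amount of held-out data grows; stacking instead optimizes the weights for the predictive performance of the combined estimate.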
Funding: This work was supported by Sandia National Laboratories.
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. This presentation describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the presentation do not necessarily represent the views of the U.S. Department of Energy or the United States Government.
Presenting Author: Daniel Drennan Texas A&M University
Presenting Author Biography: Dan Drennan is a graduate student studying statistics at Texas A&M University under the supervision of Toryn Schafer and Matthias Katzfuss. His work focuses on developing Bayesian models of spatial and dynamic processes with applications to climate systems.
Authors:
Daniel Drennan Texas A&M University
Toryn Schafer Texas A&M University
Rileigh Bandy Sandia National Laboratories
Teresa Portone Sandia National Laboratories
Moe Khalil Sandia National Laboratories
Kyle Neal Sandia National Laboratories
Paper Type
Technical Presentation Only