A Gaussian Mixture Model Smoother for Continuous Nonlinear Stochastic Dynamical Systems: Applications

Lolla, T. and P.F.J. Lermusiaux, 2017b. A Gaussian Mixture Model Smoother for Continuous Nonlinear Stochastic Dynamical Systems: Applications. Monthly Weather Review. doi:10.1175/MWR-D-16-0065.1.

The Gaussian–Mixture–Model Dynamically–Orthogonal (GMM–DO) smoother is exemplified and contrasted with other smoothers by applications to three dynamical systems, all of which admit far–from–Gaussian statistics. A double–well–diffusion experiment is first used to examine the capabilities of the smoother and compare its performance to that of the Ensemble Kalman Smoother. A passive tracer advected by a reversible shear flow is then employed. The exact smoothed solution is obtained and utilized to validate the GMM–DO smoother and its results. Finally, the third example illustrates the applicability of the smoother in more complex ocean flows consisting of variable jets and eddies. To illustrate the non-Gaussian effects, comparisons are then made with the update of the Error Subspace Statistical Estimation smoother. In each application, the properties of the GMM–DO smoother and of its posterior probabilities are studied and quantified. Rigorous evaluation of Bayesian smoothers for nonlinear high-dimensional dynamical systems is challenging in itself. The present three dynamical system examples provide complementary and effective benchmarks for such evaluation.

A Gaussian Mixture Model Smoother for Continuous Nonlinear Stochastic Dynamical Systems: Theory and Scheme

Lolla, T. and P.F.J. Lermusiaux, 2017a. A Gaussian Mixture Model Smoother for Continuous Nonlinear Stochastic Dynamical Systems: Theory and Scheme. Monthly Weather Review. doi:10.1175/MWR-D-16-0064.1.

Retrospective inference through Bayesian smoothing is indispensable in geophysics, with crucial applications in ocean estimation, numerical weather prediction, climate dynamics and Earth system modeling. However, dealing with the high–dimensionality and nonlinearity of geophysical processes remains a major challenge in the development of Bayesian smoothers. Addressing this issue, we obtain a novel smoothing methodology for high–dimensional stochastic fields governed by general nonlinear dynamics. Building on recent Bayesian filters and classic Kalman smoothers, the equations and forward–backward algorithm of the new smoother are derived. The smoother uses the stochastic Dynamically–Orthogonal (DO) field equations and their time–evolving stochastic subspace to predict the prior probabilities. Bayesian inference, both forward and backward in time, is then analytically carried out in the dominant DO subspace, after fitting semi–parametric Gaussian Mixture Models (GMMs) to joint DO realizations. The theoretical properties and computational cost of the new GMM–DO smoother are presented and discussed.

Validation of Genetic Algorithm Based Optimal Sampling for Ocean Data Assimilation

Heaney, K. D., P. F. J. Lermusiaux, T. F. Duda and P. J. Haley Jr., 2016. Validation of Genetic Algorithm Based Optimal Sampling for Ocean Data Assimilation. Ocean Dynamics. 66: 1209-1229. doi:10.1007/s10236-016-0976-5.

Regional ocean models are capable of forecasting conditions for usefully long intervals of time (days) provided that initial and ongoing conditions can be measured. In resource-limited circumstances, the placement of sensors in optimal locations is essential. Here, a nonlinear optimization approach to determine optimal adaptive sampling that uses the Genetic Algorithm (GA) method is presented. The method determines sampling strategies that minimize a user-defined physics-based cost function. The method is evaluated using identical twin experiments, comparing hindcasts from an ensemble of simulations that assimilate data selected using the GA adaptive sampling and other methods. For skill metrics, we employ the reduction of the ensemble root-mean-square-error (RMSE) between the “true” data-assimilative ocean simulation and the different ensembles of data-assimilative hindcasts. A 5-glider optimal sampling study is set up for a 400 km x 400 km domain in the Middle Atlantic Bight region, along the New Jersey shelf-break. Results are compared for several ocean and atmospheric forcing conditions.
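The skill metric mentioned above, the reduction of ensemble RMSE relative to the "true" simulation, can be illustrated with a small sketch. The abstract does not give the exact formula, so the averaging over members and the fractional form of the reduction are assumptions, and all function names are hypothetical.

```python
import math

def rmse(estimate, truth):
    """Root-mean-square error between two equal-length sequences."""
    if len(estimate) != len(truth):
        raise ValueError("estimate and truth must have the same length")
    squared = sum((e - t) ** 2 for e, t in zip(estimate, truth))
    return math.sqrt(squared / len(truth))

def ensemble_rmse(ensemble, truth):
    """Average the RMSE of every ensemble member against the truth (assumed form)."""
    return sum(rmse(member, truth) for member in ensemble) / len(ensemble)

def rmse_reduction(baseline_ensemble, assimilated_ensemble, truth):
    """Fractional RMSE reduction obtained by assimilating the sampled data."""
    base = ensemble_rmse(baseline_ensemble, truth)
    return (base - ensemble_rmse(assimilated_ensemble, truth)) / base

# Illustrative numbers only, not data from the study.
truth = [10.0, 12.0, 14.0]
without_gliders = [[11.0, 13.0, 15.0], [9.0, 11.0, 13.0]]
with_gliders = [[10.5, 12.5, 14.5], [9.5, 11.5, 13.5]]
print(rmse_reduction(without_gliders, with_gliders, truth))
```

A sampling plan that leaves the hindcast error unchanged scores 0 under this definition, and one that removes the error entirely scores 1.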

Data Assimilation with Gaussian Mixture Models using the Dynamically Orthogonal Field Equations. Part II: Applications

Sondergaard, T. and P.F.J. Lermusiaux, 2013b. Data Assimilation with Gaussian Mixture Models using the Dynamically Orthogonal Field Equations. Part II: Applications. Monthly Weather Review, 141, 6, 1761-1785, doi:10.1175/MWR-D-11-00296.1.

The properties and capabilities of the GMM-DO filter are assessed and exemplified by applications to two dynamical systems: (1) the Double Well Diffusion and (2) Sudden Expansion flows; both of which admit far-from-Gaussian statistics. The former test case, or twin experiment, validates the use of the EM algorithm and Bayesian Information Criterion with Gaussian Mixture Models in a filtering context; the latter further exemplifies its ability to efficiently handle state vectors of non-trivial dimensionality and dynamics with jets and eddies. For each test case, qualitative and quantitative comparisons are made with contemporary filters. The sensitivity to input parameters is illustrated and discussed. Properties of the filter are examined and its estimates are described, including: the equation-based and adaptive prediction of the probability densities; the evolution of the mean field, stochastic subspace modes and stochastic coefficients; the fitting of Gaussian Mixture Models; and, the efficient and analytical Bayesian updates at assimilation times and the corresponding data impacts. The advantages of respecting nonlinear dynamics and preserving non-Gaussian statistics are brought to light. For realistic test cases admitting complex distributions and with sparse or noisy measurements, the GMM-DO filter is shown to fundamentally improve the filtering skill, outperforming simpler schemes invoking the Gaussian parametric distribution.
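To give a feel for the far-from-Gaussian statistics of a double-well diffusion, the sketch below integrates an ensemble of scalar states with the Euler-Maruyama method. It assumes the commonly used form dx = (4x - 4x^3) dt + kappa dW; the coefficients, noise amplitude, time step and ensemble size are illustrative assumptions, not values taken from the paper.

```python
import math
import random

def double_well_ensemble(n_members=500, kappa=0.4, dt=0.01, n_steps=500, seed=0):
    """Euler-Maruyama integration of dx = (4x - 4x^3) dt + kappa dW from x = 0.

    All parameter values are hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)
    states = [0.0] * n_members          # start on the unstable equilibrium
    noise_scale = kappa * math.sqrt(dt)  # standard deviation of each noise increment
    for _ in range(n_steps):
        states = [
            x + (4.0 * x - 4.0 * x ** 3) * dt + noise_scale * rng.gauss(0.0, 1.0)
            for x in states
        ]
    return states

ensemble = double_well_ensemble()
in_right_well = sum(1 for x in ensemble if x > 0.5)
in_left_well = sum(1 for x in ensemble if x < -0.5)
print(in_left_well, in_right_well)
```

The members settle near the two wells at x = -1 and x = +1, so the ensemble mean sits close to zero where almost no member lies. A single Gaussian fit to such an ensemble misrepresents it, which is the situation a mixture model is meant to handle.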

Data Assimilation with Gaussian Mixture Models using the Dynamically Orthogonal Field Equations. Part I: Theory and Scheme

Sondergaard, T. and P.F.J. Lermusiaux, 2013a. Data Assimilation with Gaussian Mixture Models using the Dynamically Orthogonal Field Equations. Part I: Theory and Scheme. Monthly Weather Review, 141, 6, 1737-1760, doi:10.1175/MWR-D-11-00295.1.

This work introduces and derives an efficient, data-driven assimilation scheme, focused on a time-dependent stochastic subspace, that respects nonlinear dynamics and captures non-Gaussian statistics as it occurs. The motivation is to obtain a filter that is applicable to realistic geophysical applications but that also rigorously utilizes the governing dynamical equations with information theory and learning theory for efficient Bayesian data assimilation. Building on the foundations of classical filters, the underlying theory and algorithmic implementation of the new filter are developed and derived. The stochastic Dynamically Orthogonal (DO) field equations and their adaptive stochastic subspace are employed to predict prior probabilities for the full dynamical state, effectively approximating the Fokker-Planck equation. At assimilation times, the DO realizations are fit to semiparametric Gaussian mixture models (GMMs) using the Expectation-Maximization algorithm and the Bayesian Information Criterion. Bayes’ Law is then efficiently carried out analytically within the evolving stochastic subspace. The resulting GMM-DO filter is illustrated in a very simple example. Variations of the GMM-DO filter are also provided along with comparisons with related schemes.
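The analytical Bayesian update of a Gaussian mixture prior can be sketched for the simplest possible case: a scalar state observed directly with Gaussian noise. This is a deliberately reduced illustration, not the paper's scheme, which works with vector coefficients in the DO subspace and an observation operator; the EM fitting and BIC selection steps are omitted, and the function names are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a scalar normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_update(weights, means, variances, obs, obs_var):
    """Posterior mixture for a scalar state x given y = x + noise, noise ~ N(0, obs_var).

    Each component receives a Kalman-type update; the weights are rescaled by
    the likelihood of the observation under that component.
    """
    post_w, post_m, post_v = [], [], []
    for w, m, v in zip(weights, means, variances):
        gain = v / (v + obs_var)                 # scalar Kalman gain of this component
        post_m.append(m + gain * (obs - m))      # updated component mean
        post_v.append((1.0 - gain) * v)          # updated component variance
        post_w.append(w * normal_pdf(obs, m, v + obs_var))
    total = sum(post_w)
    return [w / total for w in post_w], post_m, post_v

# Bimodal prior with modes near -1 and +1; the observation falls near +1.
w, m, v = gmm_update([0.5, 0.5], [-1.0, 1.0], [0.04, 0.04], obs=0.9, obs_var=0.04)
print(w, m, v)
```

With an observation near one mode, almost all of the posterior weight moves to that component. A single-Gaussian update of the same prior would instead pull a broad distribution centered at zero toward the observation.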

Progress and Prospects of U.S. Data Assimilation in Ocean Research

Lermusiaux, P.F.J., P. Malanotte-Rizzoli, D. Stammer, J. Carton, J. Cummings and A.M. Moore, 2006. "Progress and Prospects of U.S. Data Assimilation in Ocean Research". Oceanography, Special issue on "Advances in Computational Oceanography", T. Paluszkiewicz and S. Harper, Eds., 19, 1, 172-183.

This report summarizes goals, activities, and recommendations of a workshop on data assimilation held in Williamsburg, Virginia on September 9-11, 2003, and sponsored by the U.S. Office of Naval Research (ONR) and National Science Foundation (NSF). The overall goal of the workshop was to synthesize research directions for ocean data assimilation (DA) and outline efforts required during the next 10 years and beyond to evolve DA into an integral and sustained component of global, regional, and coastal ocean science and observing and prediction systems. The workshop built on the success of recent and existing DA activities such as those sponsored by the National Oceanographic Partnership Program (NOPP) and NSF-Information Technology Research (NSF-ITR). DA is a quantitative approach to optimally combine models and observations. The combination is usually consistent with model and data uncertainties, which need to be represented. Ocean DA can extract maximum knowledge from the sparse and expensive measurements of the highly variable ocean dynamics. The ultimate goal is to better understand and predict these dynamics on multiple spatial and temporal scales, including interactions with other components of the climate system. There are many applications that involve DA or build on its results, including: coastal, regional, seasonal, and inter-annual ocean and climate dynamics; carbon and biogeochemical cycles; ecosystem dynamics; ocean engineering; observing-system design; coastal management; fisheries; pollution control; naval operations; and defense and security. These applications have different requirements that lead to variations in the DA schemes utilized. For literature on DA, we refer to Ghil and Malanotte-Rizzoli (1991), the National Research Council (1991), Bennett (1992), Malanotte-Rizzoli (1996), Wunsch (1996), Robinson et al. (1998), Robinson and Lermusiaux (2002), and Kalnay (2003). We also refer to the U.S. Global Ocean Data Assimilation Experiment (GODAE) workshop on Global Ocean Data Assimilation: Prospects and Strategies (Rienecker et al., 2001); U.S. National Oceanic and Atmospheric Administration-Office of Global Programs (NOAA-OGP) workshop on Coupled Data Assimilation (Rienecker, 2003); and, NOAA-NASA-NSF workshop on Ongoing Analysis of the Climate System (Arkin et al., 2003).

The use of data assimilation in coupled hydrodynamic, ecological and bio-geo-chemical models of the ocean

Gregoire, M., P. Brasseur and P.F.J. Lermusiaux (Guest Eds.), 2003. The use of data assimilation in coupled hydrodynamic, ecological and bio-geo-chemical models of the ocean. Journal of Marine Systems, 40-41, 1-3. doi:10.1016/S0924-7963(03)00027-7.

The International Liège Colloquium on Ocean Dynamics is organized annually. The topic differs from year to year in an attempt to address, as much as possible, recent problems and incentive new subjects in oceanography. Assembling a group of active and eminent scientists from various countries and often different disciplines, the Colloquia provide a forum for discussion and foster a mutually beneficial exchange of information opening on to a survey of recent discoveries, essential mechanisms, impelling question marks and valuable recommendations for future research. The objective of the 2001 Colloquium was to evaluate the progress of data assimilation methods in marine science and, in particular, in coupled hydrodynamic, ecological and bio-geo-chemical models of the ocean. The past decades have seen important advances in the understanding and modelling of key processes of the ocean circulation and bio-geo-chemical cycles. The increasing capabilities of data and models, and their combination, are allowing the study of multidisciplinary interactions that occur dynamically, in multiple ways, on multiscales and with feedbacks. The capacity of dynamical models to simulate interdisciplinary ocean processes over specific space-time windows and thus forecast their evolution over predictable time scales is also conditioned upon the availability of relevant observations to: initialise and continually update the physical and bio-geo-chemical sectors of the ocean state; provide relevant atmospheric and boundary forcing; calibrate the parameterizations of sub-grid scale processes, growth rates and reaction rates; construct interdisciplinary and multiscale correlation and feature models; identify and estimate the main sources of errors in the models; control or correct for mis-represented or neglected processes. The access to multivariate data sets requires the implementation, exploitation and management of dedicated ocean observing and prediction systems.
However, the available data are often limited and, for instance, seldom in a form to be directly compatible or directly inserted into the numerical models. To relate the data to the ocean state on all scales and regions that matter, evolving three-dimensional and multivariate (measurement) models are becoming important. Equally significant is the reduction of observational requirements by design of sampling strategies via Observation System Simulation Experiments and adaptive sampling. Data assimilation is a quantitative approach to extract adequate information content from the data and to improve the consistency between data sets and model estimates. It is also a methodology to dynamically interpolate between data scattered in space and time, allowing comprehensive interpretation of multivariate observations. In general, the goals of data assimilation are to: control the growth of predictability errors; correct dynamical deficiencies; estimate model parameters, including the forcings, initial and boundary conditions; characterise key processes by analysis of four-dimensional fields and their statistics (balances of terms, etc.); carry out advanced sensitivity studies and Observation System Simulation Experiments, and conduct efficient operations, management and monitoring. The theoretical framework of data assimilation for marine sciences is now relatively well established, rooted in control theory, estimation theory or inverse techniques, from variational to sequential approaches.
Ongoing research efforts of special importance for interdisciplinary applications include the: stochastic representation of processes and determination of model and data errors; treatment of (open) boundary conditions and strong nonlinearities; space-time, multivariate extrapolation of limited and noisy data and determination of measurement models; demonstration that bio-geo-chemical models are valid enough and of adequate structures for their deficiencies to be controlled by data assimilation; and finally, ability to provide accurate estimates of fields, parameters, variabilities and errors, with large and complex dynamical models and data sets. Operationally, major engineering and computational challenges for the coming years include the: development of theoretically sound methods into useful, practical and reliable techniques at affordable costs; implementation of scalable, seamless and automated systems linking observing systems, numerical models and assimilation schemes; adequate mix of integrated and distributed (Web-based) networks; construction of user-friendly architectures and establishment of standards for the description of data and software (metadata) for efficient communication, dissemination and management. In addition to addressing the above items, the 33rd Liège Colloquium has offered the opportunity to: review the status and current progress of data assimilation methodologies utilised in the physical, acoustical, optical and bio-geo-chemical scientific communities; demonstrate the potentials of data assimilation systems developed for coupled physical/ecosystem models, from scientific to management inquiries; examine the impact of data assimilation and inverse modelling in improving model parameterisations; discuss the observability and controllability properties of, and identify the missing gaps in current observing and prediction systems; and exchange the results of and the learnings from preoperational marine exercises.
The presentations given during the Colloquium led to discussions on a series of topics organized within the following sections: (1) Interdisciplinary research progress and issues: data, models, data assimilation criteria. (2) Observations for interdisciplinary data assimilation. (3) Advanced fields estimation for interdisciplinary systems. (4) Estimation of interdisciplinary parameters and model structures. (5) Assimilation methodologies for physical and interdisciplinary systems. (6) Toward operational interdisciplinary oceanography and data assimilation. A subset of these presentations is reported in the present Special Issue. As was pointed out during the Colloquium, coupled biological-physical data assimilation is in its infancy and much can be accomplished now by the immediate application of existing methods. Data assimilation intimately links dynamical models and observations, and it can play a critical role in the important area of fundamental biological oceanographic dynamical model development and validation over a hierarchy of complexities. Since coupled assimilation for coupled processes is challenging and can be complicated, care must be exercised in understanding, modeling and controlling errors and in performing sensitivity analyses to establish the robustness of results. Compatible interdisciplinary data sets are essential and data assimilation should iteratively define data impact and data requirements. Based on the results presented during the Colloquium, data assimilation is expected to enable future marine technologies and naval operations otherwise impossible or not feasible. Interdisciplinary predictability research, multiscale in both space and time, is required.
State and parameter estimation via data assimilation is central to the successful establishment of advanced interdisciplinary ocean observing and prediction systems which, functioning in real time, will contribute to novel and efficient capabilities to manage, and to operate in our oceans. The Scientific Committee and the participants to the 33rd Liège Colloquium wish to express their gratitude to the Ministère de l’Enseignement Supérieur et de la Recherche Scientifique de la Communauté Française de Belgique, the Fonds National de la Recherche Scientifique de Belgique (F.N.R.S., Belgium), the Ministère de l’Emploi et de la Formation du Gouvernement Wallon, the University of Liège, the Commission of European Union, the Scientific Committee on Oceanographic Research (SCOR), the International Oceanographic Commission of the UNESCO, the US Office of Naval Research, the National Science Foundation (NSF, USA) and the International Association for the Physical Sciences of the Ocean (IAPSO) for their most valuable support.

Data assimilation for modeling and predicting coupled physical-biological interactions in the sea

Robinson, A.R. and P.F.J. Lermusiaux, 2002. Data assimilation for modeling and predicting coupled physical-biological interactions in the sea. In "The Sea, Vol. 12: Biological-Physical Interactions in the Ocean", Robinson A.R., J.R. McCarthy and B.J. Rothschild (Eds.). 475-536.

Data assimilation is a modern methodology of relating natural data and dynamical models. The general dynamics of a model is combined or melded with a set of observations. All dynamical models are to some extent approximate, and all data sets are finite and to some extent limited by error bounds. The purpose of data assimilation is to provide estimates of nature which are better estimates than can be obtained by using only the observational data or the dynamical model. There are a number of specific approaches to data assimilation which are suitable for estimation of the state of nature, including natural parameters, and for evaluation of the dynamical approximations. Progress is accelerating in understanding the dynamics of real ocean biological-physical interactive processes. Although most biophysical processes in the sea await discovery, new techniques and novel interdisciplinary studies are evolving ocean science to a new level of realism. Generally, understanding proceeds from a quantitative description of four-dimensional structures and events, through the identification of specific dynamics, to the formulation of simple generalizations. The emergence of realistic interdisciplinary four-dimensional data assimilative ocean models and systems is contributing significantly and increasingly to this progress.

On the mapping of multivariate geophysical fields: sensitivity to size, scales and dynamics

Lermusiaux, P.F.J., 2002. On the mapping of multivariate geophysical fields: sensitivity to size, scales and dynamics. Journal of Atmospheric and Oceanic Technology, 19, 1602-1637.

The effects of a priori parameters on the error subspace estimation and mapping methodology introduced by P. F. J. Lermusiaux et al. are investigated. The approach is three-dimensional, multivariate, and multiscale. The sensitivities of the subspace and a posteriori fields to the size of the subspace, scales considered, and nonlinearities in the dynamical adjustments are studied. Applications focus on the mesoscale to subbasin-scale physics in the northwestern Levantine Sea during 10 February-15 March and 19 March-16 April 1995. Forecasts generated from various analyzed fields are compared to in situ and satellite data. The sensitivities to size show that the truncation to a subspace is efficient. The use of criteria to determine adequate sizes is emphasized and a back-of-the-envelope rule is outlined. The sensitivities to scales confirm that, for a given region, smaller scales usually require larger subspaces because of spectral redness. However, synoptic conditions are also shown to strongly influence the ordering of scales. The sensitivities to the dynamical adjustment reveal that nonlinearities can modify the variability decomposition, especially the dominant eigenvectors, and that changes are largest for the features and regions with high shears. Based on the estimated variability variance fields, eigenvalue spectra, multivariate eigenvectors and (cross)-covariance functions, dominant dynamical balances and the spatial distribution of hydrographic and velocity characteristic scales are obtained for primary regional features. In particular, the Ierapetra Eddy is found to be close to gradient-wind balance and coastal-trapped waves are anticipated to occur along the northern escarpment of the basin.

Data Assimilation in Models

Robinson, A.R. and P.F.J. Lermusiaux, 2001. Data Assimilation in Models. Encyclopedia of Ocean Sciences, Academic Press Ltd., London, 623-634.

Data assimilation is a novel, versatile methodology for estimating oceanic variables. The estimation of a quantity of interest via data assimilation involves the combination of observational data with the underlying dynamical principles governing the system under observation. The melding of data and dynamics is a powerful methodology which makes possible efficient, accurate, and realistic estimations otherwise not feasible. It is providing rapid advances in important aspects of both basic ocean science and applied marine technology and operations. The following sections introduce concepts, describe purposes, present applications to regional dynamics and forecasting, overview formalism and methods, and provide a selected range of examples.

On the mapping of multivariate geophysical fields: error and variability subspace estimates

Lermusiaux, P.F.J., D.G.M. Anderson and C.J. Lozano, 2000. On the mapping of multivariate geophysical fields: error and variability subspace estimates. The Quarterly Journal of the Royal Meteorological Society, April B, 1387-1430.

A basis is outlined for the first-guess spatial mapping of three-dimensional multivariate and multiscale geophysical fields and their dominant errors. The a priori error statistics are characterized by covariance matrices and the mapping obtained by solving a minimum-error-variance estimation problem. The size of the problem is reduced efficiently by focusing on the error subspace, here the dominant eigendecomposition of the a priori error covariance. The first estimate of this a priori error subspace is constructed in two parts. For the “observed” portions of the subspace, the covariance of the a priori missing variability is directly specified and eigendecomposed. For the “non-observed” portions, an ensemble of adjustment dynamical integrations is utilized, building the nonobserved covariances in statistical accord with the observed ones. This error subspace construction is exemplified and studied in a Middle Atlantic Bight simulation and in the eastern Mediterranean. Its use allows an accurate, global, multiscale and multivariate, three-dimensional analysis of primitive-equation fields and their errors, in real time. The a posteriori error covariance is computed and indicates complex data-variability influences. The error and variability subspaces obtained can also confirm or reveal the features of dominant variability, such as the Ierapetra Eddy in the Levantine basin.
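The idea of reducing the problem to the dominant eigendecomposition of an error covariance can be illustrated on a toy ensemble. The sketch below builds a sample covariance and extracts its leading mode by power iteration. Power iteration is a stand-in chosen to keep the example dependency-free and is not the authors' procedure; the function names and the two-variable ensemble are hypothetical.

```python
import math

def covariance(ensemble):
    """Sample covariance matrix (list of lists) of an ensemble of state vectors."""
    n, dim = len(ensemble), len(ensemble[0])
    mean = [sum(member[i] for member in ensemble) / n for i in range(dim)]
    anomalies = [[member[i] - mean[i] for i in range(dim)] for member in ensemble]
    return [
        [sum(a[i] * a[j] for a in anomalies) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]

def dominant_mode(cov, n_iter=200):
    """Power iteration: return (eigenvalue, unit eigenvector) of the largest mode.

    Assumes a nonzero covariance and a start vector not orthogonal to that mode.
    """
    dim = len(cov)
    vec = [1.0 / (i + 1) for i in range(dim)]  # arbitrary non-uniform start vector
    for _ in range(n_iter):
        applied = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(c * c for c in applied))
        vec = [c / norm for c in applied]
    applied = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
    eigenvalue = sum(v * a for v, a in zip(vec, applied))  # Rayleigh quotient
    return eigenvalue, vec

cov = covariance([[1.0, 2.0], [3.0, 6.0]])
eigenvalue, mode = dominant_mode(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
print(eigenvalue, mode, explained)
```

The ratio of the leading eigenvalue to the trace gives the fraction of total variance carried by the retained mode, which is one simple way to judge whether a truncation is adequate.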

Data assimilation via Error Subspace Statistical Estimation. Part II: Middle Atlantic Bight shelfbreak front simulations and ESSE validation

Lermusiaux, P.F.J., 1999a. Data assimilation via Error Subspace Statistical Estimation. Part II: Middle Atlantic Bight shelfbreak front simulations and ESSE validation. Monthly Weather Review, 127(7), 1408-1432, doi:10.1175/1520-0493(1999)127<1408:DAVESS>2.0.CO;2.

Identical twin experiments are utilized to assess and exemplify the capabilities of error subspace statistical estimation (ESSE). The experiments consist of nonlinear, primitive equation-based, idealized Middle Atlantic Bight shelfbreak front simulations. Qualitative and quantitative comparisons with an optimal interpolation (OI) scheme are made. Essential components of ESSE are illustrated. The evolution of the error subspace, in agreement with the initial conditions, dynamics, and data properties, is analyzed. The three-dimensional multivariate minimum variance melding in the error subspace is compared to the OI melding. Several advantages and properties of ESSE are discussed and evaluated. The continuous singular value decomposition of the nonlinearly evolving variations of variability and the possibilities of ESSE for dominant process analysis are illustrated and emphasized.

Data assimilation via Error Subspace Statistical Estimation. Part I: Theory and schemes

Lermusiaux, P.F.J. and A.R. Robinson, 1999. Data assimilation via Error Subspace Statistical Estimation. Part I: Theory and schemes. Monthly Weather Review, 127(7), 1385-1407, doi:10.1175/1520-0493(1999)127<1385:DAVESS>2.0.CO;2.

A rational approach is used to identify efficient schemes for data assimilation in nonlinear ocean-atmosphere models. The conditional mean, a minimum of several cost functionals, is chosen for an optimal estimate. After stating the present goals and describing some of the existing schemes, the constraints and issues particular to ocean-atmosphere data assimilation are emphasized. An approximation to the optimal criterion satisfying the goals and addressing the issues is obtained using heuristic characteristics of geophysical measurements and models. This leads to the notion of an evolving error subspace, of variable size, that spans and tracks the scales and processes where the dominant errors occur. The concept of error subspace statistical estimation (ESSE) is defined. In the present minimum error variance approach, the suboptimal criterion is based on a continued and energetically optimal reduction of the dimension of error covariance matrices. The evolving error subspace is characterized by error singular vectors and values, or in other words, the error principal components and coefficients. Schemes for filtering and smoothing via ESSE are derived. The data-forecast melding minimizes variance in the error subspace. Nonlinear Monte Carlo forecasts integrate the error subspace in time. The smoothing is based on a statistical approximation approach. Comparisons with existing filtering and smoothing procedures are made. The theoretical and practical advantages of ESSE are discussed. The concepts introduced by the subspace approach are as useful as the practical benefits. The formalism forms a theoretical basis for the intercomparison of reduced dimension assimilation methods and for the validation of specific assumptions for tailored applications. 
The subspace approach is useful for a wide range of purposes, including nonlinear field and error forecasting, predictability and stability studies, objective analyses, data-driven simulations, model improvements, adaptive sampling, and parameter estimation.

Data Assimilation

Robinson, A.R., P.F.J. Lermusiaux and N.Q. Sloan, III, 1998. Data Assimilation. In "The Sea: The Global Coastal Ocean I", Processes and Methods (K.H. Brink and A.R. Robinson, Editors), Volume 10, John Wiley and Sons, New York, NY, 541-594.