
Many Task Computing for Real-Time Uncertainty Prediction and Data Assimilation in the Ocean

Evangelinos, C., P.F.J. Lermusiaux, J. Xu, P.J. Haley, and C.N. Hill, 2011. Many Task Computing for Real-Time Uncertainty Prediction and Data Assimilation in the Ocean. IEEE Transactions on Parallel and Distributed Systems, Special Issue on Many-Task Computing, I. Foster, I. Raicu and Y. Zhao (Guest Eds.), 22, doi: 10.1109/TPDS.2011.64.

Uncertainty prediction for ocean and climate forecasts is essential for multiple applications today. Many-Task Computing can play a significant role in making such predictions feasible. In this manuscript, we focus on ocean uncertainty prediction using the Error Subspace Statistical Estimation (ESSE) approach. In ESSE, uncertainties are represented by an error subspace of variable size. To predict these uncertainties, we perturb an initial state based on the initial error subspace and integrate the corresponding ensemble of initial conditions forward in time, including stochastic forcing during each simulation. The dominant error covariance (generated via an SVD of the ensemble) is used for data assimilation. The resulting ocean fields are used as inputs for predictions of underwater sound propagation. ESSE is a classic case of Many-Task Computing: it uses dynamic heterogeneous workflows, and ESSE ensembles are data-intensive applications. We first study the execution characteristics of a distributed ESSE workflow on a medium-sized dedicated cluster, examining in more detail the I/O patterns exhibited and the throughputs achieved by its components, as well as the overall ensemble performance seen in practice. We then study the performance and usability challenges of employing Amazon EC2 and the TeraGrid to augment our ESSE ensembles and provide better solutions faster.

Automated Sensor Networks to Advance Ocean Science

Schofield, O., S. Glenn, J. Orcutt, M. Arrott, M. Meisinger, A. Gangopadhyay, W. Brown, R. Signell, M. Moline, Y. Chao, S. Chien, D. Thompson, A. Balasuriya, P.F.J. Lermusiaux and M. Oliver, 2010. Automated Sensor Networks to Advance Ocean Science. EOS, Vol. 91, No. 39, 28 September 2010.

Oceanography is evolving from a ship-based expeditionary science to a distributed, observatory-based approach in which scientists continuously interact with instruments in the field. These new capabilities will facilitate the collection of long-term time series while also providing an interactive capability to conduct experiments using data streaming in real time. The U.S. National Science Foundation has funded the Ocean Observatories Initiative (OOI), which over the next 5 years will deploy infrastructure to expand scientists’ ability to remotely study the ocean. The OOI is deploying infrastructure that spans global, regional, and coastal scales. A global component will address planetary-scale problems using a new network of moored buoys linked to shore via satellite telecommunications. A regional cabled observatory will “wire” a single region in the northeastern Pacific Ocean with a high-speed optical and power grid. The coastal component will expand existing coastal observing assets to study the importance of high-frequency forcing on the coastal environment. These components will be linked by a robust cyberinfrastructure (CI) that will integrate marine observatories into a coherent system of systems. This CI infrastructure will also provide a Web-based social network enabled by real-time visualization and access to numerical model information, to provide the foundation for adaptive sampling science. Thus, oceanographers will have access to automated machine-to-machine sensor networks that can be scaled up in size and incorporate new technology for decades to come. A case study of this CI in action shows how a community of ocean scientists and engineers located throughout the United States at 12 different institutions used the automated ocean observatory to address daily adaptive science priorities in real time.

Many Task Computing for Multidisciplinary Ocean Sciences: Real-Time Uncertainty Prediction and Data Assimilation

Evangelinos, C., P.F.J. Lermusiaux, J. Xu, P.J. Haley, and C.N. Hill, 2009. Many Task Computing for Multidisciplinary Ocean Sciences: Real-Time Uncertainty Prediction and Data Assimilation. Conference on High Performance Networking and Computing, Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers (Portland, OR, 16 November 2009), 10pp. doi.acm.org/10.1145/1646468.1646482.

Error Subspace Statistical Estimation (ESSE), an uncertainty prediction and data assimilation methodology employed for real-time ocean forecasts, is based on a characterization and prediction of the largest uncertainties. This is carried out by evolving an error subspace of variable size. We use an ensemble of stochastic model simulations, initialized based on an estimate of the dominant initial uncertainties, to predict the error subspace of the model fields. The dominant error covariance (generated via an SVD of the ensemble-generated error covariance matrix) is used for data assimilation. The resulting ocean fields are provided as the input to acoustic modeling, allowing for the prediction and study of the spatiotemporal variations in acoustic propagation and their uncertainties. The ESSE procedure is a classic case of Many-Task Computing: the codes are managed via dynamic workflows for (i) the perturbation of the initial mean state, (ii) the subsequent ensemble of stochastic PE model runs, (iii) the continuous generation of the covariance matrix, (iv) the successive computations of the SVD of the ensemble spread until a convergence criterion is satisfied, and (v) data assimilation. Its ensemble nature makes it a many-task, data-intensive application, and its dynamic workflow gives it heterogeneity. Subsequent acoustic propagation modeling involves a very large ensemble of short-duration acoustics runs.
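The ensemble-to-subspace step in (ii)-(iv) above can be sketched in simplified form. This is a hedged illustration only, not the actual ESSE code: the function name, dimensions, and toy data are all hypothetical, and the real procedure grows the ensemble until a convergence criterion on the subspace is met.

```python
import numpy as np

def dominant_error_subspace(ensemble, mean, rank):
    """Estimate the dominant error subspace from an ensemble of model states.

    ensemble: (n_state, n_members) matrix, one model state per column
    mean:     (n_state,) ensemble mean (or central forecast)
    rank:     number of dominant error modes to retain
    """
    # Matrix of ensemble deviations from the mean
    A = ensemble - mean[:, None]
    # Thin SVD of the spread; left singular vectors span the error subspace
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    # Eigenvalues of the sample error covariance A A^T / (m - 1)
    m = ensemble.shape[1]
    variances = s**2 / (m - 1)
    return U[:, :rank], variances[:rank]

# Toy example: 50-dimensional state, 20-member ensemble built from
# 3 dominant error directions plus small noise
rng = np.random.default_rng(0)
modes = rng.standard_normal((50, 3))
members = modes @ rng.standard_normal((3, 20)) \
          + 0.01 * rng.standard_normal((50, 20))
mean = members.mean(axis=1)
E, var = dominant_error_subspace(members, mean, rank=3)
print(E.shape, var.shape)  # (50, 3) (3,)
```

In practice, the SVD is recomputed as new ensemble members complete, and the ensemble size is increased until the retained subspace stops changing significantly, which is what makes the workflow dynamic.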

Towards Dynamic Data Driven Systems for Rapid Adaptive Interdisciplinary Ocean Forecasting

Patrikalakis, N.M., P.F.J. Lermusiaux, C. Evangelinos, J.J. McCarthy, A.R. Robinson, H. Schmidt, P.J. Haley, S. Lalis, R. Tian, W.G. Leslie, and W. Cho, 2009. Towards Dynamic Data Driven Systems for Rapid Adaptive Interdisciplinary Ocean Forecasting. Invited paper in "Dynamic Data-Driven Application Systems", F. Darema, Editor. Springer, 2009. In press.

The state of the ocean evolves and its dynamics involve transitions occurring at multiple scales. For efficient and rapid interdisciplinary forecasting, ocean observing and prediction systems must have the same behavior and adapt to the ever-changing dynamics. The work discussed here aims to set the basis of a distributed system for real-time interdisciplinary ocean field and uncertainty forecasting with adaptive modeling and adaptive sampling. The scientific goal is to couple physical and biological oceanography with ocean acoustics. The technical goal is to build a dynamic system based on advanced infrastructures, distributed/grid computing, and efficient information retrieval and visualization interfaces. Importantly, the system combines a suite of modern legacy physical models, acoustic models and ocean current monitoring data assimilation schemes with new adaptive modeling and adaptive sampling methods. The legacy systems are encapsulated at the binary level using software component methodologies. Measurement models are utilized to link the observed data to the dynamical model variables and structures. With adaptive sampling, the data acquisition is dynamic and aims to minimize the predicted uncertainties, maximize the sampling of key dynamics, and maintain overall coverage. With adaptive modeling, model improvements are dynamic and aim to select the best model structures and parameters among different physical or biogeochemical parameterizations. The dynamic coupling of models and measurements discussed here represents a Dynamic Data-Driven Application System (DDDAS). Technical and scientific progress is highlighted based on examples in Massachusetts Bay, Monterey Bay, and the California Current System. Keywords: Oceanography, interdisciplinary, adaptive, sampling

Web-Enabled Configuration and Control of Legacy Codes: An Application to Ocean Modeling

Evangelinos, C., P.F.J. Lermusiaux, S. Geiger, R.C. Chang, and N.M. Patrikalakis, 2006. Web-Enabled Configuration and Control of Legacy Codes: An Application to Ocean Modeling. Ocean Modelling, 13, 197-220.

For modern interdisciplinary ocean prediction and assimilation systems, a significant part of the complexity facing users is the very large number of possible setups and parameters, both at build-time and at run-time, especially for the core physical, biological and acoustical ocean predictive models. The configuration of these modeling systems for both local as well as remote execution can be a daunting and error-prone task in the absence of a graphical user interface (GUI) and of software that automatically controls the adequacy and compatibility of options and parameters. We propose to encapsulate the configurability and requirements of ocean prediction codes using an eXtensible Markup Language (XML) based description, thereby creating new computer-readable manuals for the executable binaries. These manuals allow us to generate a GUI, check for correctness of compilation and input parameters, and finally drive execution of the prediction system components, all in an automated and transparent manner. This web-enabled configuration and automated control software has been developed (it is currently in “beta” form) and exemplified for components of the interdisciplinary Harvard ocean prediction system (HOPS) and for the uncertainty prediction components of the error subspace statistical estimation (ESSE) system. Importantly, the approach is general and applies to other existing ocean modeling applications and to other “legacy” codes.

Rapid real-time interdisciplinary ocean forecasting using adaptive sampling and adaptive modeling and legacy codes: Component encapsulation using XML

Evangelinos C., R. Chang, P.F.J. Lermusiaux and N.M. Patrikalakis, 2003. Rapid real-time interdisciplinary ocean forecasting using adaptive sampling and adaptive modeling and legacy codes: Component encapsulation using XML. Lecture Notes in Computer Science, 2660, 375-384.

We present the high-level architecture of a real-time interdisciplinary ocean forecasting system that employs adaptive elements in both modeling and sampling. We also discuss an important issue that arises in creating an integrated, web-accessible framework for such a system out of existing stand-alone components: transparent support for handling legacy binaries. Such binaries, which are common in scientific applications, expect a standard input stream, possibly some command-line options, and a set of input files, and generate a set of output files as well as standard output and error streams. Legacy applications of this form are encapsulated using XML. We present a method that uses XML documents to describe the parameters for executing a binary.
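A legacy binary of the form described above could be captured by an XML document along these lines. This is a hypothetical illustration of the approach only, not the schema or element names used in the paper; the binary name and file names are invented for the example.

```xml
<!-- Hypothetical description of a legacy ocean-model binary:
     its command line, input/output files, and standard streams. -->
<application name="pemodel">
  <executable path="bin/pemodel"/>
  <commandline>
    <option flag="-v" description="verbose output" optional="true"/>
  </commandline>
  <inputs>
    <stdin file="pemodel.in"/>
    <file role="initial-conditions" name="pe_ini.nc" format="NetCDF"/>
    <file role="parameters" name="pe_param.dat"/>
  </inputs>
  <outputs>
    <file role="forecast-fields" name="pe_out.nc" format="NetCDF"/>
    <stdout file="pemodel.log"/>
    <stderr file="pemodel.err"/>
  </outputs>
</application>
```

A framework can parse such a description to stage the input files, construct the command line, launch the binary locally or remotely, and collect the declared outputs, without any modification to the legacy code itself.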

The development and demonstration of an advanced fisheries management information system

Robinson, A.R., B.J. Rothschild, W.G. Leslie, J.J. Bisagni, M.F. Borges, W.S. Brown, D. Cai, P. Fortier, A. Gangopadhyay, P.J. Haley, Jr., H.S. Kim, L. Lanerolle, P.F.J. Lermusiaux, C.J. Lozano, M.G. Miller, G. Strout and M.A. Sundermeyer, 2001. The development and demonstration of an advanced fisheries management information system. Proc. of the 17th Conference on Interactive Information and Processing Systems for Meteorology, Oceanography and Hydrology, Albuquerque, New Mexico. American Meteorological Society, 186-190.

Fishery management regulates size- and species-specific fishing mortality to optimize biological production from the fish populations and economic production from the fishery. Fishery management is similar to management in industries and in natural resources, where the goals of management are intended to optimize outputs relative to inputs. However, the management of fish populations is among the most difficult. The difficulties arise because (a) the dynamics of the natural production system are extremely complicated, involving an infinitude of variables and interacting natural systems, and (b) the size- and species-specific fishing mortality (i.e., system control) is difficult to measure, calibrate, and deploy. Despite the difficulties, it is believed that significant advances can be made by employing a fishery management system that involves knowing the short-term (daily to weekly) variability in the structures of environmental and fish fields. We need new information systems that bring together existing critical technologies and thereby place fishery management in a total-systems feedback-control context. Such a system would monitor the state of the structure of all stocks simultaneously in near real time, be adaptive to the evolving fishery, and consider the effects of the environment and economics. To do this, the system would need to (a) employ new in situ and remote sensors in innovative ways, (b) develop new data streams to support the development of new information, (c) employ modern modeling, information, and knowledge-base technology to process the diverse information, and (d) generate management advice and fishing strategies that would optimize the production of fish.

The Advanced Fisheries Management Information System (AFMIS), built through a collaboration of Harvard University and the Center for Marine Science and Technology at the University of Massachusetts at Dartmouth, is intended to apply state-of-the-art multidisciplinary and computational capabilities to operational fisheries management. The system development concept is aimed toward: 1) utilizing information on the “state” of ocean physics, biology, and chemistry; the assessment of spatially resolved fish-stock population dynamics; and the temporal-spatial deployment of fishing effort, to be used in fishing and in the operational management of fish stocks; and 2) forecasting and understanding physical and biological conditions leading to recruitment variability. System components are being developed in the context of using the Harvard Ocean Prediction System to support or otherwise interact with: 1) the synthesis and analysis of very large data sets; 2) the building of a multidisciplinary, multiscale model (coupled ocean physics/N-P-Z/fish dynamics/management models) appropriate for the northwest Atlantic shelf, particularly Georges Bank and Massachusetts Bay; 3) the application and development of data assimilation techniques; and 4) an emphasis on the incorporation of remotely sensed data into the data stream.

AFMIS is designed to model a large region of the northwest Atlantic (NWA), as the deep ocean influences the slope and shelves. Several smaller domains, including the Gulf of Maine (GOM) and Georges Bank (GB), are nested within this larger domain (Figure 1). This provides a capability to zoom into these domains with higher resolution while maintaining the essential physics, which are coupled to the larger domain. AFMIS will be maintained by the assimilation of a variety of real-time data. Specifically, this includes sea surface temperature (SST), color (SSC), and height (SSH) obtained from several space-based remote sensors (AVHRR, SeaWiFS, and TOPEX/Poseidon). The assimilation of this variety of real-time remotely sensed data, supported by in situ data, will allow nowcasting and forecasting over significant periods of time.

A real-time demonstration-of-concept (RTDOC) nowcasting and forecasting exercise took place in March-May 2000 to demonstrate important aspects of the AFMIS concept by producing real-time coupled forecasts of physical fields, biological and chemical fields, and fish abundance fields. The RTDOC was designed to verify the physics and to validate the biology and chemistry, but only to demonstrate the concept of forecasting the fish fields, since the fish dynamical models are at a very early stage of development. In addition, it demonstrated the integrated system concept and the implications for future coupling of a management model. This note reports on the RTDOC.