Multidisciplinary Simulation, Estimation, and Assimilation Systems (MSEAS)

Bayesian Learning of Coupled Biogeochemical-Physical Models

Gupta, A. and P.F.J. Lermusiaux, 2023. Bayesian Learning of Coupled Biogeochemical-Physical Models. Progress in Oceanography 216, 103050. doi:10.1016/j.pocean.2023.103050

Predictive dynamical models for marine ecosystems are used for a variety of needs. Due to the sparse measurements and limited understanding of the myriad of ocean processes, there is however significant uncertainty. There is model uncertainty in the parameter values, functional forms with diverse parameterizations, and level of complexity needed, and thus in the state variable fields. We develop a Bayesian model learning methodology that allows interpolation in the space of candidate dynamical models and discovery of new models from noisy, sparse, and indirect observations, all while estimating state variable fields and parameter values, as well as the joint probability distributions of all learned quantities. We address the challenges of high-dimensional and multidisciplinary dynamics governed by partial differential equations (PDEs) by using state augmentation and the computationally efficient Gaussian Mixture Model – Dynamically Orthogonal filter. Our innovations include stochastic formulation parameters and stochastic complexity parameters to unify candidate models into a single general model as well as stochastic expansion parameters within piecewise function approximations to generate dense candidate model spaces. These innovations allow handling many compatible and embedded candidate models, possibly none of which are accurate, and learning elusive unknown functional forms that augment these models. Our new Bayesian methodology is generalizable and interpretable. It seamlessly and rigorously discriminates among existing models, but also extrapolates out of the space of models to discover new ones. We perform a series of twin experiments based on flows past a ridge coupled with three-to-five component ecosystem models, including flows with chaotic advection. We quantify the learning skill, and evaluate convergence and the sensitivity to hyper-parameters. 
Our PDE framework successfully discriminates among functional forms and model complexities, and learns in the absence of prior knowledge by searching in dense function spaces. The probabilities of known, uncertain, and unknown model formulations, and of biogeochemical-physical fields and parameters, are updated jointly using Bayes’ law. Non-Gaussian statistics, ambiguity, and biases are captured. The parameter values and the model formulations that best explain the noisy, sparse, and indirect data are identified. When observations are sufficiently informative, model complexity and model functions are discovered.

Bayesian Learning of Stochastic Dynamical Models

Lu, P., and P.F.J. Lermusiaux, 2021. Bayesian Learning of Stochastic Dynamical Models. Physica D 427: 133003. doi:10.1016/j.physd.2021.133003

A new methodology for rigorous Bayesian learning of high-dimensional stochastic dynamical models is developed. The methodology performs parallelized computation of marginal likelihoods for multiple candidate models, integrating over all state variable and parameter values, and enabling a principled Bayesian update of model distributions. This is accomplished by leveraging the dynamically orthogonal (DO) evolution equations for uncertainty prediction in a dynamic stochastic subspace and the Gaussian Mixture Model-DO filter for inference of nonlinear state variables and parameters, using reduced-dimension state augmentation to accommodate models featuring uncertain parameters. Overall, the joint Bayesian inference of the state, model equations, geometry, boundary conditions, and initial conditions is performed. Results are exemplified using two high-dimensional, nonlinear simulated fluid and ocean systems. For the first, limited measurements of fluid flow downstream of an obstacle are used to perform joint inference of the obstacle’s shape, the Reynolds number, and the O(10⁵) fluid velocity state variables. For the second, limited measurements of the concentration of a microorganism advected by an uncertain flow are used to perform joint inference of the microorganism’s reaction equation and the O(10⁵) microorganism concentration and ocean velocity state variables. When the observations are sufficiently informative about the learning objectives, we find that our posterior model probabilities correctly identify either the true model or the most plausible models, even in cases where a human would be challenged to do the same.
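
The Bayesian update of model distributions described above can be illustrated with a minimal sketch, assuming the marginal likelihood of each candidate model has already been computed. The function name, model names, and numerical values are hypothetical and are not taken from the paper; only the use of Bayes' law over a finite set of candidate models follows the abstract.

```python
import math

def posterior_model_probs(prior, log_marginal_likelihood):
    """Apply Bayes' law over a finite set of candidate models.

    prior: dict mapping model name -> prior probability
    log_marginal_likelihood: dict mapping model name -> log p(data | model)
    """
    # Work in log space so that very small likelihoods do not underflow.
    log_post = {m: math.log(prior[m]) + log_marginal_likelihood[m] for m in prior}
    shift = max(log_post.values())
    unnormalized = {m: math.exp(v - shift) for m, v in log_post.items()}
    total = sum(unnormalized.values())
    return {m: u / total for m, u in unnormalized.items()}

# Hypothetical candidates with a uniform prior and made-up log marginal likelihoods.
prior = {"model_A": 1 / 3, "model_B": 1 / 3, "model_C": 1 / 3}
log_ml = {"model_A": -120.0, "model_B": -118.0, "model_C": -125.0}
posterior = posterior_model_probs(prior, log_ml)
best_model = max(posterior, key=posterior.get)
```

In this toy setting the model with the largest marginal likelihood receives most of the posterior mass. The expensive step in practice, integrating over all state variable and parameter values to obtain each marginal likelihood, is not represented here.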

METEOR: A Mobile (Portable) ocEan roboTic ObsErvatORy

Rajan, K., F. Aguado, P. Lermusiaux, J. Borges de Sousa, A. Subramaniam, and J. Tintore, 2021. METEOR: A Mobile (Portable) ocEan roboTic ObsErvatORy. Marine Technology Society Journal 55(3): 74-75. doi:10.4031/MTSJ.55.3.42

The oceans make this planet habitable and provide a variety of essential ecosystem services ranging from climate regulation through control of greenhouse gases to provisioning about 17% of protein consumed by humans. The oceans are changing as a consequence of human activity, but this system is severely undersampled. Traditional methods of studying the oceans (sailing in straight lines and extrapolating from a few point measurements) have not changed much in 200 years. Despite the tremendous advances in sampling technologies, we often use our autonomous assets the same way. We propose to use the advances in multiplatform, multidisciplinary, and integrated ocean observation, artificial intelligence, marine robotics, new high-resolution coastal ocean data assimilation techniques and computer models to observe and predict the oceans “intelligently”, by deploying self-propelled autonomous sensors and Smallsats guided by data assimilating models to provide observations to reduce model uncertainty in the coastal ocean. This system will be portable and capable of being deployed rapidly in any ocean.

Minimum-Correction Second-Moment Matching: Theory, Algorithms and Applications

Lin, J. and P.F.J. Lermusiaux, 2021. Minimum-Correction Second-Moment Matching: Theory, Algorithms and Applications. Numerische Mathematik 147(3): 611–650. doi:10.1007/s00211-021-01178-8

We address the problem of finding the closest matrix Ũ to a given U under the constraint that a prescribed second-moment matrix P̃ must be matched, i.e. Ũ^TŨ=P̃. We obtain a closed-form formula for the unique global optimizer Ũ for the full-rank case, that is related to U by an SPD (symmetric positive definite) linear transform. This result is generalized to rank-deficient cases as well as to infinite dimensions. We highlight the geometric intuition behind the theory and study the problem’s rich connections to minimum congruence transform, generalized polar decomposition, optimal transport, and rank-deficient data assimilation. In the special case of P̃=I, minimum-correction second-moment matching reduces to the well-studied optimal orthonormalization problem. We investigate the general strategies for numerically computing the optimizer and analyze existing polar decomposition and matrix square root algorithms. We modify and stabilize two Newton iterations previously deemed unstable for computing the matrix square root, such that they can now be used to efficiently compute both the orthogonal polar factor and the SPD square root. We then verify the higher performance of the various new algorithms using benchmark cases with randomly generated matrices. Lastly, we complete two applications for the stochastic Lorenz-96 dynamical system in a chaotic regime. In reduced subspace tracking using dynamically orthogonal equations, we maintain the numerical orthonormality and continuity of time-varying base vectors. In ensemble square root filtering for data assimilation, the prior samples are transformed into posterior ones by matching the covariance given by the Kalman update while also minimizing the corrections to the prior samples.
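
As a worked check in the full-rank case, one can verify directly that an SPD transform meeting the second-moment constraint exists. The symbols P, M, and A below are our own notation, introduced for this illustration under the assumption that U^T U and P̃ are both SPD; the paper's closed-form expression may be written differently.

```latex
\begin{align*}
P &= U^{T}U, \qquad M = P^{1/2}\,\tilde{P}\,P^{1/2}, \qquad
A = P^{-1/2}\,M^{1/2}\,P^{-1/2}, \qquad \tilde{U} = U A, \\
\tilde{U}^{T}\tilde{U} &= A\,P\,A
  = P^{-1/2} M^{1/2} \left(P^{-1/2} P\, P^{-1/2}\right) M^{1/2} P^{-1/2}
  = P^{-1/2} M\, P^{-1/2} = \tilde{P}.
\end{align*}
```

Setting P̃ = I gives M = P and A = P^{-1/2}, so that Ũ = U (U^T U)^{-1/2}, which is the orthonormalization special case mentioned above. This check only confirms that the constraint is met by an SPD transform; the argument that the result is the closest such matrix to U is given in the paper.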

Stochastic Oceanographic-Acoustic Prediction and Bayesian Inversion for Wide Area Ocean Floor Mapping

Ali, W.H., M.S. Bhabra, P.F.J. Lermusiaux, A. March, J.R. Edwards, K. Rimpau, and P. Ryu, 2019. Stochastic Oceanographic-Acoustic Prediction and Bayesian Inversion for Wide Area Ocean Floor Mapping. In: OCEANS '19 MTS/IEEE Seattle, 27-31 October 2019, doi:10.23919/OCEANS40490.2019.8962870

Covering the vast majority of our planet, the ocean is still largely unmapped and unexplored. Various imaging techniques researched and developed over the past decades, ranging from echo-sounders on ships to LIDAR systems in the air, have only systematically mapped a small fraction of the seafloor at medium resolution. This, in turn, has spurred recent ambitious efforts to map the remaining ocean at high resolution. New approaches are needed since existing systems are neither cost- nor time-effective. One such approach consists of a sparse aperture mapping technique using autonomous surface vehicles to allow for efficient imaging of wide areas of the ocean floor. Central to the operation of this approach is the need for robust, accurate, and efficient inference methods that effectively provide reliable estimates of the seafloor profile from the measured data. In this work, we apply such stochastic prediction and Bayesian inversion methods and demonstrate results on benchmark problems. We first outline efficient schemes for deterministic and stochastic acoustic modeling using the parabolic wave equation and the optimally-reduced Dynamically Orthogonal equations and showcase results on stochastic test cases. We then present our Bayesian inversion schemes and their results for rigorous nonlinear assimilation and joint bathymetry-ocean physics-acoustics inversion.

A Gaussian Mixture Model Smoother for Continuous Nonlinear Stochastic Dynamical Systems: Applications

Lolla, T. and P.F.J. Lermusiaux, 2017b. A Gaussian Mixture Model Smoother for Continuous Nonlinear Stochastic Dynamical Systems: Applications. Monthly Weather Review, 145, 2763-2790, doi:10.1175/MWR-D-16-0065.1.

The nonlinear Gaussian Mixture Model Dynamically Orthogonal (GMM–DO) smoother for high-dimensional stochastic fields is exemplified and contrasted with other smoothers by applications to three dynamical systems, all of which admit far-from-Gaussian distributions. The capabilities of the smoother are first illustrated using a double-well stochastic diffusion experiment. Comparisons with the original and improved versions of the ensemble Kalman smoother explain the detailed mechanics of GMM–DO smoothing and show that its accuracy arises from the joint GMM distributions across successive observation times. Next, the smoother is validated using the advection of a passive stochastic tracer by a reversible shear flow. This example admits an exact smoothed solution, whose derivation is also provided. Results show that the GMM–DO smoother accurately captures the full smoothed distributions and not just the mean states. The final example showcases the smoother in more complex nonlinear fluid dynamics caused by a barotropic jet flowing through a sudden expansion and leading to variable jets and eddies. The accuracy of the GMM–DO smoother is compared to that of the Error Subspace Statistical Estimation smoother. It is shown that even when the dynamics result in only slightly multimodal joint distributions, Gaussian smoothing can lead to a severe loss of information. The three examples show that the backward inferences of the GMM–DO smoother are skillful and efficient. Accurate evaluation of Bayesian smoothers for nonlinear high-dimensional dynamical systems is challenging in itself. The present three examples (stochastic low dimension, reversible high dimension, and irreversible high dimension) provide complementary and effective benchmarks for such evaluation.

A Gaussian Mixture Model Smoother for Continuous Nonlinear Stochastic Dynamical Systems: Theory and Scheme

Lolla, T. and P.F.J. Lermusiaux, 2017a. A Gaussian Mixture Model Smoother for Continuous Nonlinear Stochastic Dynamical Systems: Theory and Scheme. Monthly Weather Review, 145, 2743-2761, DOI:10.1175/MWR-D-16-0064.1

Retrospective inference through Bayesian smoothing is indispensable in geophysics, with crucial applications in ocean and numerical weather estimation, climate dynamics, and Earth system modeling. However, dealing with the high-dimensionality and nonlinearity of geophysical processes remains a major challenge in the development of Bayesian smoothers. Addressing this issue, a novel subspace smoothing methodology for high-dimensional stochastic fields governed by general nonlinear dynamics is obtained. Building on recent Bayesian filters and classic Kalman smoothers, the fundamental equations and forward–backward algorithms of new Gaussian Mixture Model (GMM) smoothers are derived, for both the full state space and dynamic subspace. For the latter, the stochastic Dynamically Orthogonal (DO) field equations and their time-evolving stochastic subspace are employed to predict the prior subspace probabilities. Bayesian inference, both forward and backward in time, is then analytically carried out in the dominant stochastic subspace, after fitting semiparametric GMMs to joint subspace realizations. The theoretical properties, varied forms, and computational costs of the new GMM smoother equations are presented and discussed.

Validation of Genetic Algorithm Based Optimal Sampling for Ocean Data Assimilation

Heaney, K. D., P. F. J. Lermusiaux, T. F. Duda and P. J. Haley Jr., 2016. Validation of Genetic Algorithm Based Optimal Sampling for Ocean Data Assimilation. Ocean Dynamics. 66: 1209-1229. doi:10.1007/s10236-016-0976-5.

Regional ocean models are capable of forecasting conditions for usefully long intervals of time (days) provided that initial and ongoing conditions can be measured. In resource-limited circumstances, the placement of sensors in optimal locations is essential. Here, a nonlinear optimization approach to determine optimal adaptive sampling that uses the Genetic Algorithm (GA) method is presented. The method determines sampling strategies that minimize a user-defined physics-based cost function. The method is evaluated using identical twin experiments, comparing hindcasts from an ensemble of simulations that assimilate data selected using the GA adaptive sampling and other methods. For skill metrics, we employ the reduction of the ensemble root-mean-square-error (RMSE) between the “true” data-assimilative ocean simulation and the different ensembles of data-assimilative hindcasts. A 5-glider optimal sampling study is set up for a 400 km x 400 km domain in the Middle Atlantic Bight region, along the New Jersey shelf-break. Results are compared for several ocean and atmospheric forcing conditions.

Data Assimilation with Gaussian Mixture Models using the Dynamically Orthogonal Field Equations. Part II: Applications

Sondergaard, T. and P.F.J. Lermusiaux, 2013b. Data Assimilation with Gaussian Mixture Models using the Dynamically Orthogonal Field Equations. Part II: Applications. Monthly Weather Review, 141, 6, 1761-1785, doi:10.1175/MWR-D-11-00296.1.

The properties and capabilities of the GMM-DO filter are assessed and exemplified by applications to two dynamical systems: (1) the Double Well Diffusion and (2) Sudden Expansion flows; both of which admit far-from-Gaussian statistics. The former test case, or twin experiment, validates the use of the EM algorithm and Bayesian Information Criterion with Gaussian Mixture Models in a filtering context; the latter further exemplifies its ability to efficiently handle state vectors of non-trivial dimensionality and dynamics with jets and eddies. For each test case, qualitative and quantitative comparisons are made with contemporary filters. The sensitivity to input parameters is illustrated and discussed. Properties of the filter are examined and its estimates are described, including: the equation-based and adaptive prediction of the probability densities; the evolution of the mean field, stochastic subspace modes and stochastic coefficients; the fitting of Gaussian Mixture Models; and, the efficient and analytical Bayesian updates at assimilation times and the corresponding data impacts. The advantages of respecting nonlinear dynamics and preserving non-Gaussian statistics are brought to light. For realistic test cases admitting complex distributions and with sparse or noisy measurements, the GMM-DO filter is shown to fundamentally improve the filtering skill, outperforming simpler schemes invoking the Gaussian parametric distribution.

Data Assimilation with Gaussian Mixture Models using the Dynamically Orthogonal Field Equations. Part I: Theory and Scheme

Sondergaard, T. and P.F.J. Lermusiaux, 2013a. Data Assimilation with Gaussian Mixture Models using the Dynamically Orthogonal Field Equations. Part I: Theory and Scheme. Monthly Weather Review, 141, 6, 1737-1760, doi:10.1175/MWR-D-11-00295.1.

This work introduces and derives an efficient, data-driven assimilation scheme, focused on a time-dependent stochastic subspace, that respects nonlinear dynamics and captures non-Gaussian statistics as they occur. The motivation is to obtain a filter that is applicable to realistic geophysical applications but that also rigorously utilizes the governing dynamical equations with information theory and learning theory for efficient Bayesian data assimilation. Building on the foundations of classical filters, the underlying theory and algorithmic implementation of the new filter are developed and derived. The stochastic Dynamically Orthogonal (DO) field equations and their adaptive stochastic subspace are employed to predict prior probabilities for the full dynamical state, effectively approximating the Fokker-Planck equation. At assimilation times, the DO realizations are fit to semiparametric Gaussian mixture models (GMMs) using the Expectation-Maximization algorithm and the Bayesian Information Criterion. Bayes’ Law is then efficiently carried out analytically within the evolving stochastic subspace. The resulting GMM-DO filter is illustrated in a very simple example. Variations of the GMM-DO filter are also provided along with comparisons with related schemes.
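
The analytical Bayes update of a GMM prior can be sketched for a scalar state observed directly with Gaussian noise. This is a simplified illustration under our own assumptions (one dimension, a mixture already fitted, hypothetical function names and values); the filter described above performs the corresponding update within the evolving stochastic subspace of a high-dimensional state.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_bayes_update(weights, means, variances, y, obs_var):
    """Posterior GMM for a scalar x given y = x + noise, noise ~ N(0, obs_var)."""
    new_w, new_m, new_v = [], [], []
    for w, m, v in zip(weights, means, variances):
        gain = v / (v + obs_var)              # Kalman gain of this component
        new_m.append(m + gain * (y - m))      # component mean moves toward the datum
        new_v.append((1.0 - gain) * v)        # component variance shrinks
        # Reweight by how well this component predicted the observation.
        new_w.append(w * gaussian_pdf(y, m, v + obs_var))
    total = sum(new_w)
    return [w / total for w in new_w], new_m, new_v

# Hypothetical bimodal prior, loosely reminiscent of a double-well state.
weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [0.1, 0.1]
post_w, post_m, post_v = gmm_bayes_update(weights, means, variances, y=0.8, obs_var=0.1)
```

Each component receives a Kalman-type update, while the mixture weights shift toward the component that best explains the observation. A single-Gaussian update has no such reweighting step, so it cannot preserve a multimodal prior in the same way.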

Dynamical criteria for the evolution of the stochastic dimensionality in flows with uncertainty

Sapsis, T.P. and P.F.J. Lermusiaux, 2012. Dynamical criteria for the evolution of the stochastic dimensionality in flows with uncertainty. Physica D, 241(1), 60-76, doi:10.1016/j.physd.2011.10.001.

We estimate and study the evolution of the dominant dimensionality of dynamical systems with uncertainty governed by stochastic partial differential equations, within the context of dynamically orthogonal (DO) field equations. Transient nonlinear dynamics, irregular data and non-stationary statistics are typical in a large range of applications such as oceanic and atmospheric flow estimation. To efficiently quantify uncertainties in such systems, it is essential to vary the dimensionality of the stochastic subspace with time. An objective here is to provide criteria to do so, working directly with the original equations of the dynamical system under study and its DO representation. We first analyze the scaling of the computational cost of these DO equations with the stochastic dimensionality and show that unlike many other stochastic methods the DO equations do not suffer from the curse of dimensionality. Subsequently, we present the new adaptive criteria for the variation of the stochastic dimensionality based on instantaneous i) stability arguments and ii) Bayesian data updates. We then illustrate the capabilities of the derived criteria to resolve the transient dynamics of two 2D stochastic fluid flows, specifically a double-gyre wind-driven circulation and a lid-driven cavity flow in a basin. In these two applications, we focus on the growth of uncertainty due to internal instabilities in deterministic flows. We consider a range of flow conditions described by varied Reynolds numbers and we study and compare the evolution of the uncertainty estimates under these varied conditions.

Many Task Computing for Real-Time Uncertainty Prediction and Data Assimilation in the Ocean

Evangelinos, C., P.F.J. Lermusiaux, J. Xu, P.J. Haley, and C.N. Hill, 2011. Many Task Computing for Real-Time Uncertainty Prediction and Data Assimilation in the Ocean. IEEE Transactions on Parallel and Distributed Systems, Special Issue on Many-Task Computing, I. Foster, I. Raicu and Y. Zhao (Guest Eds.), 22, doi: 10.1109/TPDS.2011.64.

Uncertainty prediction for ocean and climate predictions is essential for multiple applications today. Many-Task Computing can play a significant role in making such predictions feasible. In this manuscript, we focus on ocean uncertainty prediction using the Error Subspace Statistical Estimation (ESSE) approach. In ESSE, uncertainties are represented by an error subspace of variable size. To predict these uncertainties, we perturb an initial state based on the initial error subspace and integrate the corresponding ensemble of initial conditions forward in time, including stochastic forcing during each simulation. The dominant error covariance (generated via SVD of the ensemble) is used for data assimilation. The resulting ocean fields are used as inputs for predictions of underwater sound propagation. ESSE is a classic case of Many Task Computing: It uses dynamic heterogeneous workflows and ESSE ensembles are data intensive applications. We first study the execution characteristics of a distributed ESSE workflow on a medium size dedicated cluster, examine in more detail the I/O patterns exhibited and throughputs achieved by its components as well as the overall ensemble performance seen in practice. We then study the performance/usability challenges of employing Amazon EC2 and the Teragrid to augment our ESSE ensembles and provide better solutions faster.

Statistical Field Estimation for Complex Coastal Regions and Archipelagos

Agarwal, A. and P.F.J. Lermusiaux, 2011. Statistical Field Estimation for Complex Coastal Regions and Archipelagos. Ocean Modelling, 40(2), 164-189, doi: 10.1016/j.ocemod.2011.08.001.

A fundamental requirement in realistic ocean simulations and dynamical studies is the optimal estimation of gridded fields from the spatially irregular and multivariate data sets that are collected by varied platforms. In this work, we derive and utilize new schemes for the mapping and dynamical inference of ocean fields in complex multiply-connected domains and study the computational properties of these schemes. Specifically, we extend a Bayesian-based multiscale Objective Analysis (OA) approach to complex coastal regions and archipelagos. Such OAs commonly require an estimate of the distances between data and model points, without going across complex landforms. New OA schemes that estimate the length of shortest sea paths using the Level Set Method (LSM) and Fast Marching Method (FMM) are thus derived, implemented and utilized in idealized and realistic ocean cases. An FMM-based methodology for the estimation of total velocity under geostrophic balance in complex domains is also presented. Comparisons with other OA approaches are provided, including those using stochastically forced partial differential equations (SPDEs). We find that the FMM-based OA scheme is the most efficient and accurate. The FMM-based field maps do not require postprocessing (smoothing). Mathematical and computational properties of our new OA schemes are studied in detail, using fundamental theorems and illustrations. We find that higher-order FMM schemes improve accuracy and that a multi-order scheme is efficient. We also provide solutions that ensure the use of positive-definite covariances, even in complex multiply-connected domains.
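
The notion of a shortest sea path that does not cross land can be illustrated on a small grid. The sketch below deliberately swaps in Dijkstra's algorithm on a 4-connected grid in place of the Fast Marching Method, so it yields grid-step path lengths, not the continuous distances that FMM approximates. The mask format ('#' for land, '.' for sea) and the function name are hypothetical.

```python
import heapq
import math

def sea_path_length(mask, start, goal):
    """Length, in grid steps, of the shortest over-water path from start to goal.

    mask: list of equal-length strings, '#' = land, '.' = sea
    start, goal: (row, col) tuples; returns math.inf if no sea path exists.
    """
    rows, cols = len(mask), len(mask[0])
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return d
        if d > best[(r, c)]:
            continue  # stale queue entry; a shorter path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] != "#":
                nd = d + 1.0
                if nd < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return math.inf

# A peninsula separates two points that are only two cells apart in a straight line.
mask = ["..#..",
        "..#..",
        "....."]
around_land = sea_path_length(mask, (0, 1), (0, 3))   # 6.0 grid steps
```

An OA covariance based on the straight-line separation would treat these two points as close neighbors, whereas the sea-path length reflects that water must travel around the landform.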

Lagoon of Venice ecosystem: Seasonal dynamics and environmental guidance with uncertainty analyses and error subspace data assimilation

Cossarini, G., P.F.J. Lermusiaux, and C. Solidoro, 2009. Lagoon of Venice ecosystem: Seasonal dynamics and environmental guidance with uncertainty analyses and error subspace data assimilation, J. Geophys. Res., 114, C06026, doi:10.1029/2008JC005080.

An ensemble data assimilation scheme, Error Subspace Statistical Estimation (ESSE), is utilized to investigate the seasonal ecosystem dynamics of the Lagoon of Venice and provide guidance on the monitoring and management of the Lagoon, combining a rich data set with a physical-biogeochemical numerical estuary-coastal model. Novel stochastic ecosystem modeling components are developed to represent prior uncertainties in the Lagoon dynamics model, measurement model, and boundary forcing by rivers, open-sea inlets, and industrial discharges. The formulation and parameters of these additive and multiplicative stochastic error models are optimized based on data-model forecast misfits. The sensitivity to initial and boundary conditions is quantified and analyzed. Half-decay characteristic times are estimated for key ecosystem variables, and their spatial and temporal variability are studied. General results of our uncertainty analyses are that boundary forcing and internal mixing have a significant control on the Lagoon dynamics and that data assimilation is needed to reduce prior uncertainties. The error models are used in the ESSE scheme for ensemble uncertainty predictions and data assimilation, and an optimal ensemble dimension is estimated. Overall, higher prior uncertainties are predicted in the central and northern regions of the Lagoon. On the basis of the dominant singular vectors of the ESSE ensemble, the two major northern rivers are the biggest sources of dissolved inorganic nitrogen (DIN) uncertainty in the Lagoon. Other boundary sources such as the southern rivers and industrial discharges can dominate uncertainty modes in certain months. For dissolved inorganic phosphorus (DIP) and phytoplankton, dominant modes are also linked to external boundaries, but internal dynamics effects are more significant than those for DIN. Our posterior estimates of the seasonal biogeochemical fields and of their uncertainties in 2001 cover the whole Lagoon.
They provide the means to describe the ecosystem and guide local environmental policies. Specifically, our findings and results based on these fields include the temporal and spatial variability of nutrient and plankton gradients in the Lagoon; dynamical connections among ecosystem fields and their variability; strengths, gradients and mechanisms of the plankton blooms in late spring, summer, and fall; reductions of uncertainties by data assimilation and thus a quantification of data impacts and data needs; and, finally, an assessment of the water quality in the Lagoon in light of the local environmental legislation.

Forecasting and Reanalysis in the Monterey Bay/California Current Region for the Autonomous Ocean Sampling Network-II Experiment.

Haley, P.J. Jr., P.F.J. Lermusiaux, A.R. Robinson, W.G. Leslie, O. Logutov, G. Cossarini, X.S. Liang, P. Moreno, S.R. Ramp, J.D. Doyle, J. Bellingham, F. Chavez, S. Johnston, 2009. Forecasting and Reanalysis in the Monterey Bay/California Current Region for the Autonomous Ocean Sampling Network-II Experiment. Special issue on AOSN-II, Deep Sea Research, Part II. ISSN 0967-0645, doi: 10.1016/j.dsr2.2008.08.010.

During the August-September 2003 Autonomous Ocean Sampling Network-II experiment, the Harvard Ocean Prediction System (HOPS) and Error Subspace Statistical Estimation (ESSE) system were utilized in real-time to forecast physical fields and uncertainties, assimilate various ocean measurements (CTD, AUVs, gliders and SST data), provide suggestions for adaptive sampling, and guide dynamical investigations. The qualitative evaluations of the forecasts showed that many of the surface ocean features were predicted, but that their detailed positions and shapes were less accurate. The root-mean-square errors of the real-time forecasts showed that the forecasts had skill out to two days. Mean one-day forecast temperature RMS error was 0.26°C less than persistence RMS error. Mean two-day forecast temperature RMS error was 0.13°C less than persistence RMS error. Mean one- or two-day salinity RMS error was 0.036 PSU less than persistence RMS error. The real-time skill in the surface was found to be greater than the skill at depth. Pattern correlation coefficient comparisons showed, on average, greater skill than the RMS errors. For simulations lasting 10 or more days, uncertainties in the boundaries could lead to errors in the Monterey Bay region.

Following the real-time experiment, a reanalysis was performed in which improvements were made in the selection of model parameters and in the open-boundary conditions. The result of the reanalysis was improved long-term stability of the simulations and improved quantitative skill, especially the skill in the main thermocline (RMS simulation error 1°C less than persistence RMS error out to five days). This allowed for an improved description of the ocean features. During the experiment there were two-week to 10-day long upwelling events. Two types of upwelling events were observed: one with plumes extending westward at Point Año Nuevo (AN) and Point Sur (PS); the other with a thinner band of upwelled water parallel to the coast and across Monterey Bay. During strong upwelling events the flows in the upper 10-20 m had scales similar to atmospheric scales. During relaxation, kinetic energy becomes available and leads to the development of mesoscale features. At 100-300 m depths, broad northward flows were observed, sometimes with a coastal branch following topographic features. An anticyclone was often observed in the subsurface fields in the mouth of Monterey Bay.
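
The comparison of forecast RMS error with persistence RMS error used in this abstract can be sketched as follows. All values are hypothetical and are not data from the experiment; persistence is taken here, as an assumption, to mean that the field observed at forecast start is held fixed until verification time.

```python
import math

def rmse(estimate, truth):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))

# Hypothetical temperatures (deg C) at four verification points.
observed_at_start = [12.0, 13.0, 14.0, 15.0]   # also serves as the persistence forecast
model_forecast = [12.5, 13.4, 14.8, 15.9]      # model forecast valid at verification time
observed_at_verif = [12.6, 13.5, 15.0, 16.0]   # observations at verification time

persistence_error = rmse(observed_at_start, observed_at_verif)
forecast_error = rmse(model_forecast, observed_at_verif)

# A positive margin means the forecast beats persistence by that many degrees C.
skill_margin = persistence_error - forecast_error
```

Statements such as "RMS error was 0.26°C less than persistence RMS error" correspond to a positive margin of this kind, averaged over the forecasts being evaluated.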

At-sea Real-time Coupled Four-dimensional Oceanographic and Acoustic Forecasts during Battlespace Preparation 2007

Lam, F.P, P.J. Haley, Jr., J. Janmaat, P.F.J. Lermusiaux, W.G. Leslie, and M.W. Schouten, 2009. At-sea Real-time Coupled Four-dimensional Oceanographic and Acoustic Forecasts during Battlespace Preparation 2007. Special issue of the Journal of Marine Systems on "Coastal processes: challenges for monitoring and prediction", Drs. J.W. Book, Prof. M. Orlic and Michel Rixen (Guest Eds.), 78, S306-S320, doi: 10.1016/j.jmarsys.2009.01.029.

Systems capable of forecasting ocean properties and acoustic performance in the littoral ocean are becoming a useful capability for scientific and operational exercises. The coupling of a data-assimilative nested ocean modeling system with an acoustic propagation modeling system was carried out at sea for the first time, within the scope of Battlespace Preparation 2007 (BP07) that was part of Marine Rapid Environmental Assessment (MREA07) exercises. The littoral region for our studies was southeast of the island of Elba (Italy) in the Tyrrhenian basin east of Corsica and Sardinia. During BP07, several vessels collected in situ ocean data, based in part on recommendations from oceanographic forecasts. The data were assimilated into a four-dimensional high-resolution ocean modeling system. Sound-speed forecasts were then used as inputs for bearing- and range-dependent acoustic propagation forecasts. Data analyses are carried out and the set-up of the coupled oceanographic-acoustic system as well as the results of its real-time use are described. A significant finding is that oceanographic variability can considerably influence acoustic propagation properties, including the probability of detection, even in this apparently quiet region around Elba. This strengthens the importance of coupling at-sea acoustic modeling to real-time ocean forecasting. Other findings include the challenges involved in downscaling basin-scale modeling systems to high-resolution littoral models, especially in the Mediterranean Sea. Due to natural changes, global human activities and present model resolutions, the assimilation of synoptic regional ocean data is recommended in the region.

Acoustically Focused Adaptive Sampling and On-board Routing for Marine Rapid Environmental Assessment

Wang, D., P.F.J. Lermusiaux, P.J. Haley, D. Eickstedt, W.G. Leslie and H. Schmidt, 2009. Acoustically Focused Adaptive Sampling and On-board Routing for Marine Rapid Environmental Assessment. Special issue of Journal of Marine Systems on "Coastal processes: challenges for monitoring and prediction", Drs. J.W. Book, Prof. M. Orlic and Michel Rixen (Guest Eds), 78, S393-S407, doi: 10.1016/j.jmarsys.2009.01.037.

Variabilities in the coastal ocean environment span a wide range of spatial and temporal scales. From an acoustic viewpoint, the limited oceanographic measurements and today’s ocean computational capabilities are not always able to provide oceanic-acoustic predictions at high resolution and with sufficient accuracy. Adaptive Rapid Environmental Assessment (AREA) is an adaptive sampling concept being developed in connection with the emergence of Autonomous Ocean Sampling Networks and interdisciplinary ensemble predictions and adaptive sampling via Error Subspace Statistical Estimation (ESSE). By adaptively and optimally deploying in situ sampling resources and assimilating these data into coupled nested ocean and acoustic models, AREA can dramatically improve the estimation of ocean fields that matter for acoustic predictions. These concepts are outlined and a methodology is developed and illustrated based on the Focused Acoustic Forecasting-05 (FAF05) exercise in the northern Tyrrhenian Sea. The methodology first couples the data-assimilative environmental and acoustic propagation ensemble modeling. An adaptive sampling plan is then predicted, using the uncertainty of the acoustic predictions as input to an optimization scheme which finds the parameter values of autonomous sampling behaviors that optimally reduce this forecast of the acoustic uncertainty. To compute this reduction, the expected statistics of unknown data to be sampled by different candidate sampling behaviors are assimilated. The predicted-optimal parameter values are then fed to the sampling vehicles. A second adaptation of these parameters is ultimately carried out in the water by the sampling vehicles using onboard routing, in response to the real ocean data that they acquire. The autonomy architecture and algorithms used to implement this methodology are also described.
Results from a number of real-time AREA simulations using data collected during the Focused Acoustic Forecasting (FAF05) exercise are presented and discussed for the case of a single Autonomous Underwater Vehicle (AUV). For FAF05, the main AREA-ESSE application was the optimal tracking of the ocean thermocline based on ocean-acoustic ensemble prediction, adaptive sampling plans for vertical Yo-Yo behaviors and subsequent onboard Yo-Yo routing.

Many Task Computing for Multidisciplinary Ocean Sciences: Real-Time Uncertainty Prediction and Data Assimilation

Evangelinos, C., P.F.J. Lermusiaux, J. Xu, P.J. Haley, and C.N. Hill, 2009. Many Task Computing for Multidisciplinary Ocean Sciences: Real-Time Uncertainty Prediction and Data Assimilation. Conference on High Performance Networking and Computing, Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers (Portland, OR, 16 November 2009), 10pp. doi.acm.org/10.1145/1646468.1646482.

Error Subspace Statistical Estimation (ESSE), an uncertainty prediction and data assimilation methodology employed for real-time ocean forecasts, is based on a characterization and prediction of the largest uncertainties. This is carried out by evolving an error subspace of variable size. We use an ensemble of stochastic model simulations, initialized based on an estimate of the dominant initial uncertainties, to predict the error subspace of the model fields. The dominant error covariance (generated via an SVD of the ensemble-generated error covariance matrix) is used for data assimilation. The resulting ocean fields are provided as the input to acoustic modeling, allowing for the prediction and study of the spatiotemporal variations in acoustic propagation and their uncertainties. The ESSE procedure is a classic case of Many Task Computing: These codes are managed based on dynamic workflows for the: (i) perturbation of the initial mean state, (ii) subsequent ensemble of stochastic PE model runs, (iii) continuous generation of the covariance matrix, (iv) successive computations of the SVD of the ensemble spread until a convergence criterion is satisfied, and (v) data assimilation. Its ensemble nature makes it a many task data intensive application and its dynamic workflow gives it heterogeneity. Subsequent acoustics propagation modeling involves a very large ensemble of short-in-duration acoustics runs.
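Workflow steps (i) to (iv) above can be illustrated with a small sketch. It is not the ESSE code: the toy stochastic model `step`, the state size, the batch size, and the 0.97 threshold are assumptions chosen for illustration, and `similarity` is a simplified, symmetric stand-in for the actual ESSE convergence criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

def step(x, rng):
    """Hypothetical stochastic model: damped linear dynamics plus small noise."""
    decay = np.linspace(0.99, 0.5, x.size)  # slowly decaying directions dominate
    return decay * x + 0.01 * rng.standard_normal(x.size)

def run_member(mean0, init_modes, init_sigmas, n_steps, rng):
    """(i) Perturb the initial mean state, then (ii) run one stochastic forecast."""
    x = mean0 + init_modes @ (init_sigmas * rng.standard_normal(init_sigmas.size))
    for _ in range(n_steps):
        x = step(x, rng)
    return x

def error_subspace(members, rank):
    """(iii)-(iv) SVD of the ensemble spread: leading error modes and variances."""
    spread = members - members.mean(axis=0)  # shape (n_members, n_state)
    # Right singular vectors of the spread are eigenvectors of the sample covariance.
    _, s, vt = np.linalg.svd(spread, full_matrices=False)
    variances = s**2 / (members.shape[0] - 1)
    return vt[:rank].T, variances[:rank]

def similarity(modes_a, var_a, modes_b, var_b):
    """Value in [0, 1]; equals 1 when the two variance-weighted subspaces coincide."""
    a = modes_a * np.sqrt(var_a)
    b = modes_b * np.sqrt(var_b)
    overlap = np.linalg.svd(a.T @ b, compute_uv=False).sum()
    return overlap / np.sqrt(var_a.sum() * var_b.sum())

n_state, rank, batch = 40, 5, 20
mean0 = np.zeros(n_state)
init_modes = np.eye(n_state)[:, :10]  # assumed dominant initial error directions
init_sigmas = np.ones(10)

def new_batch():
    return np.array([run_member(mean0, init_modes, init_sigmas, 10, rng)
                     for _ in range(batch)])

members = new_batch()
modes, var = error_subspace(members, rank)
rho = 0.0
# Keep adding batches of members until the subspace stops changing (or a size cap).
while rho < 0.97 and members.shape[0] < 400:
    members = np.vstack([members, new_batch()])
    new_modes, new_var = error_subspace(members, rank)
    rho = similarity(modes, var, new_modes, new_var)
    modes, var = new_modes, new_var
```

Each call to `run_member` is independent of the others, which is the property that makes such an ensemble a natural fit for many-task execution.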

A multigrid methodology for assimilation of measurements into regional tidal models

Logutov, O.G., 2008. A multigrid methodology for assimilation of measurements into regional tidal models. Ocean Dynamics, 58, 441-460, doi:10.1007/s10236-008-0163-4.

This paper presents a rigorous, yet practical, method of multigrid data assimilation into regional structured-grid tidal models. The new inverse tidal nesting scheme, with nesting across multiple grids, is designed to provide a fit of the tidal dynamics to data in areas with highly complex bathymetry and coastline geometry. In these areas, computational constraints make it impractical to fully resolve local topographic and coastal features around all of the observation sites in a stand-alone computation. The proposed strategy consists of increasing the model resolution in multiple limited area domains around the observation locations where a representativeness error is detected in order to improve the representation of the measurements with respect to the dynamics. Multiple high-resolution nested domains are set up and data assimilation is carried out using these embedded nested computations. Every nested domain is coupled to the outer domain through the open boundary conditions (OBCs). Data inversion is carried out in a control space of the outer domain model. A level of generality is retained throughout the presentation with respect to the choice of the control space; however, a specific example of using the outer domain OBCs as the control space is provided, with other sensible choices discussed. In the forward scheme, the computations in the nested domains do not affect the solution in the outer domain. The subsequent inverse computations utilize the observation-minus-model residuals of the forward computations across these multiple nested domains in order to obtain the optimal values of parameters in the control space of the outer domain model. The inversion is carried out by propagating the uncertainty from the control space to model tidal fields at observation locations in the outer and in the nested domains using efficient low-rank error covariance representations. 
Subsequently, an analysis increment in the control space of the outer domain model is computed and the multigrid system is steered optimally towards observations while preserving a perfect dynamical balance. The method is illustrated using a real-world application in the context of the Philippines Strait Dynamics experiment.

Path Planning of Autonomous Underwater Vehicles for Adaptive Sampling Using Mixed Integer Linear Programming

Yilmaz, N.K., C. Evangelinos, P.F.J. Lermusiaux and N. Patrikalakis, 2008. Path Planning of Autonomous Underwater Vehicles for Adaptive Sampling Using Mixed Integer Linear Programming. IEEE Transactions, Journal of Oceanic Engineering, 33 (4), 522-537. doi: 10.1109/JOE.2008.2002105.

The goal of adaptive sampling in the ocean is to predict the types and locations of additional ocean measurements that would be most useful to collect. Quantitatively, what is most useful is defined by an objective function and the goal is then to optimize this objective under the constraints of the available observing network. Examples of objectives are better oceanic understanding, to improve forecast quality, or to sample regions of high interest. This work provides a new path-planning scheme for the adaptive sampling problem. We define the path-planning problem in terms of an optimization framework and propose a method based on mixed integer linear programming (MILP). The mathematical goal is to find the vehicle path that maximizes the line integral of the uncertainty of field estimates along this path. Sampling this path can improve the accuracy of the field estimates the most. While achieving this objective, several constraints must be satisfied and are implemented. They relate to vehicle motion, intervehicle coordination, communication, collision avoidance, etc. The MILP formulation is quite powerful to handle different problem constraints and flexible enough to allow easy extensions of the problem. The formulation covers single- and multiple-vehicle cases as well as single- and multiple-day formulations. The need for a multiple-day formulation arises when the ocean sampling mission is optimized for several days ahead. We first introduce the details of the formulation, then elaborate on the objective function and constraints, and finally, present a varied set of examples to illustrate the applicability of the proposed method.
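The objective described above, finding the path that collects the most uncertainty, can be illustrated on a tiny grid. The sketch below is not the MILP formulation of the paper: it swaps in exhaustive depth-first search, which is only feasible for very small problems. The uncertainty values, the start cell, the number of moves, and the no-revisit rule are hypothetical.

```python
from typing import List, Tuple

Cell = Tuple[int, int]

def best_path(field: List[List[float]], start: Cell, n_moves: int) -> Tuple[float, List[Cell]]:
    """Search all paths of n_moves 4-neighbour steps that never revisit a cell.

    The score is the sum of uncertainty over visited cells, a discrete
    stand-in for the line integral of uncertainty along the path.
    """
    rows, cols = len(field), len(field[0])
    best: Tuple[float, List[Cell]] = (float("-inf"), [])

    def extend(path: List[Cell], total: float) -> None:
        nonlocal best
        if len(path) == n_moves + 1:
            if total > best[0]:
                best = (total, list(path))
            return
        r, c = path[-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in path:
                path.append((nr, nc))
                extend(path, total + field[nr][nc])
                path.pop()

    extend([start], field[start[0]][start[1]])
    return best

# Hypothetical forecast uncertainty on a 3 x 4 grid.
uncertainty = [
    [0.1, 0.2, 0.1, 0.0],
    [0.3, 0.9, 0.8, 0.1],
    [0.2, 0.4, 0.7, 0.6],
]
score, path = best_path(uncertainty, start=(0, 0), n_moves=4)
```

The number of candidate paths grows exponentially with the number of moves, which is one reason a formulation that a solver can handle, with explicit constraints, is attractive for realistic missions.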

Inverse Barotropic Tidal Estimation for Regional Ocean Applications

Logutov, O.G. and Lermusiaux, P.F.J., 2008. Inverse Barotropic Tidal Estimation for Regional Ocean Applications. Ocean Modelling, 25, 17-34. doi: 10.1016/j.ocemod.2008.06.004.

Correct representation of tidal processes in regional ocean models is contingent on the accurate specification of open boundary conditions. This paper describes a new inverse scheme for the assimilation of observational data into a depth-integrated spectral shallow water tidal model and the numerical implementation of this scheme into a stand-alone computational system for regional tidal prediction. A novel aspect is a specific implementation of the inverse which does not require an adjoint model. An optimization is carried out in the open boundary condition space rather than in the observational space or model state space. Our approach reflects the specifics of regional tidal modeling applications in which open boundary conditions (OBCs) typically constitute a significant source of uncertainty. Regional tidal models rely predominantly on global tidal estimates for open boundary conditions. As the resolution of global tidal models is insufficient to fully resolve regional topographic and coastal features, the a priori OBC estimates potentially contain an error. It is, therefore, desirable to correct these OBCs by finding an inverse OBC estimate that is fitted to the regional observations, in accord with the regional dynamics and respective error estimates. The data assimilation strategy presented in this paper provides a consistent and practical estimation scheme for littoral ocean science and applications where tidal effects are significant. Illustrations of our methodological and computational results are presented in the area of Dabob Bay and Hood Canal, WA, which is a region connected to the open Pacific ocean through a series of inland waterways and complex shorelines and bathymetry.

Adaptive Modeling, Adaptive Data Assimilation and Adaptive Sampling

Lermusiaux, P.F.J., 2007. Adaptive Modeling, Adaptive Data Assimilation and Adaptive Sampling. Refereed invited manuscript. Special issue on "Mathematical Issues and Challenges in Data Assimilation for Geophysical Systems: Interdisciplinary Perspectives". C.K.R.T. Jones and K. Ide, Eds. Physica D, Vol 230, 172-196, doi: 10.1016/j.physd.2007.02.014.

For efficient progress, model properties and measurement needs can adapt to oceanic events and interactions as they occur. The combination of models and data via data assimilation can also be adaptive. These adaptive concepts are discussed and exemplified within the context of comprehensive real-time ocean observing and prediction systems. Novel adaptive modeling approaches based on simplified maximum likelihood principles are developed and applied to physical and physical-biogeochemical dynamics. In the regional examples shown, they allow the joint calibration of parameter values and model structures. Adaptable components of the Error Subspace Statistical Estimation (ESSE) system are reviewed and illustrated. Results indicate that error estimates, ensemble sizes, error subspace ranks, covariance tapering parameters and stochastic error models can be calibrated by such quantitative adaptation. New adaptive sampling approaches and schemes are outlined. Illustrations suggest that these adaptive schemes can be used in real time with the potential for most efficient sampling.

Environmental Prediction, Path Planning and Adaptive Sampling: Sensing and Modeling for Efficient Ocean Monitoring, Management and Pollution Control

Lermusiaux, P.F.J., P.J. Haley Jr. and N.K. Yilmaz, 2007. Environmental Prediction, Path Planning and Adaptive Sampling: Sensing and Modeling for Efficient Ocean Monitoring, Management and Pollution Control. Sea Technology, 48(9), 35-38.

Adaptive Acoustical-Environmental Assessment for the Focused Acoustic Field-05 At-sea Exercise

Wang, D., P.F.J. Lermusiaux, P.J. Haley, W.G. Leslie and H. Schmidt, 2006. Adaptive Acoustical-Environmental Assessment for the Focused Acoustic Field-05 At-sea Exercise, Oceans 2006, 6pp, Boston, MA, 18-21 Sept. 2006, doi: 10.1109/OCEANS.2006.306904.

Progress and Prospects of U.S. Data Assimilation in Ocean Research

Lermusiaux, P.F.J., P. Malanotte-Rizzoli, D. Stammer, J. Carton, J. Cummings and A.M. Moore, 2006. "Progress and Prospects of U.S. Data Assimilation in Ocean Research". Oceanography, Special issue on "Advances in Computational Oceanography", T. Paluszkiewicz and S. Harper, Eds., 19, 1, 172-183.

This report summarizes goals, activities, and recommendations of a workshop on data assimilation held in Williamsburg, Virginia on September 9-11, 2003, and sponsored by the U.S. Office of Naval Research (ONR) and National Science Foundation (NSF). The overall goal of the workshop was to synthesize research directions for ocean data assimilation (DA) and outline efforts required during the next 10 years and beyond to evolve DA into an integral and sustained component of global, regional, and coastal ocean science and observing and prediction systems. The workshop built on the success of recent and existing DA activities such as those sponsored by the National Oceanographic Partnership Program (NOPP) and NSF-Information Technology Research (NSF-ITR). DA is a quantitative approach to optimally combine models and observations. The combination is usually consistent with model and data uncertainties, which need to be represented. Ocean DA can extract maximum knowledge from the sparse and expensive measurements of the highly variable ocean dynamics. The ultimate goal is to better understand and predict these dynamics on multiple spatial and temporal scales, including interactions with other components of the climate system. There are many applications that involve DA or build on its results, including: coastal, regional, seasonal, and inter-annual ocean and climate dynamics; carbon and biogeochemical cycles; ecosystem dynamics; ocean engineering; observing-system design; coastal management; fisheries; pollution control; naval operations; and defense and security. These applications have different requirements that lead to variations in the DA schemes utilized. For literature on DA, we refer to Ghil and Malanotte-Rizzoli (1991), the National Research Council (1991), Bennett (1992), Malanotte-Rizzoli (1996), Wunsch (1996), Robinson et al. (1998), Robinson and Lermusiaux (2002), and Kalnay (2003). We also refer to the U.S. Global Ocean Data Assimilation Experiment (GODAE) workshop on Global Ocean Data Assimilation: Prospects and Strategies (Rienecker et al., 2001); U.S. National Oceanic and Atmospheric Administration-Office of Global Programs (NOAA-OGP) workshop on Coupled Data Assimilation (Rienecker, 2003); and, NOAA-NASA-NSF workshop on Ongoing Analysis of the Climate System (Arkin et al., 2003).

Uncertainty Estimation and Prediction for Interdisciplinary Ocean Dynamics

Lermusiaux, P.F.J., 2006. Uncertainty Estimation and Prediction for Interdisciplinary Ocean Dynamics. Refereed manuscript, Special issue on "Uncertainty Quantification". J. Glimm and G. Karniadakis, Eds. Journal of Computational Physics, 217, 176-199. doi: 10.1016/j.jcp.2006.02.010.

Scientific computations for the quantification, estimation and prediction of uncertainties for ocean dynamics are developed and exemplified. Primary characteristics of ocean data, models and uncertainties are reviewed and quantitative data assimilation concepts defined. Challenges involved in realistic data-driven simulations of uncertainties for four-dimensional interdisciplinary ocean processes are emphasized. Equations governing uncertainties in the Bayesian probabilistic sense are summarized. Stochastic forcing formulations are introduced and a new stochastic-deterministic ocean model is presented. The computational methodology and numerical system, Error Subspace Statistical Estimation, that is used for the efficient estimation and prediction of oceanic uncertainties based on these equations is then outlined. Capabilities of the ESSE system are illustrated in three data-assimilative applications: estimation of uncertainties for physical-biogeochemical fields, transfers of ocean physics uncertainties to acoustics, and real-time stochastic ensemble predictions with assimilation of a wide range of data types. Relationships with other modern uncertainty quantification schemes and promising research directions are discussed.

Adaptive Coupled Physical and Biogeochemical Ocean Predictions: A Conceptual Basis

Lermusiaux, P.F.J., C. Evangelinos, R. Tian, P.J. Haley, J.J. McCarthy, N.M. Patrikalakis, A.R. Robinson and H. Schmidt, 2004. Adaptive Coupled Physical and Biogeochemical Ocean Predictions: A Conceptual Basis. Refereed invited manuscript, F. Darema (Ed.), Lecture Notes in Computer Science, 3038, 685-692.

Physical and biogeochemical ocean dynamics can be intermittent and highly variable, and involve interactions on multiple scales. In general, the oceanic fields, processes and interactions that matter thus vary in time and space. For efficient forecasting, the structures and parameters of models must evolve and respond dynamically to new data injected into the executing prediction system. The conceptual basis of this adaptive modeling and corresponding computational scheme is the subject of this presentation. Specifically, we discuss the process of adaptive modeling for coupled physical and biogeochemical ocean models. The adaptivity is introduced within an interdisciplinary prediction system. Model-data misfits and data assimilation schemes are used to provide feedback from measurements to applications and modify the runtime behavior of the prediction system. Illustrative examples in Massachusetts Bay and Monterey Bay are presented to highlight ongoing progress.

Prediction Systems with Data Assimilation for Coupled Ocean Science and Ocean Acoustics

Robinson, A.R. and P.F.J. Lermusiaux, 2004. Prediction Systems with Data Assimilation for Coupled Ocean Science and Ocean Acoustics, Proceedings of the Sixth International Conference on Theoretical and Computational Acoustics (A. Tolstoy, et al., editors), World Scientific Publishing, 325-342. Refereed invited Keynote Manuscript.

Ocean science and ocean acoustics today are engaged in coupled interdisciplinary research on both fundamental dynamics and applications. In this context, interdisciplinary data assimilation, which melds observations and fundamental dynamical models for field and parameter estimation, is emerging as a novel and powerful methodology, but computational demands present challenging constraints which need to be overcome. These ideas are developed within the concept of an interdisciplinary system for assessing sonar system performance. An end-to-end system, which couples meteorology-physical oceanography-geoacoustics-ocean acoustics-bottom-noise-target-sonar data and models, is used to estimate uncertainties and their transfers and feedbacks. The approach to interdisciplinary data assimilation for this system importantly involves a full, interdisciplinary state vector and error covariance matrix. An idealized end-to-end system example is presented based upon the Shelfbreak PRIMER experiment in the Middle Atlantic Bight. Uncertainties in the physics are transferred to the acoustics and to a passive sonar using fully coupled physical and acoustical data assimilation.

Coupled physical and biogeochemical data driven simulations of Massachusetts Bay in late summer: real-time and post-cruise data assimilation

Besiktepe, S.T., P.F.J. Lermusiaux and A.R. Robinson, 2003. Coupled physical and biogeochemical data driven simulations of Massachusetts Bay in late summer: real-time and post-cruise data assimilation. Special issue on "The use of data assimilation in coupled hydrodynamic, ecological and bio-geo-chemical models of the oceans", M. Gregoire, P. Brasseur and P.F.J. Lermusiaux (Eds.), Journal of Marine Systems, 40, 171-212.

Data-driven forecasts and simulations for Massachusetts Bay based on in situ observations collected during August – September 1998 and on coupled four-dimensional (4-D) physical and biogeochemical models are carried out, evaluated, and studied. The real-time forecasting and adaptive sampling took place from August 17 to October 5, 1998. Simultaneous synoptic physical and biogeochemical data sets were obtained over a range of scales. For the real-time forecasts, the physical model was initialized using hydrographic data from August 1998 and the new biogeochemical model using historical data. The models were forced with real-time meteorological fields and the physical data were assimilated. The resulting interdisciplinary forecasts were robust and the Bay-scale biogeochemical variability was qualitatively well represented. For the postcruise simulations, the August – September 1998 biogeochemical data are utilized. Extensive comparisons of the coupled model fields with data allowed significant improvements of the biogeochemical model. All physical and biogeochemical data are assimilated using an optimal interpolation scheme. Within this scheme, an approximate biogeochemical balance and dynamical adjustments are utilized to derive the non-observed ecosystem variables from the observed ones. Several processes occurring in the lower trophic levels of Massachusetts Bay during the summer – autumn period over different spatial and temporal scales are described. The coupled dynamics is found to be more vigorous and diverse than previously thought to be the case in this period. For the biogeochemical dynamics, multiscale patchiness occurs. The locations of the patches are mainly defined by physical processes, but their strengths are mainly controlled by biogeochemical processes. The fluxes of nutrients into the euphotic zone are episodic and induced in part by atmospheric forcing. 
The quasi-weekly passage of storms gradually deepened the mixed layer and often altered the Bay-scale circulation and induced internal submesoscale variability. The physical variability increased the transfer of biogeochemical materials between the surface and deeper layers and modulated the biological processes.

Data driven simulations of synoptic circulation and transports in the Tunisia-Sardinia-Sicily region

Onken, R., A.R. Robinson, P.F.J. Lermusiaux, P.J. Haley Jr. and L.A. Anderson, 2003. Data driven simulations of synoptic circulation and transports in the Tunisia-Sardinia-Sicily region. Journal of Geophysical Research, 108, (C9), 8123-8136.

Data from a hydrographic survey of the Tunisia-Sardinia-Sicily region are assimilated into a primitive equations ocean model. The model simulation is then averaged in time over the short duration of the data survey. The corresponding results, consistent with data and dynamics, are providing new insight into the circulation of Modified Atlantic Water (MAW) and Levantine Intermediate Water (LIW) in this region of the western Mediterranean. For MAW these insights include a southward jet off the east coast of Sardinia, anticyclonic recirculation cells on the Algerian and Tunisian shelves, and a secondary flow splitting in the Strait of Sicily. For the LIW regime a detailed view of the circulation in the Strait of Sicily is given, indicating that LIW proceeds from the strait to the Tyrrhenian Sea. No evidence is found for a direct current path to the Sardinia Channel. Complex circulation patterns are validated by two-way nesting of critical regions. Volume transports are computed for the Strait of Sicily, the Sardinia Channel, and the passage between Sardinia and Sicily.

The use of data assimilation in coupled hydrodynamic, ecological and bio-geo-chemical models of the ocean

Gregoire, M., P. Brasseur and P.F.J. Lermusiaux (Guest Eds.), 2003. The use of data assimilation in coupled hydrodynamic, ecological and bio-geo-chemical models of the ocean. Journal of Marine Systems, 40, 1-3, doi: 10.1016/S0924-7963(03)00027-7.

The International Liège Colloquium on Ocean Dynamics is organized annually. The topic differs from year to year in an attempt to address, as much as possible, recent problems and incentive new subjects in oceanography. Assembling a group of active and eminent scientists from various countries and often different disciplines, the Colloquia provide a forum for discussion and foster a mutually beneficial exchange of information opening on to a survey of recent discoveries, essential mechanisms, impelling question marks and valuable recommendations for future research. The objective of the 2001 Colloquium was to evaluate the progress of data assimilation methods in marine science and, in particular, in coupled hydrodynamic, ecological and bio-geo-chemical models of the ocean. The past decades have seen important advances in the understanding and modelling of key processes of the ocean circulation and bio-geo-chemical cycles. The increasing capabilities of data and models, and their combination, are allowing the study of multidisciplinary interactions that occur dynamically, in multiple ways, on multiscales and with feedbacks. The capacity of dynamical models to simulate interdisciplinary ocean processes over specific space-time windows and thus forecast their evolution over predictable time scales is also conditioned upon the availability of relevant observations to: initialise and continually update the physical and bio-geo-chemical sectors of the ocean state; provide relevant atmospheric and boundary forcing; calibrate the parameterizations of sub-grid scale processes, growth rates and reaction rates; construct interdisciplinary and multiscale correlation and feature models; identify and estimate the main sources of errors in the models; control or correct for mis-represented or neglected processes. The access to multivariate data sets requires the implementation, exploitation and management of dedicated ocean observing and prediction systems.
However, the available data are often limited and, for instance, seldom in a form to be directly compatible or directly inserted into the numerical models. To relate the data to the ocean state on all scales and regions that matter, evolving three-dimensional and multivariate (measurement) models are becoming important. Equally significant is the reduction of observational requirements by design of sampling strategies via Observation System Simulation Experiments and adaptive sampling. Data assimilation is a quantitative approach to extract adequate information content from the data and to improve the consistency between data sets and model estimates. It is also a methodology to dynamically interpolate between data scattered in space and time, allowing comprehensive interpretation of multivariate observations. In general, the goals of data assimilation are to: control the growth of predictability errors; correct dynamical deficiencies; estimate model parameters, including the forcings, initial and boundary conditions; characterise key processes by analysis of four-dimensional fields and their statistics (balances of terms, etc.); carry out advanced sensitivity studies and Observation System Simulation Experiments, and conduct efficient operations, management and monitoring. The theoretical framework of data assimilation for marine sciences is now relatively well established, rooted in control theory, estimation theory or inverse techniques, from variational to sequential approaches.
Ongoing research efforts of special importance for interdisciplinary applications include the: stochastic representation of processes and determination of model and data errors; treatment of (open) boundary conditions and strong nonlinearities; space-time, multivariate extrapolation of limited and noisy data and determination of measurement models; demonstration that bio-geo-chemical models are valid enough and of adequate structures for their deficiencies to be controlled by data assimilation; and finally, ability to provide accurate estimates of fields, parameters, variabilities and errors, with large and complex dynamical models and data sets. Operationally, major engineering and computational challenges for the coming years include the: development of theoretically sound methods into useful, practical and reliable techniques at affordable costs; implementation of scalable, seamless and automated systems linking observing systems, numerical models and assimilation schemes; adequate mix of integrated and distributed (Web-based) networks; construction of user-friendly architectures and establishment of standards for the description of data and software (metadata) for efficient communication, dissemination and management. In addition to addressing the above items, the 33rd Liège Colloquium has offered the opportunity to: – review the status and current progress of data assimilation methodologies utilised in the physical, acoustical, optical and bio-geo-chemical scientific communities; – demonstrate the potentials of data assimilation systems developed for coupled physical/ecosystem models, from scientific to management inquiries; – examine the impact of data assimilation and inverse modelling in improving model parameterisations; – discuss the observability and controllability properties of, and identify the missing gaps in, current observing and prediction systems; and – exchange the results of and the learnings from preoperational marine exercises.
The presentations given during the Colloquium led to discussions on a series of topics organized within the following sections: (1) Interdisciplinary research progress and issues: data, models, data assimilation criteria. (2) Observations for interdisciplinary data assimilation. (3) Advanced fields estimation for interdisciplinary systems. (4) Estimation of interdisciplinary parameters and model structures. (5) Assimilation methodologies for physical and interdisciplinary systems. (6) Toward operational interdisciplinary oceanography and data assimilation. A subset of these presentations is reported in the present Special Issue. As was pointed out during the Colloquium, coupled biological-physical data assimilation is in its infancy and much can be accomplished now by the immediate application of existing methods. Data assimilation intimately links dynamical models and observations, and it can play a critical role in the important area of fundamental biological oceanographic dynamical model development and validation over a hierarchy of complexities. Since coupled assimilation for coupled processes is challenging and can be complicated, care must be exercised in understanding, modeling and controlling errors and in performing sensitivity analyses to establish the robustness of results. Compatible interdisciplinary data sets are essential and data assimilation should iteratively define data impact and data requirements. Based on the results presented during the Colloquium, data assimilation is expected to enable future marine technologies and naval operations otherwise impossible or not feasible. Interdisciplinary predictability research, multiscale in both space and time, is required.
State and parameter estimation via data assimilation is central to the successful establishment of advanced interdisciplinary ocean observing and prediction systems which, functioning in real time, will contribute to novel and efficient capabilities to manage, and to operate in, our oceans. The Scientific Committee and the participants in the 33rd Liège Colloquium wish to express their gratitude to the Ministère de l’Enseignement Supérieur et de la Recherche Scientifique de la Communauté Française de Belgique, the Fonds National de la Recherche Scientifique de Belgique (F.N.R.S., Belgium), the Ministère de l’Emploi et de la Formation du Gouvernement Wallon, the University of Liège, the Commission of the European Union, the Scientific Committee on Oceanographic Research (SCOR), the International Oceanographic Commission of the UNESCO, the US Office of Naval Research, the National Science Foundation (NSF, USA) and the International Association for the Physical Sciences of the Ocean (IAPSO) for their most valuable support.

Four-dimensional data assimilation for coupled physical-acoustical fields

Lermusiaux, P.F.J. and C.-S. Chiu, 2002. Four-dimensional data assimilation for coupled physical-acoustical fields. In "Acoustic Variability, 2002". N.G. Pace and F.B. Jensen (Eds.), SACLANTCEN. Kluwer Academic Press, 417-424.

The estimation of oceanic environmental and acoustical fields is considered as a single coupled data assimilation problem. The four-dimensional data assimilation methodology employed is Error Subspace Statistical Estimation. Environmental fields and their dominant uncertainties are predicted by an ocean dynamical model and transferred to acoustical fields and uncertainties by an acoustic propagation model. The resulting coupled dominant uncertainties define the error subspace. The available physical and acoustical data are then assimilated into the predicted fields in accord with the error subspace and all data uncertainties. The criterion for data assimilation is presently to correct the predicted fields such that the total error variance in the error subspace is minimized. The approach is exemplified for the New England continental shelfbreak region, using data collected during the 1996 Shelfbreak Primer Experiment. The methodology is discussed, computational issues are outlined and the assimilation of model-simulated acoustical data is carried out. Results are encouraging and provide some insights into the dominant variability and uncertainty properties of acoustical fields.

Advanced interdisciplinary data assimilation: Filtering and smoothing via error subspace statistical estimation.

Lermusiaux, P.F.J., A.R. Robinson, P.J. Haley and W.G. Leslie, 2002. Advanced interdisciplinary data assimilation: Filtering and smoothing via error subspace statistical estimation. Proceedings of "The OCEANS 2002 MTS/IEEE" conference, Holland Publications, 795-802.

The efficient interdisciplinary 4D data assimilation with nonlinear models via Error Subspace Statistical Estimation (ESSE) is reviewed and exemplified. ESSE is based on evolving an error subspace, of variable size, that spans and tracks the scales and processes where the dominant errors occur. A specific focus here is the use of ESSE in interdisciplinary smoothing which allows the correction of past estimates based on future data, dynamics and model errors. ESSE is useful for a wide range of purposes which are illustrated by three investigations: (i) smoothing estimation of physical ocean fields in the Eastern Mediterranean, (ii) coupled physical-acoustical data assimilation in the Middle Atlantic Bight shelfbreak, and (iii) coupled physical-biological smoothing and dynamics in Massachusetts Bay.

Data assimilation for modeling and predicting coupled physical-biological interactions in the sea

Robinson, A.R. and P.F.J. Lermusiaux, 2002. Data assimilation for modeling and predicting coupled physical-biological interactions in the sea. In "The Sea, Vol. 12: Biological-Physical Interactions in the Ocean", Robinson A.R., J.R. McCarthy and B.J. Rothschild (Eds.). 475-536.

Data assimilation is a modern methodology of relating natural data and dynamical models. The general dynamics of a model is combined or melded with a set of observations. All dynamical models are to some extent approximate, and all data sets are finite and to some extent limited by error bounds. The purpose of data assimilation is to provide estimates of nature which are better estimates than can be obtained by using only the observational data or the dynamical model. There are a number of specific approaches to data assimilation which are suitable for estimation of the state of nature, including natural parameters, and for evaluation of the dynamical approximations. Progress is accelerating in understanding the dynamics of real ocean biological-physical interactive processes. Although most biophysical processes in the sea await discovery, new techniques and novel interdisciplinary studies are evolving ocean science to a new level of realism. Generally, understanding proceeds from a quantitative description of four-dimensional structures and events, through the identification of specific dynamics, to the formulation of simple generalizations. The emergence of realistic interdisciplinary four-dimensional data assimilative ocean models and systems is contributing significantly and increasingly to this progress.

On the mapping of multivariate geophysical fields: sensitivity to size, scales and dynamics

Lermusiaux, P.F.J., 2002. On the mapping of multivariate geophysical fields: sensitivity to size, scales and dynamics. Journal of Atmospheric and Oceanic Technology, 19, 1602-1637.

The effects of a priori parameters on the error subspace estimation and mapping methodology introduced by P. F. J. Lermusiaux et al. are investigated. The approach is three-dimensional, multivariate, and multiscale. The sensitivities of the subspace and a posteriori fields to the size of the subspace, scales considered, and nonlinearities in the dynamical adjustments are studied. Applications focus on the mesoscale to subbasin-scale physics in the northwestern Levantine Sea during 10 February-15 March and 19 March-16 April 1995. Forecasts generated from various analyzed fields are compared to in situ and satellite data. The sensitivities to size show that the truncation to a subspace is efficient. The use of criteria to determine adequate sizes is emphasized and a back-of-the-envelope rule is outlined. The sensitivities to scales confirm that, for a given region, smaller scales usually require larger subspaces because of spectral redness. However, synoptic conditions are also shown to strongly influence the ordering of scales. The sensitivities to the dynamical adjustment reveal that nonlinearities can modify the variability decomposition, especially the dominant eigenvectors, and that changes are largest for the features and regions with high shears. Based on the estimated variability variance fields, eigenvalue spectra, multivariate eigenvectors and (cross)-covariance functions, dominant dynamical balances and the spatial distribution of hydrographic and velocity characteristic scales are obtained for primary regional features. In particular, the Ierapetra Eddy is found to be close to gradient-wind balance and coastal-trapped waves are anticipated to occur along the northern escarpment of the basin.

Transfer of uncertainties through physical-acoustical-sonar end-to-end systems: A conceptual basis

Robinson, A.R., P. Abbot, P.F.J. Lermusiaux and L. Dillman, 2002. Transfer of uncertainties through physical-acoustical-sonar end-to-end systems: A conceptual basis. In "Acoustic Variability, 2002". N.G. Pace and F.B. Jensen (Eds.), SACLANTCEN. Kluwer Academic Press, 603-610.

An interdisciplinary team of scientists is collaborating to enhance the understanding of the uncertainty in the ocean environment, including the sea bottom, and characterize its impact on tactical system performance. To accomplish these goals quantitatively an end-to-end system approach is necessary. The conceptual basis of this approach and the framework of the end-to-end system, including its components, is the subject of this presentation. Specifically, we present a generic approach to characterize variabilities and uncertainties arising from regional scales and processes, construct uncertainty models for a generic sonar system, and transfer uncertainties from the acoustic environment to the sonar and its signal processing. Illustrative examples are presented to highlight recent progress toward the development of the methodology and components of the system.

Data Assimilation in Models

Robinson, A.R. and P.F.J. Lermusiaux, 2001. Data Assimilation in Models. Encyclopedia of Ocean Sciences, Academic Press Ltd., London, 623-634.

Data assimilation is a novel, versatile methodology for estimating oceanic variables. The estimation of a quantity of interest via data assimilation involves the combination of observational data with the underlying dynamical principles governing the system under observation. The melding of data and dynamics is a powerful methodology which makes possible efficient, accurate, and realistic estimations otherwise not feasible. It is providing rapid advances in important aspects of both basic ocean science and applied marine technology and operations. The following sections introduce concepts, describe purposes, present applications to regional dynamics and forecasting, overview formalism and methods, and provide a selected range of examples.

Evolving the subspace of the three-dimensional multiscale ocean variability: Massachusetts Bay

Lermusiaux, P.F.J., 2001. Evolving the subspace of the three-dimensional multiscale ocean variability: Massachusetts Bay. Journal of Marine Systems, Special issue on "Three-dimensional ocean circulation: Lagrangian measurements and diagnostic analyses", 29/1-4, 385-422, doi: 10.1016/S0924-7963(01)00025-2.

A data and dynamics driven approach to estimate, decompose, organize and analyze the evolving three-dimensional variability of ocean fields is outlined. Variability refers here to the statistics of the differences between ocean states and a reference state. In general, these statistics evolve in time and space. For a first endeavor, the variability subspace defined by the dominant eigendecomposition of a normalized form of the variability covariance is evolved. A multiscale methodology for its initialization and forecast is outlined. It combines data and primitive equation dynamics within a Monte-Carlo approach. The methodology is applied to part of a multidisciplinary experiment that occurred in Massachusetts Bay in late summer and early fall of 1998. For a 4-day time period, the three-dimensional and multivariate properties of the variability standard deviations and dominant eigenvectors are studied. Two variability patterns are discussed in detail. One relates to a displacement of the Gulf of Maine coastal current offshore from Cape Ann, with the creation of adjacent mesoscale recirculation cells. The other relates to a Bay-wide coastal upwelling mode from Barnstable Harbor to Gloucester in response to strong southerly winds. Snapshots and tendencies of physical fields and trajectories of simulated Lagrangian drifters are employed to diagnose and illustrate the use of the dominant variability covariance. The variability subspace is shown to guide the dynamical analysis of the physical fields. For the stratified conditions, it is found that strong wind events can alter the structures of the buoyancy flow and that circulation features are more variable than previously described, on multiple scales. In several locations, the factors estimated to be important include some or all of the atmospheric and surface pressure forcings, and associated Ekman transports and downwelling/upwelling processes, the Coriolis force, the pressure force, inertia and mixing.

On the mapping of multivariate geophysical fields: error and variability subspace estimates

Lermusiaux, P.F.J., D.G.M. Anderson and C.J. Lozano, 2000. On the mapping of multivariate geophysical fields: error and variability subspace estimates. The Quarterly Journal of the Royal Meteorological Society, April B, 1387-1430.

A basis is outlined for the first-guess spatial mapping of three-dimensional multivariate and multiscale geophysical fields and their dominant errors. The a priori error statistics are characterized by covariance matrices and the mapping obtained by solving a minimum-error-variance estimation problem. The size of the problem is reduced efficiently by focusing on the error subspace, here the dominant eigendecomposition of the a priori error covariance. The first estimate of this a priori error subspace is constructed in two parts. For the “observed” portions of the subspace, the covariance of the a priori missing variability is directly specified and eigendecomposed. For the “non-observed” portions, an ensemble of adjustment dynamical integrations is utilized, building the nonobserved covariances in statistical accord with the observed ones. This error subspace construction is exemplified and studied in a Middle Atlantic Bight simulation and in the eastern Mediterranean. Its use allows an accurate, global, multiscale and multivariate, three-dimensional analysis of primitive-equation fields and their errors, in real time. The a posteriori error covariance is computed and indicates complex data-variability influences. The error and variability subspaces obtained can also confirm or reveal the features of dominant variability, such as the Ierapetra Eddy in the Levantine basin.
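
The minimum-error-variance mapping in an error subspace described above can be illustrated with a short NumPy sketch. It is not the authors' implementation: it assumes a linear observation operator, a prior error covariance approximated as `modes @ diag(eigvals) @ modes.T`, and the standard minimum-variance (Kalman-type) update restricted to that subspace. The function name `subspace_analysis` and all argument names are hypothetical.

```python
import numpy as np

def subspace_analysis(x_prior, modes, eigvals, obs, obs_op, obs_cov):
    """Minimum-error-variance update with the prior covariance truncated
    to its dominant eigendecomposition (illustrative sketch only).

    x_prior : (n,) first-guess field
    modes   : (n, p) orthonormal dominant error eigenvectors
    eigvals : (p,) corresponding error variances
    obs     : (m,) observations
    obs_op  : (m, n) linear observation operator (assumed linear here)
    obs_cov : (m, m) observation error covariance
    """
    he = obs_op @ modes                       # error modes seen by the data, (m, p)
    lam = np.diag(eigvals)
    innov_cov = he @ lam @ he.T + obs_cov     # covariance of the data residuals, (m, m)
    # Gain expressed in subspace coordinates: lam @ he.T @ inv(innov_cov)
    gain_sub = np.linalg.solve(innov_cov, he @ lam).T      # (p, m)
    # Correct the first guess only along the error subspace.
    x_post = x_prior + modes @ (gain_sub @ (obs - obs_op @ x_prior))
    # A posteriori error covariance in subspace coordinates, then re-diagonalized.
    post_sub = lam - gain_sub @ he @ lam
    post_sub = 0.5 * (post_sub + post_sub.T)  # remove round-off asymmetry
    w, v = np.linalg.eigh(post_sub)
    order = np.argsort(w)[::-1]               # largest posterior variance first
    return x_post, modes @ v[:, order], w[order]
```

Because every matrix that is inverted or decomposed has the size of the data vector or of the subspace, the cost is governed by those sizes rather than by the full state dimension, which is the efficiency argument made in the abstract.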

Estimation and study of mesoscale variability in the Strait of Sicily

Lermusiaux, P.F.J., 1999b. Estimation and study of mesoscale variability in the Strait of Sicily. Dynamics of Atmospheres and Oceans, 29, 255-303.

Considering mesoscale variability in the Strait of Sicily during September 1996, the four-dimensional physical fields and their dominant variability and error covariances are estimated and studied. The methodology applied in real-time combines an intensive data survey and primitive equation dynamics based on the error subspace statistical estimation approach. A sequence of filtering and prediction problems is solved for a period of 10 days, with adaptive learning of the dominant errors. Intercomparisons with optimal interpolation fields, clear sea surface temperature images and available in situ data are utilized for qualitative and quantitative evaluations. The present estimation system is shown to be a comprehensive nonlinear and adaptive assimilation scheme, capable of providing real-time forecasts of ocean fields and associated dominant variability and error covariances. The initialization and evolution of the error subspace are explained. The dominant error eigenvectors, variance and covariance fields are illustrated and their multivariate, multiscale properties described. Five coupled features associated with the dominant variability in the Strait during August-September 1996 emerge from the dominant decomposition of the initial PE variability covariance matrix: the Adventure Bank Vortex, Maltese Channel Crest, Ionian Shelf Break Vortex, Strait of Messina Vortex, and subbasin-scale temperature and salinity fronts of the Ionian slope. From the evolution of the estimated fields and dominant predictability error covariance decompositions, several of the primitive equation processes associated with the variations of these features are revealed, decomposed and studied. In general, the estimation of the evolving dominant decompositions of the multivariate predictability error and variability covariances appears promising for ocean sciences and technology.
The practical feedbacks of the present approach, which include the determination of data optimals and the refinements of dynamical and measurement models, are considered.

Data assimilation via Error Subspace Statistical Estimation. Part II: Middle Atlantic Bight shelfbreak front simulations and ESSE validation

Lermusiaux, P.F.J., 1999a. Data assimilation via Error Subspace Statistical Estimation. Part II: Middle Atlantic Bight shelfbreak front simulations and ESSE validation. Monthly Weather Review, 127(7), 1408-1432, doi: 10.1175/1520-0493(1999)127<1408:DAVESS>2.0.CO;2.

Identical twin experiments are utilized to assess and exemplify the capabilities of error subspace statistical estimation (ESSE). The experiments consist of nonlinear, primitive equation-based, idealized Middle Atlantic Bight shelfbreak front simulations. Qualitative and quantitative comparisons with an optimal interpolation (OI) scheme are made. Essential components of ESSE are illustrated. The evolution of the error subspace, in agreement with the initial conditions, dynamics, and data properties, is analyzed. The three-dimensional multivariate minimum variance melding in the error subspace is compared to the OI melding. Several advantages and properties of ESSE are discussed and evaluated. The continuous singular value decomposition of the nonlinearly evolving variations of variability and the possibilities of ESSE for dominant process analysis are illustrated and emphasized.

Data assimilation via Error Subspace Statistical Estimation. Part I: Theory and schemes

Lermusiaux, P.F.J. and A.R. Robinson, 1999. Data assimilation via Error Subspace Statistical Estimation. Part I: Theory and schemes. Monthly Weather Review, 127(7), 1385-1407, doi: 10.1175/1520-0493(1999)127<1385:DAVESS>2.0.CO;2.

A rational approach is used to identify efficient schemes for data assimilation in nonlinear ocean-atmosphere models. The conditional mean, a minimum of several cost functionals, is chosen for an optimal estimate. After stating the present goals and describing some of the existing schemes, the constraints and issues particular to ocean-atmosphere data assimilation are emphasized. An approximation to the optimal criterion satisfying the goals and addressing the issues is obtained using heuristic characteristics of geophysical measurements and models. This leads to the notion of an evolving error subspace, of variable size, that spans and tracks the scales and processes where the dominant errors occur. The concept of error subspace statistical estimation (ESSE) is defined. In the present minimum error variance approach, the suboptimal criterion is based on a continued and energetically optimal reduction of the dimension of error covariance matrices. The evolving error subspace is characterized by error singular vectors and values, or in other words, the error principal components and coefficients. Schemes for filtering and smoothing via ESSE are derived. The data-forecast melding minimizes variance in the error subspace. Nonlinear Monte Carlo forecasts integrate the error subspace in time. The smoothing is based on a statistical approximation approach. Comparisons with existing filtering and smoothing procedures are made. The theoretical and practical advantages of ESSE are discussed. The concepts introduced by the subspace approach are as useful as the practical benefits. The formalism forms a theoretical basis for the intercomparison of reduced dimension assimilation methods and for the validation of specific assumptions for tailored applications. 
The subspace approach is useful for a wide range of purposes, including nonlinear field and error forecasting, predictability and stability studies, objective analyses, data-driven simulations, model improvements, adaptive sampling, and parameter estimation.
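
The abstract's idea of nonlinear Monte Carlo forecasts that carry an error subspace of variable size forward in time can be sketched roughly as follows. This is an illustration under stated assumptions rather than the published ESSE scheme: the perturbations are drawn as Gaussian, the forecast differences are taken about a central (unperturbed) forecast and normalized by the ensemble size, and the subspace size is set by a simple explained-variance threshold. The names `forecast_error_subspace`, `model` and `var_fraction` are hypothetical.

```python
import numpy as np

def forecast_error_subspace(x_mean, modes, eigvals, model, n_members,
                            var_fraction=0.99, seed=0):
    """Propagate an error subspace with an ensemble of nonlinear forecasts.

    x_mean  : (n,) current estimate
    modes   : (n, p) orthonormal error singular vectors
    eigvals : (p,) error variances along those vectors
    model   : callable mapping a state (n,) to its forecast (n,)
    """
    rng = np.random.default_rng(seed)
    # Draw perturbations whose covariance is modes @ diag(eigvals) @ modes.T.
    coeffs = rng.standard_normal((len(eigvals), n_members)) * np.sqrt(eigvals)[:, None]
    members = x_mean[:, None] + modes @ coeffs
    # Integrate the central state and every perturbed state with the nonlinear model.
    central = model(x_mean)
    forecasts = np.column_stack([model(members[:, j]) for j in range(n_members)])
    # Scaled differences from the central forecast; their SVD gives the
    # singular vectors and variances of the forecast error covariance.
    diffs = (forecasts - central[:, None]) / np.sqrt(n_members)
    u, s, _ = np.linalg.svd(diffs, full_matrices=False)
    var = s ** 2
    # Variable size: keep the fewest modes explaining var_fraction of the variance.
    cumulative = np.cumsum(var) / var.sum()
    rank = min(int(np.searchsorted(cumulative, var_fraction)) + 1, len(var))
    return central, u[:, :rank], var[:rank]
```

For example, `forecast_error_subspace(x0, E, lam, step, n_members=50)` with a user-supplied `step` function returns the central forecast together with the forecast error modes and their variances; the number of returned modes can grow or shrink from one forecast to the next, depending on how the model dynamics spread the variance.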

Data Assimilation

Robinson, A.R., P.F.J. Lermusiaux and N.Q. Sloan, III, 1998. Data Assimilation. In "The Sea: The Global Coastal Ocean I", Processes and Methods (K.H. Brink and A.R. Robinson, Editors), Volume 10, John Wiley and Sons, New York, NY, 541-594