
Many Task Computing for Real-Time Uncertainty Prediction and Data Assimilation in the Ocean

Evangelinos, C., P.F.J. Lermusiaux, J. Xu, P.J. Haley, and C.N. Hill, 2011. Many Task Computing for Real-Time Uncertainty Prediction and Data Assimilation in the Ocean. IEEE Transactions on Parallel and Distributed Systems, Special Issue on Many-Task Computing, I. Foster, I. Raicu and Y. Zhao (Guest Eds.), 22, doi: 10.1109/TPDS.2011.64.

Uncertainty prediction for ocean and climate forecasts is essential for multiple applications today. Many-Task Computing can play a significant role in making such predictions feasible. In this manuscript, we focus on ocean uncertainty prediction using the Error Subspace Statistical Estimation (ESSE) approach. In ESSE, uncertainties are represented by an error subspace of variable size. To predict these uncertainties, we perturb an initial state based on the initial error subspace and integrate the corresponding ensemble of initial conditions forward in time, including stochastic forcing during each simulation. The dominant error covariance (generated via an SVD of the ensemble) is used for data assimilation. The resulting ocean fields are used as inputs for predictions of underwater sound propagation. ESSE is a classic case of Many-Task Computing: it uses dynamic heterogeneous workflows, and ESSE ensembles are data-intensive applications. We first study the execution characteristics of a distributed ESSE workflow on a medium-sized dedicated cluster, examining in more detail the I/O patterns exhibited and the throughputs achieved by its components, as well as the overall ensemble performance seen in practice. We then study the performance and usability challenges of employing Amazon EC2 and the TeraGrid to augment our ESSE ensembles and provide better solutions faster.

Automated Sensor Networks to Advance Ocean Science

Schofield, O., S. Glenn, J. Orcutt, M. Arrott, M. Meisinger, A. Gangopadhyay, W. Brown, R. Signell, M. Moline, Y. Chao, S. Chien, D. Thompson, A. Balasuriya, P.F.J. Lermusiaux and M. Oliver, 2010. Automated Sensor Networks to Advance Ocean Science. EOS, Vol. 91, No. 39, 28 September 2010.

Oceanography is evolving from a ship-based expeditionary science to a distributed, observatory-based approach in which scientists continuously interact with instruments in the field. These new capabilities will facilitate the collection of long-term time series while also providing an interactive capability to conduct experiments using data streaming in real time. The U.S. National Science Foundation has funded the Ocean Observatories Initiative (OOI), which over the next 5 years will deploy infrastructure to expand scientists’ ability to remotely study the ocean. The OOI is deploying infrastructure that spans global, regional, and coastal scales. A global component will address planetary-scale problems using a new network of moored buoys linked to shore via satellite telecommunications. A regional cabled observatory will “wire” a single region in the northeastern Pacific Ocean with a high-speed optical and power grid. The coastal component will expand existing coastal observing assets to study the importance of high-frequency forcing on the coastal environment. These components will be linked by a robust cyberinfrastructure (CI) that will integrate marine observatories into a coherent system of systems. This CI infrastructure will also provide a Web-based social network enabled by real-time visualization and access to numerical model information, to provide the foundation for adaptive sampling science. Thus, oceanographers will have access to automated machine-to-machine sensor networks that can be scaled up in size and incorporate new technology for decades to come. A case study of this CI in action shows how a community of ocean scientists and engineers located throughout the United States at 12 different institutions used the automated ocean observatory to address daily adaptive science priorities in real time.

Many Task Computing for Multidisciplinary Ocean Sciences: Real-Time Uncertainty Prediction and Data Assimilation

Evangelinos, C., P.F.J. Lermusiaux, J. Xu, P.J. Haley, and C.N. Hill, 2009. Many Task Computing for Multidisciplinary Ocean Sciences: Real-Time Uncertainty Prediction and Data Assimilation. Conference on High Performance Networking and Computing, Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers (Portland, OR, 16 November 2009), 10pp. doi.acm.org/10.1145/1646468.1646482.

Error Subspace Statistical Estimation (ESSE), an uncertainty prediction and data assimilation methodology employed for real-time ocean forecasts, is based on a characterization and prediction of the largest uncertainties. This is carried out by evolving an error subspace of variable size. We use an ensemble of stochastic model simulations, initialized based on an estimate of the dominant initial uncertainties, to predict the error subspace of the model fields. The dominant error covariance (generated via an SVD of the ensemble-generated error covariance matrix) is used for data assimilation. The resulting ocean fields are provided as the input to acoustic modeling, allowing for the prediction and study of the spatiotemporal variations in acoustic propagation and their uncertainties. The ESSE procedure is a classic case of Many-Task Computing: the codes are managed via dynamic workflows for (i) the perturbation of the initial mean state, (ii) the subsequent ensemble of stochastic PE model runs, (iii) the continuous generation of the covariance matrix, (iv) the successive computations of the SVD of the ensemble spread until a convergence criterion is satisfied, and (v) data assimilation. Its ensemble nature makes it a many-task, data-intensive application, and its dynamic workflow gives it heterogeneity. Subsequent acoustic propagation modeling involves a very large ensemble of short-duration acoustics runs.
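The ensemble-to-subspace step in (ii)-(iv) above can be sketched in simplified form. This is a hedged illustration only, not the actual ESSE code: the function name, dimensions, and toy data are all hypothetical, and the real procedure grows the ensemble until a convergence criterion on the subspace is met.

```python
import numpy as np

def dominant_error_subspace(ensemble, mean, rank):
    """Estimate the dominant error subspace from an ensemble of model states.

    ensemble: (n_state, n_members) matrix, one model state per column
    mean:     (n_state,) ensemble mean (or central forecast)
    rank:     number of dominant error modes to retain
    """
    # Matrix of ensemble deviations from the mean
    A = ensemble - mean[:, None]
    # Thin SVD of the spread; left singular vectors span the error subspace
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    # Eigenvalues of the sample error covariance A A^T / (m - 1)
    m = ensemble.shape[1]
    variances = s**2 / (m - 1)
    return U[:, :rank], variances[:rank]

# Toy example: 50-dimensional state, 20-member ensemble built from
# 3 dominant error directions plus small noise
rng = np.random.default_rng(0)
modes = rng.standard_normal((50, 3))
members = modes @ rng.standard_normal((3, 20)) \
          + 0.01 * rng.standard_normal((50, 20))
mean = members.mean(axis=1)
E, var = dominant_error_subspace(members, mean, rank=3)
print(E.shape, var.shape)  # (50, 3) (3,)
```

In practice, the SVD is recomputed as new ensemble members complete, and the ensemble size is increased until the retained subspace stops changing significantly, which is what makes the workflow dynamic.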

Towards Dynamic Data Driven Systems for Rapid Adaptive Interdisciplinary Ocean Forecasting

Patrikalakis, N.M., P.F.J. Lermusiaux, C. Evangelinos, J.J. McCarthy, A.R. Robinson, H. Schmidt, P.J. Haley, S. Lalis, R. Tian, W.G. Leslie, and W. Cho, 2009. Towards Dynamic Data Driven Systems for Rapid Adaptive Interdisciplinary Ocean Forecasting. Invited paper in "Dynamic Data-Driven Application Systems", F. Darema, Editor. Springer, 2009. In press.

The state of the ocean evolves and its dynamics involve transitions occurring at multiple scales. For efficient and rapid interdisciplinary forecasting, ocean observing and prediction systems must have the same behavior and adapt to the ever-changing dynamics. The work discussed here aims to set the basis of a distributed system for real-time interdisciplinary ocean field and uncertainty forecasting with adaptive modeling and adaptive sampling. The scientific goal is to couple physical and biological oceanography with ocean acoustics. The technical goal is to build a dynamic system based on advanced infrastructures, distributed/grid computing, and efficient information retrieval and visualization interfaces. Importantly, the system combines a suite of modern legacy physical models, acoustic models and ocean current monitoring data assimilation schemes with new adaptive modeling and adaptive sampling methods. The legacy systems are encapsulated at the binary level using software component methodologies. Measurement models are utilized to link the observed data to the dynamical model variables and structures. With adaptive sampling, the data acquisition is dynamic and aims to minimize the predicted uncertainties, maximize the sampling of key dynamics, and maintain overall coverage. With adaptive modeling, model improvements are dynamic and aim to select the best model structures and parameters among different physical or biogeochemical parameterizations. The dynamic coupling of models and measurements discussed here represents a Dynamic Data-Driven Application System (DDDAS). Technical and scientific progress is highlighted based on examples in Massachusetts Bay, Monterey Bay, and the California Current System. Keywords: Oceanography, interdisciplinary, adaptive, sampling

Web-Enabled Configuration and Control of Legacy Codes: An Application to Ocean Modeling

Evangelinos, C., P.F.J. Lermusiaux, S. Geiger, R.C. Chang, and N.M. Patrikalakis, 2006. Web-Enabled Configuration and Control of Legacy Codes: An Application to Ocean Modeling. Ocean Modelling, 13, 197-220.

For modern interdisciplinary ocean prediction and assimilation systems, a significant part of the complexity facing users is the very large number of possible setups and parameters, both at build-time and at run-time, especially for the core physical, biological and acoustical ocean predictive models. The configuration of these modeling systems for both local as well as remote execution can be a daunting and error-prone task in the absence of a graphical user interface (GUI) and of software that automatically controls the adequacy and compatibility of options and parameters. We propose to encapsulate the configurability and requirements of ocean prediction codes using an eXtensible Markup Language (XML) based description, thereby creating new computer-readable manuals for the executable binaries. These manuals allow us to generate a GUI, check for correctness of compilation and input parameters, and finally drive execution of the prediction system components, all in an automated and transparent manner. This web-enabled configuration and automated control software has been developed (it is currently in “beta” form) and exemplified for components of the interdisciplinary Harvard ocean prediction system (HOPS) and for the uncertainty prediction components of the error subspace statistical estimation (ESSE) system. Importantly, the approach is general and applies to other existing ocean modeling applications and to other “legacy” codes.

Rapid real-time interdisciplinary ocean forecasting using adaptive sampling and adaptive modeling and legacy codes: Component encapsulation using XML

Evangelinos C., R. Chang, P.F.J. Lermusiaux and N.M. Patrikalakis, 2003. Rapid real-time interdisciplinary ocean forecasting using adaptive sampling and adaptive modeling and legacy codes: Component encapsulation using XML. Lecture Notes in Computer Science, 2660, 375-384.

We present the high-level architecture of a real-time interdisciplinary ocean forecasting system that employs adaptive elements in both modeling and sampling. We also discuss an important issue that arises in creating an integrated, web-accessible framework for such a system out of existing stand-alone components: transparent support for handling legacy binaries. Such binaries, which are common in scientific applications, expect a standard input stream, possibly some command-line options, and a set of input files, and generate a set of output files as well as standard output and error streams. Legacy applications of this form are encapsulated using XML. We present a method that uses XML documents to describe the parameters for executing a binary.
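A legacy binary of the form described above could be captured by an XML document along these lines. This is a hypothetical illustration of the approach only, not the schema or element names used in the paper; the binary name and file names are invented for the example.

```xml
<!-- Hypothetical description of a legacy ocean-model binary:
     its command line, input/output files, and standard streams. -->
<application name="pemodel">
  <executable path="bin/pemodel"/>
  <commandline>
    <option flag="-v" description="verbose output" optional="true"/>
  </commandline>
  <inputs>
    <stdin file="pemodel.in"/>
    <file role="initial-conditions" name="pe_ini.nc" format="NetCDF"/>
    <file role="parameters" name="pe_param.dat"/>
  </inputs>
  <outputs>
    <file role="forecast-fields" name="pe_out.nc" format="NetCDF"/>
    <stdout file="pemodel.log"/>
    <stderr file="pemodel.err"/>
  </outputs>
</application>
```

A framework can parse such a description to stage the input files, construct the command line, launch the binary locally or remotely, and collect the declared outputs, without any modification to the legacy code itself.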

The development and demonstration of an advanced fisheries management information system

Robinson, A.R., B.J. Rothschild, W.G. Leslie, J.J. Bisagni, M.F. Borges, W.S. Brown, D. Cai, P. Fortier, A. Gangopadhyay, P.J. Haley, Jr., H.S. Kim, L. Lanerolle, P.F.J. Lermusiaux, C.J. Lozano, M.G. Miller, G. Strout and M.A. Sundermeyer, 2001. The development and demonstration of an advanced fisheries management information system. Proc. of the 17th Conference on Interactive Information and Processing Systems for Meteorology, Oceanography and Hydrology, Albuquerque, New Mexico. American Meteorological Society, 186-190.

Fishery management regulates size- and species-specific fishing mortality to optimize biological production from the fish populations and economic production from the fishery. Fishery management is similar to management in industries and in natural resources, where the goals of management are intended to optimize outputs relative to inputs. However, the management of fish populations is among the most difficult. The difficulties arise because (a) the dynamics of the natural production system are extremely complicated, involving an infinitude of variables and interacting natural systems, and (b) the size- and species-specific fishing mortality (i.e., system control) is difficult to measure, calibrate, and deploy. Despite the difficulties, it is believed that significant advances can be made by employing a fishery management system that involves knowing the short-term (daily to weekly) variability in the structures of environmental and fish fields. We need new information systems that bring together existing critical technologies and thereby place fishery management in a total-systems feedback-control context. Such a system would monitor the state of the structure of all stocks simultaneously in near real time, be adaptive to the evolving fishery, and consider the effects of the environment and economics. To do this, the system would need to (a) employ new in situ and remote sensors in innovative ways, (b) develop new data streams to support the development of new information, (c) employ modern modeling, information, and knowledge-base technology to process the diverse information, and (d) generate management advice and fishing strategies that would optimize the production of fish.

The Advanced Fisheries Management Information System (AFMIS), built through a collaboration of Harvard University and the Center for Marine Science and Technology at the University of Massachusetts at Dartmouth, is intended to apply state-of-the-art multidisciplinary and computational capabilities to operational fisheries management. The system development concept is aimed toward: 1) utilizing information on the “state” of ocean physics, biology, and chemistry; the assessment of spatially resolved fish-stock population dynamics; and the temporal-spatial deployment of fishing effort, to be used in fishing and in the operational management of fish stocks; and 2) forecasting and understanding physical and biological conditions leading to recruitment variability. System components are being developed in the context of using the Harvard Ocean Prediction System to support or otherwise interact with: 1) the synthesis and analysis of very large data sets; 2) the building of a multidisciplinary, multiscale model (coupled ocean physics/N-P-Z/fish dynamics/management models) appropriate for the northwest Atlantic shelf, particularly Georges Bank and Massachusetts Bay; 3) the application and development of data assimilation techniques; and 4) an emphasis on the incorporation of remotely sensed data into the data stream.

AFMIS is designed to model a large region of the northwest Atlantic (NWA), as the deep ocean influences the slope and shelves. Several smaller domains, including the Gulf of Maine (GOM) and Georges Bank (GB), are nested within this larger domain (Figure 1). This provides a capability to zoom into these domains with higher resolution while maintaining the essential physics, which are coupled to the larger domain. AFMIS will be maintained by the assimilation of a variety of real-time data. Specifically, this includes sea surface temperature (SST), color (SSC), and height (SSH) obtained from several space-based remote sensors (AVHRR, SeaWiFS, and TOPEX/Poseidon). The assimilation of this variety of real-time remotely sensed data, supported by in situ data, will allow nowcasting and forecasting over significant periods of time.

A real-time demonstration-of-concept (RTDOC) nowcasting and forecasting exercise took place in March-May 2000 to demonstrate important aspects of the AFMIS concept by producing real-time coupled forecasts of physical fields, biological and chemical fields, and fish abundance fields. The RTDOC was designed to verify the physics and to validate the biology and chemistry, but only to demonstrate the concept of forecasting the fish fields, since the fish dynamical models are at a very early stage of development. In addition, it demonstrated the integrated system concept and the implications for future coupling of a management model. This note reports on the RTDOC.