Evangelinos, C., P.F.J. Lermusiaux, J. Xu, P.J. Haley, and C.N. Hill, 2011. Many Task Computing for Real-Time Uncertainty Prediction and Data Assimilation in the Ocean. IEEE Transactions on Parallel and Distributed Systems, Special Issue on Many-Task Computing, I. Foster, I. Raicu and Y. Zhao (Guest Eds.), 22, doi: 10.1109/TPDS.2011.64.
Uncertainty prediction for ocean and climate forecasts is essential for multiple applications today. Many-Task Computing
can play a significant role in making such predictions feasible. In this manuscript, we focus on ocean uncertainty prediction using the
Error Subspace Statistical Estimation (ESSE) approach. In ESSE, uncertainties are represented by an error subspace of variable size.
To predict these uncertainties, we perturb an initial state based on the initial error subspace and integrate the corresponding ensemble
of initial conditions forward in time, including stochastic forcing during each simulation. The dominant error covariance (generated via
SVD of the ensemble) is used for data assimilation. The resulting ocean fields are used as inputs for predictions of underwater sound
propagation. ESSE is a classic case of Many-Task Computing: it uses dynamic, heterogeneous workflows, and ESSE ensembles are
data-intensive applications. We first study the execution characteristics of a distributed ESSE workflow on a medium-size dedicated
cluster and examine in more detail the I/O patterns exhibited and throughputs achieved by its components, as well as the overall ensemble
performance seen in practice. We then study the performance and usability challenges of employing Amazon EC2 and the TeraGrid to
augment our ESSE ensembles and provide better solutions faster.
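
The ensemble step described above (perturb the initial state within the error subspace, integrate each member forward with stochastic forcing, then extract the dominant error mode from the ensemble spread) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the ESSE code itself: the function names `step`, `run_ensemble` and `dominant_mode` are hypothetical, the "model" is a trivial damped shift rather than an ocean model, and power iteration stands in for a full SVD so that the sketch needs only the Python standard library. Because each member is integrated independently, the loop over members is the part that maps naturally onto many independent tasks.

```python
import math
import random


def step(state, rng, noise=0.01):
    """One step of a toy model: damped periodic shift plus stochastic forcing."""
    n = len(state)
    return [0.9 * state[i] + 0.1 * state[i - 1] + rng.gauss(0.0, noise)
            for i in range(n)]


def run_ensemble(central, modes, amplitudes, members, steps, seed=0):
    """Perturb the central state within the given subspace and integrate each member."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(members):  # each member is an independent task
        # Random coefficients along each error mode, scaled by its amplitude.
        coeffs = [rng.gauss(0.0, a) for a in amplitudes]
        state = [c + sum(k * mode[i] for k, mode in zip(coeffs, modes))
                 for i, c in enumerate(central)]
        for _ in range(steps):
            state = step(state, rng)
        ensemble.append(state)
    return ensemble


def dominant_mode(ensemble, iterations=200, seed=1):
    """Leading error mode and its variance, via power iteration on the anomalies.

    Equivalent to the first singular vector of the ensemble anomaly matrix;
    a full SVD would return all modes at once.
    """
    m, n = len(ensemble), len(ensemble[0])
    mean = [sum(s[i] for s in ensemble) / m for i in range(n)]
    anomalies = [[s[i] - mean[i] for i in range(n)] for s in ensemble]

    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    for _ in range(iterations):
        # Apply A^T A to v, where the rows of A are the member anomalies.
        proj = [sum(a[i] * v[i] for i in range(n)) for a in anomalies]
        w = [sum(p * a[i] for p, a in zip(proj, anomalies)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no spread in the ensemble
            break
        v = [x / norm for x in w]

    proj = [sum(a[i] * v[i] for i in range(n)) for a in anomalies]
    variance = sum(p * p for p in proj) / (m - 1)
    return v, variance


if __name__ == "__main__":
    initial_modes = [[1, 0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0]]
    ens = run_ensemble([0.0] * 8, initial_modes, [1.0, 0.5], members=20, steps=5)
    mode, var = dominant_mode(ens)
    print("dominant error variance:", var)
```

In this sketch the ensemble size is fixed in advance; a variable-size error subspace, as described in the abstract, would instead add members until the estimated subspace stops changing appreciably.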