Data assimilation via Error Subspace Statistical Estimation. Part I: Theory and schemes
A rational approach is used to identify efficient schemes for data assimilation in nonlinear ocean-atmosphere
models. The conditional mean, a minimizer of several cost functionals, is chosen as the optimal estimate. After
stating the present goals and describing some of the existing schemes, the constraints and issues particular to
ocean-atmosphere data assimilation are emphasized. An approximation to the optimal criterion satisfying the
goals and addressing the issues is obtained using heuristic characteristics of geophysical measurements and
models. This leads to the notion of an evolving error subspace, of variable size, that spans and tracks the scales
and processes where the dominant errors occur. The concept of error subspace statistical estimation (ESSE) is
defined. In the present minimum error variance approach, the suboptimal criterion is based on a continued and
energetically optimal reduction of the dimension of error covariance matrices. The evolving error subspace is
characterized by error singular vectors and values, or in other words, the error principal components and
coefficients.
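
As an illustration of the dimension reduction described above, the following is a minimal sketch, not the scheme derived in the paper. It assumes a small, explicitly stored error covariance matrix `P` with positive trace, and it keeps the leading eigenvectors (the error principal components) that explain a chosen fraction of the total error variance. The function name `dominant_error_subspace` and the parameter `variance_fraction` are hypothetical and introduced only for this example.

```python
import numpy as np

def dominant_error_subspace(P, variance_fraction=0.99):
    """Return (E, lam): leading eigenvectors and eigenvalues of P.

    Keeps the smallest number of modes whose eigenvalues sum to at least
    `variance_fraction` of trace(P). Assumes P is symmetric positive
    semidefinite with positive trace (an assumption of this sketch).
    """
    P = 0.5 * (P + P.T)                  # enforce symmetry against round-off
    lam, E = np.linalg.eigh(P)           # eigenvalues in ascending order
    lam, E = lam[::-1], E[:, ::-1]       # reorder to descending variance
    lam = np.clip(lam, 0.0, None)        # remove tiny negative round-off values
    explained = np.cumsum(lam) / lam.sum()
    # first index at which the explained variance reaches the target
    p = int(np.searchsorted(explained, variance_fraction)) + 1
    p = min(p, lam.size)
    return E[:, :p], lam[:p]
```

For a symmetric positive semidefinite matrix, truncating the eigendecomposition in this way gives the best rank-p approximation in the Frobenius norm, and the retained modes carry the largest possible share of the total variance for that rank, which is one way to read "energetically optimal". In realistic ocean-atmosphere problems the full matrix would be too large to form, so this explicit version is only meant to convey the idea.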
Schemes for filtering and smoothing via ESSE are derived. The data-forecast melding minimizes variance in
the error subspace. Nonlinear Monte Carlo forecasts integrate the error subspace in time. The smoothing is
based on a statistical approximation approach. Comparisons with existing filtering and smoothing procedures
are made. The theoretical and practical advantages of ESSE are discussed. The concepts introduced by the
subspace approach are as useful as the practical benefits. The formalism forms a theoretical basis for the
intercomparison of reduced-dimension assimilation methods and for the validation of specific assumptions for
tailored applications. The subspace approach is useful for a wide range of purposes, including nonlinear field
and error forecasting, predictability and stability studies, objective analyses, data-driven simulations, model
improvements, adaptive sampling, and parameter estimation.
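
The two filtering ingredients named above, a data-forecast melding that minimizes variance in the error subspace and a Monte Carlo forecast of that subspace, can be sketched as follows. This is a hedged illustration under stated assumptions rather than the paper's algorithm: the melding is written as a standard reduced-rank Kalman-type update with a linear observation operator `H` and observation error covariance `R`, and the forecast subspace is taken from the SVD of ensemble differences about a central forecast. All function and parameter names (`subspace_update`, `monte_carlo_subspace_forecast`, `model`, `n_members`, `p_keep`) are hypothetical.

```python
import numpy as np

def subspace_update(x_f, E, lam, y, H, R):
    """Minimum-variance update with forecast covariance E diag(lam) E^T.

    Returns the analysis state and the updated subspace (E_a, lam_a).
    """
    HE = H @ E                               # subspace seen by the data, (m, p)
    B = lam[:, None] * HE.T                  # diag(lam) (HE)^T, (p, m)
    S = HE @ B + R                           # innovation covariance, (m, m)
    K_sub = np.linalg.solve(S, B.T).T        # gain in subspace coordinates, (p, m)
    x_a = x_f + E @ (K_sub @ (y - H @ x_f))  # correction confined to span(E)
    Pi_a = np.diag(lam) - K_sub @ B.T        # posterior covariance of coefficients
    Pi_a = 0.5 * (Pi_a + Pi_a.T)             # enforce symmetry against round-off
    lam_a, V = np.linalg.eigh(Pi_a)
    lam_a, V = lam_a[::-1], V[:, ::-1]       # descending variance
    return x_a, E @ V, np.clip(lam_a, 0.0, None)

def monte_carlo_subspace_forecast(model, x_a, E, lam, n_members, p_keep, rng):
    """Propagate the error subspace with an ensemble of nonlinear forecasts.

    `model` maps a state vector to the forecast state vector. Perturbations
    are drawn as Gaussian in the subspace (an assumption of this sketch).
    """
    coeffs = rng.standard_normal((lam.size, n_members)) * np.sqrt(lam)[:, None]
    members = x_a[:, None] + E @ coeffs      # perturbed initial states
    x_c = model(x_a)                         # central (unperturbed) forecast
    forecasts = np.column_stack(
        [model(members[:, j]) for j in range(n_members)]
    )
    # differences about the central forecast, scaled to covariance units
    D = (forecasts - x_c[:, None]) / np.sqrt(n_members)
    U, s, _ = np.linalg.svd(D, full_matrices=False)
    return x_c, U[:, :p_keep], s[:p_keep] ** 2
```

A forecast-update cycle would alternate the two calls, passing the subspace returned by one to the other. Fixing `p_keep` is a simplification made here; the abstract describes an error subspace of variable size, so in practice the number of retained modes would be chosen adaptively, for instance by a variance criterion.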