Hybrid Modeling
Traditional, physics-based Earth system models, such as the Community Earth System Model (CESM) and the Model for Prediction Across Scales (MPAS), derive their strength from first principles of physics: they explicitly encode conservation laws and physical mechanisms, making them interpretable, stable over long integrations, and capable of simulating novel forcing scenarios such as future states of the Earth system. However, they are computationally expensive, rely on imperfect parameterizations of subgrid-scale processes, and can require extensive tuning. Machine learning (ML) models, by contrast, excel at capturing intricate relationships and at computational efficiency: they can learn complex, nonlinear relationships from data and generate forecasts or emulations at a fraction of the runtime cost. Yet they depend on the quality and range of their training data, may struggle to extrapolate to out-of-distribution environments, and do not inherently enforce physical constraints. These complementary strengths and weaknesses have motivated the development of hybrid approaches that combine physical structure with data-driven learning.
In Earth system science, such combinations include ML-based parameterizations that replace or augment subgrid physics, neural emulators of computationally expensive components (e.g., radiation or chemistry), ML-based bias correction layered atop dynamical cores, AI-assisted data assimilation, and neural operators embedded within physics solvers. Together, these strategies aim to retain the generality and interpretability of physics-based models while leveraging ML to improve efficiency, reduce bias, and better represent complex processes.
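As a purely illustrative sketch of the hybrid idea (not CESM code), the toy example below adds a data-driven correction to a "physics" tendency. Everything in it is an assumption made for illustration: the scalar state `x`, the linear-damping `physics_tendency`, the quadratic term standing in for an unresolved process, and the two-feature least-squares fit standing in for an ML parameterization.

```python
# Toy hybrid model: resolved physics plus a learned correction.
# All functions and constants here are hypothetical, chosen for illustration.

def physics_tendency(x):
    """Resolved, first-principles part (here: linear damping)."""
    return -0.5 * x

def true_tendency(x):
    """Stand-in for the real system, including an unresolved quadratic term."""
    return -0.5 * x - 0.2 * x * x

def fit_correction(samples):
    """Least-squares fit of residual ~ w1*x + w2*x**2 via 2x2 normal equations."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for x in samples:
        residual = true_tendency(x) - physics_tendency(x)  # what physics misses
        f1, f2 = x, x * x
        s11 += f1 * f1
        s12 += f1 * f2
        s22 += f2 * f2
        b1 += f1 * residual
        b2 += f2 * residual
    det = s11 * s22 - s12 * s12
    w1 = (b1 * s22 - b2 * s12) / det
    w2 = (s11 * b2 - s12 * b1) / det
    return w1, w2

def integrate(tendency, x0, dt, n_steps):
    """Forward-Euler time stepping with the supplied tendency function."""
    x = x0
    for _ in range(n_steps):
        x += dt * tendency(x)
    return x

# "Training data": states sampled over the range 0.1 to 2.0.
training_states = [0.1 * i for i in range(1, 21)]
w1, w2 = fit_correction(training_states)

def hybrid_tendency(x):
    """Physics tendency augmented by the learned correction."""
    return physics_tendency(x) + w1 * x + w2 * x * x

truth = integrate(true_tendency, 1.5, 0.01, 200)
physics_only = integrate(physics_tendency, 1.5, 0.01, 200)
hybrid = integrate(hybrid_tendency, 1.5, 0.01, 200)

print("physics-only error:", abs(physics_only - truth))
print("hybrid error:      ", abs(hybrid - truth))
```

In this toy setting the correction is learned only over the sampled range of states, so it says nothing about behavior outside that range. The same limitation, in far more complex form, is the out-of-distribution concern noted above for ML components.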
Towards a Machine Learning enhanced version of CESM (CESM-MLe)
The Community Earth System Model (CESM) project is working towards a hybrid, machine learning enhanced version of CESM (CESM-MLe) to evaluate the hypothesis that AI/ML can provide a platform for a step change in the accuracy and reliability of Earth system models, including CESM (see Eyring et al., 2024, for further background and discussion). The hybrid modeling approach involves replacing some existing parameterizations with physics-aware ML parameterizations to improve the representation of subgrid-scale or poorly known physical, biological, or chemical processes. This approach, especially when combined with semi-automated model calibration methodologies that can harness the growing wealth of Earth observations, can enable models to better represent subgrid-scale or otherwise poorly understood processes (Fig. 1). The CESM-MLe project has developed methodologies to calibrate the atmosphere and land components and is working towards coupled model calibration methods. Several ML-based parameterizations are in various stages of development and testing (Fig. 2). This work is carried out in collaboration with the NSF Learning the Earth with Artificial Intelligence and Physics (LEAP) STC, the Multiscale Machine Learning in Coupled Earth System Modeling (M2LInES) project supported by Schmidt Sciences, and others in the CESM community. An important milestone was the development of software that couples the CESM component models to PyTorch and its AI/ML algorithmic ecosystem through FTorch, thereby enabling community members to test their own ideas for ML parameterizations (documentation).
While the hybrid modeling approach shows promise, there are also many challenges and opportunities that will need to be addressed. These include assessing the reliability of new ML parameterizations under Earth system variability outside their training range, the potential for new CESM model instabilities, the possibility that new simulations degrade orthogonal (unrelated) aspects of the simulation, and new tuning challenges, since some previously available tuning knobs may no longer be available within the new ML parameterizations.
Progress towards CESM-MLe is being tracked here.
For more information or for partnership opportunities, please contact David Lawrence.

