AI-Ready Data

All data hosted on NSF NCAR’s GDEX platform are accessible for AI and ML applications. In response to the growing demand for model training and evaluation datasets, NSF NCAR curates, standardizes, and publishes high-quality Earth system datasets optimized for AI-enabled research and discovery. AI-ready data are datasets stored in cloud-optimized formats (Zarr stores or Kerchunk references) and described by standardized Intake-ESM catalogs, so researchers can plug them directly into AI/ML workflows without extensive preprocessing.
Our AI-ready data holdings continue to expand as new observations, simulations, and community datasets are integrated.
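
As a rough illustration of what "plugging in" such a dataset can look like, the sketch below opens an Intake-ESM catalog and lazily loads the matching datasets with the intake and intake-esm Python packages. It is a minimal sketch under stated assumptions, not an official GDEX recipe: the catalog URL and the search column shown in the comments are hypothetical placeholders, and the real values come from the landing page of the dataset you want to use.

```python
def open_ai_ready_datasets(catalog_url, **query):
    """Search an Intake-ESM catalog and lazily open the matching datasets.

    catalog_url: location of the catalog's JSON description file.
    query: optional column=value filters; valid column names depend on
           the individual catalog.
    Returns a dict that maps dataset keys to xarray.Dataset objects.
    """
    # Imported here so the function can be defined without the optional
    # dependencies; calling it requires the intake and intake-esm packages.
    import intake

    catalog = intake.open_esm_datastore(catalog_url)
    subset = catalog.search(**query) if query else catalog
    # Data are loaded lazily; values are read only when they are computed.
    return subset.to_dataset_dict()


# Example call (hypothetical URL and column name, replace with real values):
# datasets = open_ai_ready_datasets(
#     "https://example.org/path/to/catalog.json",
#     variable="t2m",
# )
# for key, ds in datasets.items():
#     print(key, dict(ds.sizes))
```

Because the returned objects are ordinary xarray datasets, they can be subset, resampled, or converted to arrays with the usual xarray tools before being passed to a training pipeline.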

Commonly used AI-ready datasets available on GDEX:

Reanalysis:

Convection-Permitting Model Output:

CESM Output:

CMIP Model Output:

Subset of the Coupled Model Intercomparison Project Phase 6 (CMIP6) model output

ECMWF Forecasts:

ECMWF IFS High-Resolution Operational Forecasts

Radar Data:

GridRad-Severe - Three-Dimensional Gridded NEXRAD WSR-88D Radar Data for Severe Events

Regional Model Output:

Other:

Data to support OSDF example workflows

For general inquiries and data support questions, please contact the NSF NCAR Research Data Help Desk at datahelp@ucar.edu or submit a request through the NSF NCAR Geoscience Data Exchange Help Desk.