Stephan Rasp (Technical University of Munich); Soukayna Mouatadid (University of Toronto); Peter Dueben (European Centre for Medium-Range Weather Forecasts (ECMWF)); Sebastian Scher (Stockholm University); Jonathan Weyn (University of Washington); Nils Thuerey
Accurate weather forecasts are a crucial prerequisite for climate change adaptation. Can these be provided by deep learning? First studies show promise, but the lack of a common dataset and evaluation metrics makes inter-comparison between the proposed models difficult. In fact, despite the recent surge of research in data-driven weather forecasting, there is currently no standard approach for evaluating the proposed models. Here we introduce WeatherBench, a benchmark dataset for data-driven medium-range weather forecasting. We provide data derived from an archive of assimilated Earth observations covering the last 40 years, processed to facilitate its use in machine learning models. We propose a simple and clear evaluation metric that enables a direct comparison between different proposed methods. Further, we provide baseline scores from simple linear regression techniques, purely physical forecasting models, and existing deep learning weather forecasting models. All data and code are made publicly available, along with tutorials for getting started. We believe WeatherBench will provide a useful and reproducible way of evaluating data-driven weather forecasting models, and we hope that it will accelerate research in this direction.