Data Gaps (Beta) - More Info

About

Artificial intelligence (AI) and machine learning (ML) offer a powerful suite of tools to accelerate climate change mitigation and adaptation across different sectors. However, the lack of high-quality, easily accessible, and standardized data often hinders the impactful use of AI/ML for climate change applications.

In this project, Climate Change AI, with the support of Google DeepMind, aims to identify and catalog critical data gaps that impede AI/ML applications in addressing climate change, and lay out pathways for filling these gaps. In particular, we identify candidate improvements to existing datasets, as well as "wishes" for new datasets whose creation would enable specific ML-for-climate use cases. We hope that researchers, practitioners, data providers, funders, policymakers, and others will join the effort to address these critical data gaps.

Our list of critical data gaps is available at the following link: www.climatechange.ai/dev/datagaps. This page provides more details on the methodology through which this list was compiled, as well as on our taxonomy of data gaps.

This project is currently in its beta phase, with ongoing improvements to content and usability. We encourage you to provide input and contributions via the routes listed below, or by emailing us at datagaps@climatechange.ai. We are grateful to the many stakeholders and interviewees who have already provided input.

Methodology

Climate Change AI's list of critical data gaps was compiled via a combination of desk research and stakeholder interviews. Please check back soon for more details on our methodology, as well as a list of stakeholders interviewed.

Taxonomy of Data Gaps

Data gaps are classified into six categories: Wish, Obtainability, Usability, Reliability, Sufficiency, and Miscellaneous/Other.

Type W: Wish - Dataset does not exist.

Type O: Obtainability - Dataset is not easily obtainable.

Type U: Usability - Data is not readily usable.

Type R: Reliability - Data needs to be improved, validated, and/or verified.

Type S: Sufficiency - Existing data is insufficient, and more needs to be collected or simulated.

Type M: Miscellaneous/Other - Challenges or gaps that do not fit into the other categories, including challenges that arise from the use of multiple datasets.
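
For readers who want to tag their own dataset notes with these categories, the taxonomy could be encoded roughly as follows. This is a minimal sketch, not part of the project: the names DataGapType, DESCRIPTIONS, and parse_gap_type are hypothetical, and only the letter codes and short descriptions are taken from the list above.

```python
from enum import Enum


class DataGapType(Enum):
    """The six gap categories, keyed by the letter codes used on this page."""

    WISH = "W"
    OBTAINABILITY = "O"
    USABILITY = "U"
    RELIABILITY = "R"
    SUFFICIENCY = "S"
    MISCELLANEOUS = "M"


# Short descriptions copied from the taxonomy list (hypothetical mapping name).
DESCRIPTIONS = {
    DataGapType.WISH: "Dataset does not exist.",
    DataGapType.OBTAINABILITY: "Dataset is not easily obtainable.",
    DataGapType.USABILITY: "Data is not readily usable.",
    DataGapType.RELIABILITY: "Data needs to be improved, validated, and/or verified.",
    DataGapType.SUFFICIENCY: "Data is insufficient and needs to be collected or simulated.",
    DataGapType.MISCELLANEOUS: "Challenges or gaps that do not fit into the other categories.",
}


def parse_gap_type(code: str) -> DataGapType:
    """Map a letter code such as 'R' or ' r ' to its category.

    Raises ValueError for codes outside the six-letter taxonomy.
    """
    return DataGapType(code.strip().upper())


if __name__ == "__main__":
    gap = parse_gap_type("r")
    print(f"Type {gap.value}: {gap.name.title()} - {DESCRIPTIONS[gap]}")
```

Using an enumeration rather than free-text labels means an unknown code fails immediately instead of silently creating a seventh category.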