Public Policy

Workshop Papers

Venue Title
NeurIPS 2025 EcoEval: A Benchmark for Evaluating Large Language Model Handling of Climate Change Misinformation, False Beliefs, and Climate Policy Sentiment (Papers Track)

Abstract: As Large Language Models (LLMs) become primary sources of factual knowledge, their ability to accurately communicate climate science, resist misinformation, and provide balanced policy guidance becomes critically important. However, existing evaluation frameworks lack a comprehensive assessment of LLM performance across the multifaceted challenges of climate communication. We introduce EcoEval, an open-source benchmark evaluating LLM performance across three dimensions: (1) giving users correct information, while correcting user misconceptions, (2) avoiding generation of fabricated climate content, and (3) expressing balanced climate policy sentiment. Our results span 8 commercially deployed models, revealing substantial variation in policy sentiment, sycophancy, and willingness to generate misinformation.

Authors: Nick Lechtenboerger (HPI); Pat Pataranutaporn (MIT Media Lab); Pattie Maes (MIT Media Lab)

NeurIPS 2025 ML-IAM: Emulating Integrated Assessment Models With Machine Learning (Papers Track)

Abstract: Integrated Assessment Models (IAMs) are essential for projecting future greenhouse gas (GHG) emissions and energy outputs, but they are computationally expensive and limited by model-specific idiosyncrasies. We present ML-IAM, a machine learning model trained on the AR6 Scenarios Database to emulate IAMs. ML-IAM generates results for new scenarios in seconds, avoids convergence failures, and produces model-agnostic outputs by learning from diverse model families. Among the tested models, XGBoost achieves the best performance, with an $R^2$ of 0.98 against the original IAM data. ML-IAM enables rapid exploration of climate scenarios, complementing traditional IAMs with efficient and scalable computation for climate policy analysis.

Authors: Yen Shin (KAIST); Haewon McJeon (KAIST); Changyoon Lee (KAIST); Eunsu Kim (KAIST); Junho Myung (KAIST); Kiwoong Park (KAIST); Jung-Hun Woo (Seoul National University); Min-Young Choi (Seoul National University); Bomi Kim (Seoul National University); Hyun W. Ka (KAIST); Alice Oh (KAIST)

NeurIPS 2025 Reflexive Evidence-Based Multimodal Learning for Clean Energy Transitions: Causal Insights on Cooking Fuel Access, Urbanization, and Carbon Emissions (Papers Track)

Abstract: Achieving Sustainable Development Goal 7 (Affordable and Clean Energy) requires not only technological innovation but also a deeper understanding of the socio-economic factors that influence energy access and carbon emissions. Despite growing attention to these drivers, key questions remain, particularly regarding how to quantify socio-economic impacts, how these impacts interact across domains such as policy, technology, and infrastructure, and how feedback processes shape energy systems. To address these gaps, this study introduces ClimateAgents, an AI-based framework that combines large language models with domain-specialized agents to support hypothesis generation and scenario exploration. Leveraging 20 years of socio-economic and emissions data from 265 economies, countries and regions, and 98 indicators drawn from the World Bank database, the framework applies a machine learning–based causal inference approach to identify key determinants of carbon emissions in an evidence-based, data-driven manner. The analysis highlights three primary drivers: (1) access to clean cooking fuels in rural areas, (2) access to clean cooking fuels in urban areas, and (3) the percentage of population living in urban areas. These findings underscore the critical role of clean cooking technologies and urbanization patterns in shaping emission outcomes. In line with growing calls for evidence-based AI policy, ClimateAgents offers a modular and reflexive learning system that supports the generation of credible and actionable insights for policy. By integrating heterogeneous data modalities, including structured indicators, policy documents, and semantic reasoning, the framework contributes to adaptive policymaking infrastructures that can evolve with complex socio-technical challenges. This approach aims to support a shift from siloed modeling to reflexive, modular systems designed for dynamic, context-aware climate action.

Authors: Shan Shan (Zhejiang University)

NeurIPS 2025 Machine learning discovery of regional and social disparities in electric vehicle charging reliability with GPT-5 (Papers Track)

Abstract: There is growing interest in studying charger reliability to address persistent barriers to electric vehicle (EV) adoption and advance the decarbonization of transportation, one of the largest emitting sectors globally. Improved measurement of charger reliability is critically needed to accelerate network effects to promote EV adoption, develop pay-as-you-use infrastructure, and aggregate intelligence for more responsive service operations. However, prior methods for assessing charger reliability, which typically rely on citizen-generated data and expensive expert annotation/supervision, have proven inadequate for identifying regional and social disparities in charging performance. Prior architectures have often lacked the detection accuracy necessary for large-scale inference, especially with imbalanced datasets. This study introduces a machine learning pipeline that detects spatial disparities in charger reliability based on 838,785 U.S. consumer reviews of charging experiences. We document new performance benchmarks in reliability detection using zero- and few-shot learning capabilities and expert counterfactual reasoning (F1 score: 0.97, SD: 0.02), outperforming previous models in the domain of electric mobility, such as ClimateBERT. To enable spatial analyses, we further demonstrate how reliability measures can be combined with popular diversity indices to inform economic and policy decision-making. Using this approach, we find evidence of widespread charging reliability issues in about half of all U.S. counties (1,653 of 3,244 counties), especially in the most populated areas. Disparities in charger reliability are most pronounced in metropolitan areas and along federally designated EV corridors, raising concerns about inconsistent user experiences in high-traffic zones. This scalable and evidence-based approach to data discovery can be integrated into a wide range of causal inference and prediction settings in electric mobility.

Authors: Yifan Liu (Georgia Institute of Technology); Lindsey Snyder (Georgia Institute of Technology); Omar Asensio (Georgia Institute of Technology)

NeurIPS 2025 Bayesian Methods for Enhanced Greenhouse Gas Emissions Inventories (Proposals Track)

Abstract: Developing effective mitigation strategies for greenhouse gas reduction hinges on accurate emissions and metadata tracking to identify the most impactful reduction opportunities. Given that emissions cannot be perfectly and ubiquitously observed, constructing inventories entails fusing data from multiple sources that are of varying levels of fidelity, quality, and completeness. This proposal suggests that Bayesian models, powered by modern probabilistic programming frameworks, can integrate multiple data sources into posterior emissions estimates while also accounting for incompleteness and leveraging data from less granular spatiotemporal scales. A preliminary analysis combining country-level steel production data and facility-level activity data shows promise for estimating emissions reduction potential when there is a population of facilities that have not been directly observed.

Authors: Michael Pekala (JHU/APL)

NeurIPS 2025 A modular framework to run AI-based models from high-resolution climate projections (Proposals Track)

Abstract: Recent advances in AI-based weather and climate models promise transformative improvements in forecasting, yet their integration with state-of-the-art climate simulations remains constrained by data heterogeneity. High-resolution projections, such as those from the Destination Earth Climate Digital Twin, are produced at 5 km resolution with specialized grids, formats, and variable sets that are incompatible with most AI models. Current integration efforts are ad-hoc, model-specific, and difficult to reproduce, slowing progress and limiting large-scale evaluation. Taking advantage of decades of experience in climate workflows, we propose a modular, source-agnostic framework that enables systematic and reproducible execution of AI-based climate models across diverse datasets and high-performance computing environments. The framework standardizes data preparation, automates model execution through containerized workflows, and provides built-in post-processing and evaluation tools. Preliminary experiments show that the framework reproduces and extends recent studies on AI model robustness in future climates with minimal technical overhead.

Authors: Aina Gaya-Àvila (Barcelona Supercomputing Center); Amirpasha Mozaffari (Barcelona Supercomputing Center); Amanda Duarte (Barcelona Supercomputing Center); Oriol Tintó Prims (Barcelona Supercomputing Center)

NeurIPS 2025 From Sparse to Representative: Machine Learning to Densify IAM Scenario Ensembles for Policy Insight (Proposals Track)

Abstract: This research addresses the challenge of extracting policy-relevant insights from Integrated Assessment Model (IAM) scenario ensembles, which are often sparse, non-representative, and inaccessible to non-experts. We propose a machine learning framework preserving high-dimensional dependencies between variables, enabling generation of plausible in-gap scenarios when one or more outputs are constrained. The intended output is a simplified exploration space for policymakers concerned with crucial climate policy exploration.

Authors: Georgia Ray (Imperial College London)

NeurIPS 2025 GeoWaste: Leveraging GIS and Machine Learning for Urban Waste Management in African Cities (Proposals Track) Best Pathway to Impact

Abstract: Waste collection in many cities in Africa remains ineffective, with little reliable data, fragmented reporting, and static truck routes, all of which lead to increased greenhouse gas emissions and overflowing bins. We present GeoWaste, a GIS and machine learning-driven system that integrates open-source geospatial datasets, GPS-enabled collection trucks, citizen geo-tagged reports, and fill-level sensors to deliver an integrated spatio-temporal waste database. GeoWaste generates optimised routes and makes forecasts using a Random Forest regressor to predict waste volumes, heuristic solutions for routing, and clustering (K-Means, DBSCAN) to locate hotspots. A pilot was conducted in Yenagoa, the capital of Bayelsa State, covering 120 bins and 8,540 households. Service coverage increased from 62% to 87%, average collection time fell by 32%, and truck fuel use fell by 28%. GeoWaste demonstrates a scalable pathway for data-driven, climate-smart urban resilience.

Authors: Bakumor Yolo (University of Calabar)

NeurIPS 2025 Tracking the spread of climate change skepticism on X with simulations and deep learning (Proposals Track)

Abstract: Climate change continues to be a global challenge that requires urgent action. However, the ongoing presence of climate skepticism undermines society's ability to confront this important challenge. Understanding the mechanisms driving the spread of climate skepticism might give policymakers additional tools to combat climate change. Here, we propose a methodological approach that combines computational simulation (in the form of an agent-based model representing online X communication) with simulation-based inference using amortized deep neural networks. Our approach allows us to infer the relative importance of a variety of different learning strategies that can contribute to the spread of climate skepticism and support.

Authors: Uwaila Ekhator (Boise State University); Mason Youngblood (Institute for Advanced Computational Science, Stony Brook University); Vicken Hillis (Boise State University)

NeurIPS 2025 AI Agents For Decision-Making in Climate Governance Using Policy Benchmarks (Proposals Track)

Abstract: Climate change governance requires navigating complex policy documents, including treaties, regulations, and socio-political frameworks. Understanding these texts is essential for evidence-based decision-making but remains challenging due to their complexity and domain specificity. This study explores the potential of AI agents to support policy reasoning and decision-making through structured evaluation on climate policy benchmarks, with a focus on dynamic governance scenarios. Drawing on global frameworks such as the UN Sustainable Development Goals (UNSDGs) and IPCC assessment pathways, this study evaluates agents using datasets such as Climate-FEVER (factual claim verification), LegalBench (legal reasoning), and PolicyQA (policy question answering). Target tasks include treaty interpretation, socio-political analysis, adaptation policy reasoning, and scenario-based planning. This study introduces a hybrid evaluation framework combining expert assessment and interdisciplinary feedback to systematically benchmark AI agents’ performance in climate governance, identifying their strengths, limitations, and potential for real-world support. It aims to bridge AI and climate governance, turning a tale of two systems into a tale of collaboration.

Authors: Shan Shan (Zhejiang University)

ICLR 2025 Predicting extreme weather impacts on physical activity and sleep patterns using real-world data from wrist-worn accelerometers (Papers Track)

Abstract: The increasing frequency of extreme weather events, such as heat waves, is among the most pressing consequences of climate change, with profound implications for human health and well-being. Despite the increasing incidence of extreme weather events globally, the impact of hot weather on health outcomes remains poorly understood. In this study, we utilized machine learning techniques to explore how variations in outdoor temperature influence physical activity and sleep patterns, two critical determinants of physical and mental health. Using data from 90,434 participants in the UK Biobank, recorded via wrist-worn accelerometers, linked with meteorological data from the UK Met Office, we analysed the relationship between outdoor temperature (5°C to 30°C) and daily magnitudes and durations of a) physical activity and b) sleep, whilst adjusting for sociodemographic, clinical, lifestyle, seasonality, precipitation, and regional variables. Our results reveal that moderate-to-vigorous physical activity (MVPA) increases with temperature, reaching its peak at 25°C, but plateaus thereafter. Conversely, sedentary behaviour and sleep disturbances significantly intensify as temperatures reach 30°C. Here tested in UK settings, our approach is generalisable to other climatic regions and determinants of health and should be further investigated in regions with high climate-vulnerability. These findings emphasize the role of machine learning in identifying health risks associated with climate change and underscore the necessity of climate-adaptive public health strategies to mitigate these effects.

Authors: Sara Khalid (University of Oxford)

ICLR 2025 Large Language Models for Monitoring Dataset Mentions in Climate Research (Papers Track)

Abstract: Effective climate change research relies on diverse datasets to inform mitigation and adaptation strategies and policies. However, the ways these datasets are cited, used, and distributed remain poorly understood. This paper presents a machine learning framework that automates the detection and classification of dataset mentions in climate research papers. Leveraging large language models (LLMs), we generate a weakly supervised dataset through zero-shot extraction, quality assessment via an LLM-as-a-Judge, and refinement by a reasoning agent. The Phi-3.5-mini instruct model is pre-fine-tuned on this dataset, followed by fine-tuning on a smaller manually annotated subset to specialize in extracting data mentions. At inference, a ModernBERT-based classifier filters for dataset mentions, optimizing computational efficiency. Evaluated on a held-out manually annotated sample, our fine-tuned model outperforms NuExtract-v1.5 and GLiNER-large-v2.1 in dataset extraction accuracy. As a framework for monitoring dataset mentions in research papers, this approach enhances transparency, identifies data gaps, and enables researchers, funders, and policymakers to improve data discoverability and usage, leading to more informed decision-making.

Authors: Aivin Solatorio (The World Bank); Rafael Macalaba (The World Bank); James Liounis (The World Bank)

ICLR 2025 Tracking ESG Disclosures of European Companies with Retrieval-Augmented Generation (Proposals Track)

Abstract: Corporations play a crucial role in mitigating climate change and accelerating progress toward environmental, social, and governance (ESG) objectives. However, structured information on the current state of corporate ESG efforts remains limited. In this paper, we propose a machine learning framework based on a retrieval-augmented generation (RAG) pipeline to track ESG indicators from N=9,200 corporate reports. Our analysis includes ESG indicators from 600 of the largest listed corporations in Europe between 2014 and 2023. We focus on two key dimensions: first, we identify gaps in corporate sustainability reporting in light of existing standards. Second, we provide comprehensive bottom-up estimates of key ESG indicators across European industries. Our findings enable policymakers and financial markets to effectively assess corporate ESG transparency and track progress toward global sustainability objectives.

Authors: Kerstin Forster (LMU Munich & Munich Center for Machine Learning); Victor Wagner (LMU Munich & Sustainability Reporting Navigator); Lucas Elias Keil (University of Cologne & Sustainability Reporting Navigator); Maximilian A. Müller (University of Cologne & Sustainability Reporting Navigator); Thorsten Sellhorn (LMU Munich & Sustainability Reporting Navigator); Stefan Feuerriegel (LMU Munich & Munich Center for Machine Learning)

ICLR 2025 Modelling the Doughnut of social and planetary boundaries with machine learning (Proposals Track)

Abstract: Most national governments pursue GDP growth as a primary objective. However, measures of human well-being correlate with GDP only up to a point, and evidence shows that GDP growth is tightly coupled to environmental degradation. Achieving a high level of human well-being within planetary boundaries may thus require new policy approaches that move beyond the pursuit of GDP, such as those advocated within the "post-growth" literature. A popular framework here is the "Doughnut" of social and planetary boundaries, which has inspired the development of ecological macroeconomic models that incorporate sustainable thresholds for both social and environmental indicators. Machine learning (ML) can enhance these models in many ways. Here, we focus on two core aspects: searching for desired model behavior and optimizing transitions towards it. We apply standard ML techniques to a simple consumer-resource model, exploring how different consumption and efficiency policies impact sustainability. Using a random forest classifier, we identify policy conditions that align with the Doughnut framework, providing an interpretable pathway to sustainability. Additionally, reinforcement learning (RL) can optimize trajectories in the model parameter space to reach model behavior corresponding to a sustainable regime. While ML methods present challenges, such as the number of data points and hyperparameter optimization for classification, they also offer several useful tools, including for data sampling optimization and model explainability. Overall, our proposal shows how ML can support the development of ecological macroeconomic models that address the complexity of achieving good social outcomes for all people within planetary boundaries.

Authors: Stefano Vrizzi (Ecole Normale Superieure); Daniel O'Neill (Universitat de Barcelona)

NeurIPS 2024 InvestESG: A Multi-agent Reinforcement Learning Benchmark for Studying Climate Investment as a Social Dilemma (Papers Track)

Abstract: InvestESG is a novel multi-agent reinforcement learning (MARL) benchmark designed to study the impact of Environmental, Social, and Governance (ESG) disclosure mandates on corporate climate investments. Supported by both PyTorch and JAX implementations, the benchmark models an intertemporal social dilemma where companies balance short-term profit losses from climate mitigation efforts and long-term benefits from reducing climate risk, while ESG-conscious investors attempt to influence corporate behavior through their investment decisions, in a scalable and hardware-accelerated manner. Companies allocate capital across mitigation, greenwashing, and resilience, with varying strategies influencing climate outcomes and investor preferences. Our experiments show that without ESG-conscious investors with sufficient capital, corporate mitigation efforts remain limited under the disclosure mandate. However, when a critical mass of investors prioritizes ESG, corporate cooperation increases, which in turn reduces climate risks and enhances long-term financial stability. Additionally, providing more information about global climate risks encourages companies to invest more in mitigation, even without investor involvement. Our findings align with empirical research using real-world data, highlighting MARL's potential to inform policy by providing insights into large-scale socio-economic challenges through efficient testing of alternative policy and market designs.

Authors: Xiaoxuan Hou (University of Washington); Jiayi Yuan (University of Washington); Natasha Jaques (University of Washington)

NeurIPS 2024 Climate Impact Assessment Requires Weighting: Introducing the Weighted Climate Dataset (Papers Track)

Abstract: High-resolution gridded climate data are readily available from multiple sources, yet climate research and decision-making increasingly require country and region-specific climate information weighted by socio-economic factors. Moreover, the current landscape of disparate data sources and inconsistent weighting methodologies exacerbates the reproducibility crisis and undermines scientific integrity. To address these issues, we have developed a globally comprehensive dataset at both country (GADM0) and region (GADM1) levels, encompassing various climate indicators (precipitation, temperature, SPEI, wind gust). Our methodology involves weighting gridded climate data by population density, night-time light intensity, cropland area, and concurrent population count – all proxies for socio-economic activity – before aggregation. We process data from multiple sources, offering daily, monthly, and annual climate variables spanning from 1900 to 2023. A unified framework streamlines our preprocessing steps, and rigorous validation against leading climate impact studies ensures data reliability. The resulting Weighted Climate Dataset is publicly accessible through an online dashboard at https://weightedclimatedata.streamlit.app/.

Authors: Marco Gortan (University of Basel); Lorenzo Testa (Carnegie Mellon University); Giorgio Fagiolo (Sant'Anna School of Advanced Studies); Francesco Lamperti (Sant'Anna School of Advanced Studies)

NeurIPS 2024 Learning the Indicators of Energy Burden for Knowledge Informed Policy (Papers Track)

Abstract: The United States is one of the largest energy consumers per capita, which puts an expectation on households to have adequate energy expenditures to keep up with modern society. This adds stress on low-income households that may need to limit energy use due to financial constraints. This paper investigates energy burden, the ratio of household energy bills to household income, within the United States West. Self-Organizing Maps, a type of unsupervised neural network, are used to learn the indicators attributed to energy burden to inform public policy. This is one of the first studies to consider environmental justice indicators, which include outdoor air quality metrics and health disparities, as energy burden indicators. The results show significant (p<0.05) differences between high energy burden areas and those with no energy burden for the environmental justice indicators. Thus, beyond the socioeconomic hardships of marginalized communities, counties with high energy burden suffer from environmental and health hazards, which will be amplified under a changing climate.

Authors: Jasmine Garland (University of Colorado Boulder); Rajagopalan Balaji (University of Colorado Boulder); Kyri Baker (University of Colorado Boulder); Ben Livneh (University of Colorado Boulder)

NeurIPS 2024 Wildflower Monitoring with Expert-annotated Images and Flowering Phenology (Papers Track)

Abstract: Understanding biodiversity trends is essential for preservation policy planning, and advanced computer vision solutions now enable large-scale automated monitoring for many biodiversity use cases. Wildflower monitoring, in particular, presents unique challenges. Visual similarities in shape and color may exist between different species, while flowers within a species may have significant visual differences. Moreover, flowers follow a growth cycle and look distinctly different over the year, while different species flower at different times of the year. With access to flowering phenology, more accurate predictions can be made. We propose a novel multi-modal wildflower monitoring task to better identify species, leveraging both expert-annotated wildflower images and flowering phenology estimates. Moreover, we benchmark several state-of-the-art models using two groups of common wildflower species that have high inter-class similarity, and show that this multi-modal approach significantly outperforms image-only baselines. With this work, we aim to encourage the development of standards for automated wildflower monitoring as a step towards bending the curve of biodiversity loss. The data and the code are publicly available at https://georgianagmanolache.github.io/wildflowerpower/

Authors: Georgiana Manolache (Fontys University of Applied Sciences); Gerard Schouten (Fontys University of Applied Sciences)

NeurIPS 2024 Making Climate AI Systems Past and Future Aware to Better Evaluate Climate Change Policies (Proposals Track)

Abstract: Addressing the issues posed by climate change necessitates appropriate methodologies for evaluating climate policies, particularly when discussing long-term and real-world scenarios. While large language models (LLMs) have transformed artificial intelligence, they ultimately fall short of connecting historical data with future estimates. We propose an agentic LLM system that would address this gap by considering and analyzing the probable outcomes of a user-specified climate policy in practical settings. Further, we propose using knowledge graphs to model the existing data about the impact of climate policies, while also giving our system access to data about future climate predictions. In this way, the model can peek into the past (previous policies) and the future (climate scenario forecasts), paving the way for agencies to evaluate and design strategies and plans for climate change more effectively.

Authors: Riya (IIT Roorkee); Sudhakar Singh (Nvidia)

NeurIPS 2024 Large language model co-pilot for transparent and trusted life cycle assessment comparisons (Proposals Track)

Abstract: Intercomparing life cycle assessments (LCA), a common type of sustainability and climate model, is difficult due to basic differences in fundamental assumptions, especially in the goal and scope definition stage. This complicates decision-making and the selection of climate-smart policies, as it becomes difficult to compare optimal products and processes between different studies. To aid policymakers and LCA practitioners alike, we plan to leverage large language models (LLM) to build a database containing documented assumptions for LCAs across the agricultural sector, with a case study on livestock management. The articles for this database are identified in a systematic literature search, then processed to extract relevant assumptions about the goal and scope definition of the LCA and inserted into a vector database. We then leverage this database to develop an AI co-pilot by augmenting LLMs with retrieval augmented generation to be used by stakeholders and LCA practitioners alike. This co-pilot will offer two major benefits: 1) enhance the decision-making process by facilitating comparisons among LCAs to enable policymakers to adopt data-driven climate policies and 2) encourage the use of common assumptions by LCA practitioners. Ultimately, we hope to create a foundational model for LCA tasks that can plug into existing open source LCA software and tools.

Authors: Nathan Preuss (Cornell University); Fengqi You (Cornell University)

ICLR 2024 Identifying Climate Targets in National Laws and Policies using Machine Learning (Papers Track)

Abstract: Quantified policy targets are a fundamental element of climate policy, typically characterised by domain-specific and technical language. Current methods for curating comprehensive views of global climate policy targets entail significant manual effort. At present there are few scalable methods for extracting climate targets from national laws or policies, which limits policymakers’ and researchers’ ability to (1) assess private and public sector alignment with global goals and (2) inform policy decisions. In this paper we present an approach for extracting mentions of climate targets from national laws and policies. We create an expert-annotated dataset identifying three categories of target (‘Net Zero’, ‘Reduction’ and ‘Other’ (e.g. renewable energy targets)) and train a classifier to reliably identify them in text. We investigate bias and equity impacts related to our model and identify specific years and country names as problematic features. We explore the dataset generated from applying our classifier to the Climate Policy Radar (CPR) dataset, showcasing the potential for automated data collection and research support in climate policy. Our work represents a significant upgrade in the accessibility of these key climate policy elements for policymakers and researchers.

Authors: Matyas Juhasz (Climate Policy Radar); Tina Marchand (Climate Policy Radar); Roshan Melwani (Climate Policy Radar); Kalyan Dutia (Climate Policy Radar); Sarah Goodenough (Climate Policy Radar); Harrison Pim (Climate Policy Radar); Henry Franks (Climate Policy Radar)

ICLR 2024 EU Climate Change News Index: Forecasting EU ETS prices with online news (Papers Track)

Abstract: Carbon emission allowance prices have been rapidly increasing in the EU since 2018 and accurate forecasting of EU Emissions Trading System (ETS) prices has become essential. This paper proposes a novel method to generate alternative predictors for daily ETS price returns using relevant online news information. We devise the EU Climate Change News Index by calculating the term frequency–inverse document frequency (TF–IDF) feature for climate change-related keywords. The index is capable of tracking the ongoing debate about climate change in the EU. Finally, we show that incorporating the index in a simple predictive model significantly improves forecasts of ETS price returns.

Authors: Aron Pap (BGSE); Aron D Hartvig (Corvinus University of Budapest, Cambridge Econometrics); Péter Pálos (Budapest University of Technology and Economics)
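
The keyword-based TF–IDF index described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact weighting scheme, keyword list, or daily aggregation, so the function name `tfidf_keyword_index`, the whitespace tokenisation, and the choice to sum keyword weights per document are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tfidf_keyword_index(documents, keywords):
    """Return one score per document: the summed TF-IDF weight of the keywords.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each keyword appear?
    doc_freq = {
        kw: sum(1 for tokens in tokenized if kw in tokens) for kw in keywords
    }

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        score = 0.0
        for kw in keywords:
            if doc_freq[kw] == 0:
                continue  # keyword never occurs, so it contributes nothing
            tf = counts[kw] / total                # term frequency
            idf = math.log(n_docs / doc_freq[kw])  # inverse document frequency
            score += tf * idf
        scores.append(score)
    return scores
```

Under these assumptions, an article that uses the keywords often scores higher than one that mentions them in passing, and an article with no keywords scores zero. A daily index could then be formed by aggregating the scores of that day's articles.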

ICLR 2024 Deep Gaussian Processes and inversion for decision support in model-based climate change mitigation and adaptation problems (Papers Track)

Abstract: To inform their decisions, policy makers often rely on models developed by researchers that are computationally intensive and complex and that frequently run on High Performance Computers (HPC). These decision-support models are not used directly by deciders and the results of these models tend to be presented by experts as a limited number of potential scenarios that would result from a limited number of potential policy choices. Machine learning models such as Deep Gaussian Processes (DGPs) can be used to radically re-define how decision makers can use models by creating a ‘surrogate model’ or ‘emulator’ of the original model. Surrogate models can then be embedded into apps that decision makers can use to directly explore a vast array of policy options corresponding to potential target outcomes (model inversion). To illustrate the mechanism, we give an example application that is envisaged as part of the UK government’s Net Zero strategy. To achieve Net Zero CO2 emissions by 2050, the UK government is considering multiple options that include planting trees to capture carbon. However, the amount of CO2 captured by the trees depends on a large number of factors that include climate conditions, soil type, soil carbon, tree type, ... Depending on these factors, the net balance of carbon removal after planting trees may not necessarily be positive. Hence, choosing the right place to plant the right tree is very important. A decision-helping model has been developed to tackle this problem. For a given policy input, the model outputs its impact in terms of CO2 sequestration, biodiversity and other ecosystem services. We show how DGPs can be used to create a surrogate model of this original afforestation model and how these can be embedded into an R shiny app that can then be directly used by decision makers.

Authors: Bertrand Nortier (University of Exeter); Daniel Williamson (University of Exeter); Mattia Mancini (University of Exeter); Amy Binner (University of Exeter); Brett Day (University of Exeter); Ian Bateman (University of Exeter)

NeurIPS 2023 Machine learning for gap-filling in greenhouse gas emissions databases (Papers Track)

Abstract: Greenhouse Gas (GHG) emissions datasets are often incomplete due to inconsistent reporting and poor transparency. Filling the gaps in these datasets allows for more accurate targeting of strategies to accelerate the reduction of GHG emissions. This study evaluates the potential of machine learning methods to automate the completion of GHG datasets. We use 3 datasets of increasing complexity with 18 different gap-filling methods and provide a guide to which methods are useful in which circumstances. If few dataset features are available, or the gap consists only of a missing time step in a record, then simple interpolation is often the most accurate method and complex models should be avoided. However, if more features are available and the gap involves non-reporting emitters, then machine learning methods can be more accurate than simple extrapolation. Furthermore, the secondary output of feature importance from complex models allows for data collection prioritisation to accelerate the improvement of datasets. Graph based methods are particularly scalable due to the ease of updating predictions given new data and incorporating multimodal data sources. This study can serve as a guide to the community upon which to base ever more integrated frameworks for automated detailed GHG emissions estimations, and implementation guidance is available at https://hackmd.io/@luke-scot/ML-for-GHG-database-completion.

Authors: Luke Cullen (University of Cambridge); Andrea Marinoni (UiT the Arctic University of Norway); Jonathan M Cullen (University of Cambridge)

NeurIPS 2023 Can Reinforcement Learning support policy makers? A preliminary study with Integrated Assessment Models (Papers Track)

Abstract: Governments around the world aspire to ground decision-making on evidence. Many of the foundations of policy making — e.g. sensing patterns that relate to societal needs, developing evidence-based programs, forecasting potential outcomes of policy changes, and monitoring effectiveness of policy programs — have the potential to benefit from the use of large-scale datasets or simulations together with intelligent algorithms. These could, if designed and deployed in a way that is well grounded on scientific evidence, enable a more comprehensive, faster, and rigorous approach to policy making. Integrated Assessment Models (IAMs) are a broad umbrella covering scientific models that attempt to link main features of society and economy with the biosphere into one modelling framework. At present, these systems are probed by policy makers and advisory groups in a hypothesis-driven manner. In this paper, we empirically demonstrate that modern Reinforcement Learning can be used to probe IAMs and explore the space of solutions in a more principled manner. While the implications of our results are modest since the environment is simplistic, we believe that this is a stepping stone towards more ambitious use cases, which could allow for effective exploration of policies and understanding of their consequences and limitations.

Authors: Theodore LM Wolf (Carbon Re); Nantas Nardelli (Carbon Re); John Shawe-Taylor (University College London); Maria Perez-Ortiz (University College London)

NeurIPS 2023 ClimateX: Do LLMs Accurately Assess Human Expert Confidence in Climate Statements? (Papers Track)

Abstract: Evaluating the accuracy of outputs generated by Large Language Models (LLMs) is especially important in the climate science and policy domain. We introduce the Expert Confidence in Climate Statements (ClimateX) dataset, a novel, curated, expert-labeled dataset consisting of 8094 climate statements collected from the latest Intergovernmental Panel on Climate Change (IPCC) reports, labeled with their associated confidence levels. Using this dataset, we show that recent LLMs can classify human expert confidence in climate-related statements, especially in a few-shot learning setting, but with limited (up to 47%) accuracy. Overall, models exhibit consistent and significant over-confidence on low and medium confidence statements. We highlight implications of our results for climate communication, LLM evaluation strategies, and the use of LLMs in information retrieval systems.

Authors: Romain Lacombe (Stanford University); Kerrie Wu (Stanford University); Eddie Dilworth (Stanford University)

NeurIPS 2023 Monitoring Sustainable Global Development Along Shared Socioeconomic Pathways (Proposals Track)

Abstract: Sustainable global development is one of the most prevalent challenges facing the world today, hinging on the equilibrium between socioeconomic growth and environmental sustainability. We propose approaches to monitor and quantify sustainable development along the Shared Socioeconomic Pathways (SSPs), including mathematically derived scoring algorithms and machine learning methods. These integrate socioeconomic and environmental datasets to produce an interpretable metric for SSP alignment. An initial study demonstrates promising results, laying the groundwork for the application of different methods to the monitoring of sustainable global development.

Authors: Michelle Wan (University of Cambridge); Jeff Clark (University of Bristol); Edward Small (Royal Melbourne Institute of Technology); Elena Fillola (University of Bristol); Raul Santos Rodriguez (University of Bristol)

NeurIPS 2023 Understanding Climate Legislation Decisions with Machine Learning (Proposals Track)

Abstract: Effective action is crucial in order to avert climate disaster. Key in enacting change is the swift adoption of climate positive legislation which advocates for climate change mitigation and adaptation. This is because government legislation can result in far-reaching impact, due to the relationships between climate policy, technology, and market forces. To advocate for legislation, current strategies aim to identify potential levers and obstacles, presenting an opportunity for the application of recent advances in machine learning language models. Here we propose a machine learning pipeline to analyse climate legislation, aiming to investigate the feasibility of natural language processing for the classification of climate legislation texts, to predict policy voting outcomes. By providing a model of the decision making process, the proposed pipeline can enhance transparency and aid policy advocates and decision makers in understanding legislative decisions, thereby providing a tool to monitor and understand legislative decisions towards climate positive impact.

Authors: Jeff Clark (University of Bristol); Michelle Wan (University of Cambridge); Raul Santos Rodriguez (University of Bristol)

ICLR 2023 Mining Effective Strategies for Climate Change Communication (Papers Track)

Abstract: With the goal of understanding effective strategies to communicate about climate change, we build interpretable models to rank tweets related to climate change with respect to the engagement they generate. Our models are based on the Bradley-Terry model of pairwise comparison outcomes and use a combination of the tweets’ topic and metadata features to do the ranking. To remove confounding factors related to author popularity and minimise noise, they are trained on pairs of tweets that are from the same author and around the same time period and have a sufficiently large difference in engagement. The models achieve good accuracy on a held-out set of pairs. We show that we can interpret the parameters of the trained model to identify the topic and metadata features that contribute to high engagement. Among other observations, we see that topics related to climate projections, human cost and deaths tend to have low engagement while those related to mitigation and adaptation strategies have high engagement. We hope the insights gained from this study will help craft effective climate communication to promote engagement, thereby lending strength to efforts to tackle climate change.

Authors: Aswin Suresh (EPFL); Lazar Milikic (EPFL); Francis Murray (EPFL); Yurui Zhu (EPFL); Matthias Grossglauser (EPFL)
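
The pairwise ranking idea in the abstract above can be sketched as follows. In a Bradley-Terry model, the probability that item i beats item j is a logistic function of the difference between their scores. The sketch assumes each tweet's score is a linear function of its features and fits the weights by plain gradient ascent on the log-likelihood. The linear parameterisation, the function names, and the optimiser settings are illustrative assumptions, not details taken from the paper.

```python
import math


def win_probability(weights, x_i, x_j):
    """Bradley-Terry probability that item i beats item j,
    with scores assumed linear in the features."""
    score_diff = sum(w * (a - b) for w, a, b in zip(weights, x_i, x_j))
    return 1.0 / (1.0 + math.exp(-score_diff))


def fit_bradley_terry(pairs, n_features, lr=0.1, epochs=500):
    """Fit feature weights from (winner_features, loser_features) pairs.

    Hypothetical helper: gradient ascent on the mean log-likelihood.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for x_win, x_lose in pairs:
            p_win = win_probability(weights, x_win, x_lose)
            # d/dw log p_win = (1 - p_win) * (x_win - x_lose)
            for k in range(n_features):
                grad[k] += (1.0 - p_win) * (x_win[k] - x_lose[k])
        weights = [w + lr * g / len(pairs) for w, g in zip(weights, grad)]
    return weights
```

In this sketch, a positive fitted weight means the feature is associated with winning comparisons, which is the kind of parameter reading the abstract describes.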

ICLR 2023 Robustly modeling the nonlinear impact of climate change on agriculture by combining econometrics and machine learning (Proposals Track)

Abstract: Climate change is expected to have a dramatic impact on agricultural production; however, due to natural complexity, the exact avenues and relative strengths by which this will happen are still unknown. The development of accurate forecasting models is thus of great importance to enable policy makers to design effective interventions. To date, most machine learning methods aimed at tackling this problem lack a consideration of causal structure, thereby making them unreliable for the types of counterfactual analysis necessary when making policy decisions. Econometrics has developed robust techniques for estimating cause-effect relations in time-series, specifically through the use of cointegration analysis and Granger causality. However, these methods are frequently limited in flexibility, especially in the estimation of nonlinear relationships. In this work, we propose to integrate the non-linear function approximators with the robust causal estimation methods to ultimately develop an accurate agricultural forecasting model capable of robust counterfactual analysis. This method would be a valuable new asset for government and industrial stakeholders to understand how climate change impacts agricultural production.

Authors: Benedetta Francesconi (Independent Researcher); Ying-Jung C Deweese (Descartes Labs / Georgia Institute of Technology)

NeurIPS 2022 Forecasting European Ozone Air Pollution With Transformers (Papers Track)

Abstract: Surface ozone is an air pollutant that contributes to hundreds of thousands of premature deaths annually. Accurate short-term ozone forecasts may allow improved policy to reduce the risk to health, such as air quality warnings. However, forecasting ozone is a difficult problem, as surface ozone concentrations are controlled by a number of physical and chemical processes which act on varying timescales. Accounting for these temporal dependencies appropriately is likely to provide more accurate ozone forecasts. We therefore deploy a state-of-the-art transformer-based model, the Temporal Fusion Transformer, trained on observational station data from three European countries. In four-day test forecasts of daily maximum 8-hour ozone, the novel approach is highly skilful (MAE = 4.6 ppb, $R^2$ = 0.82), and generalises well to two European countries unseen during training (MAE = 4.9 ppb, $R^2$ = 0.79). The model outperforms standard machine learning models on our data, and compares favourably to the published performance of other deep learning architectures tested on different data. We illustrate that the model pays attention to physical variables known to control ozone concentrations, and that the attention mechanism allows the model to use relevant days of past ozone concentrations to make accurate forecasts.

Authors: Seb Hickman (University of Cambridge); Paul Griffiths (University of Cambridge); Alex Archibald (University of Cambridge); Peer Nowack (Imperial College London); Elie Alhajjar (USMA)

NeurIPS 2022 Estimating Chicago’s tree cover and canopy height using multi-spectral satellite imagery (Papers Track)

Abstract: Information on urban tree canopies is fundamental to mitigating climate change as well as improving quality of life. Urban tree planting initiatives face a lack of up-to-date data about the horizontal and vertical dimensions of the tree canopy in cities. We present a pipeline that utilizes LiDAR data as ground-truth and then trains a multi-task machine learning model to generate reliable estimates of tree cover and canopy height in urban areas using multi-source multi-spectral satellite imagery for the case study of Chicago.

Authors: John Francis (University College London)

NeurIPS 2022 Climate Policy Tracker: Pipeline for automated analysis of public climate policies (Papers Track)

Abstract: The number of standardized policy documents regarding climate policy and their publication frequency is significantly increasing. The documents are long and tedious for manual analysis, especially for policy experts, lawmakers, and citizens who lack access or domain expertise to utilize data analytics tools. Potential consequences of such a situation include reduced citizen governance and involvement in climate policies and an overall surge in analytics costs, reducing accessibility for the public. In this work, we use a Latent Dirichlet Allocation-based pipeline for the automatic summarization and analysis of 10 years of national energy and climate plans (NECPs) for the period from 2021 to 2030, established by 27 Member States of the European Union. We focus on analyzing policy framing, the language used to describe specific issues, to detect essential nuances in the way governments frame their climate policies and achieve climate goals. The methods leverage topic modeling and clustering for the comparative analysis of policy documents across different countries. This allows for easier integration in potential user-friendly applications for the development of theories and processes of climate policy. This would further lead to better citizen governance and engagement over climate policies and public policy research.

Authors: Artur Żółkowski (Warsaw University of Technology); Mateusz Krzyziński (Warsaw University of Technology); Piotr Wilczyński (Warsaw University of Technology); Stanisław Giziński (University of Warsaw); Emilia Wiśnios (University of Warsaw); Bartosz Pieliński (University of Warsaw); Julian Sienkiewicz (Warsaw University of Technology); Przemysław Biecek (Warsaw University of Technology)

NeurIPS 2022 Topic correlation networks inferred from open-ended survey responses reveal signatures of ideology behind carbon tax opinion (Papers Track)

Abstract: Ideology can often render policy design ineffective by overriding what, at face value, are rational incentives. A timely example is carbon pricing, whose public support is strongly influenced by ideology. As a system of ideas, ideology expresses itself in the way people explain themselves and the world. As an object of study, ideology is then amenable to a generative modelling approach within the text-as-data paradigm. Here, we analyze the structure of ideology underlying carbon tax opinion using topic models. An idea, termed a topic, is operationalized as the fixed set of proportions with which words are used when talking about it. We characterize ideology through the relational structure between topics. To access this latent structure, we use the highly expressive Structural Topic Model to infer topics and the weights with which individual opinions mix topics. We fit the model to a large dataset of open-ended survey responses of Canadians elaborating on their support of or opposition to the tax. We propose and evaluate statistical measures of ideology in our data, such as dimensionality and heterogeneity. Finally, we discuss the implications of the results for transition policy in particular, and of our approach to analyzing ideology for computational social science in general.

Authors: Maximilian Puelma Touzel (Mila)

NeurIPS 2022 Analyzing the global energy discourse with machine learning (Proposals Track)

Abstract: To transform our economy towards net-zero emissions, industrial development of clean energy technologies (CETs) to replace fossil energy technologies (FETs) is crucial. Although the media has great power in influencing consumer behavior and decision making in business and politics, its role in the energy transformation is still underexplored. In this paper, we analyze the global energy discourse via machine learning. For this, we collect a large-scale dataset with ~5 million news articles from seven of the world’s major CO2 emitting countries, covering eight CETs and four FETs. Using machine learning, we then analyze the content of news articles on a highly granular level and along several dimensions, namely relevance (for the energy discourse), context (e.g., costs, regulation, investment), and connotations (e.g., high/increasing vs. low/decreasing costs). By linking empirical discourse patterns to investment and deployment data of CETs and FETs, this study advances the current understanding about the role of the media in the energy transformation. Thereby, it enables businesses, investors, and policy makers to respond more effectively to sensitive topics in the media discourse and leverage windows of opportunity for scaling CETs.

Authors: Malte Toetzke (ETH Zurich); Benedict Probst (ETH Zurich); Yasin Tatar (ETH Zurich); Stefan Feuerriegel (LMU Munich); Volker Hoffmann (ETH Zurich)

NeurIPS 2022 ForestBench: Equitable Benchmarks for Monitoring, Reporting, and Verification of Nature-Based Solutions with Machine Learning (Proposals Track)

Abstract: Restoring ecosystems and reducing deforestation are necessary tools to mitigate the anthropogenic climate crisis. Current measurements of forest carbon stock can be inaccurate, in particular for underrepresented and small-scale forests in the Global South, hindering transparency and accountability in the Monitoring, Reporting, and Verification (MRV) of these ecosystems. There is thus a need for high-quality datasets to properly validate ML-based solutions. To this end, we present ForestBench, which aims to collect and curate geographically-balanced gold-standard datasets of small-scale forest plots in the Global South, by collecting ground-level measurements and visual drone imagery of individual trees. These equitable validation datasets for ML-based MRV of nature-based solutions shall enable assessing the progress of ML models for estimating above-ground biomass, ground cover, and tree species diversity.

Authors: Lucas Czech (Carnegie Institution for Science); Björn Lütjens (MIT); David Dao (ETH Zurich)

NeurIPS 2022 Personalizing Sustainable Agriculture with Causal Machine Learning (Proposals Track) Best Paper: Proposals

Abstract: To fight climate change and accommodate the increasing population, global crop production has to be strengthened. To achieve the "sustainable intensification" of agriculture, transforming it from carbon emitter to carbon sink is a priority, and understanding the environmental impact of agricultural management practices is a fundamental prerequisite to that. At the same time, the global agricultural landscape is deeply heterogeneous, with differences in climate, soil, and land use inducing variations in how agricultural systems respond to farmer actions. The "personalization" of sustainable agriculture with the provision of locally adapted management advice is thus a necessary condition for the efficient uplift of green metrics, and an integral development in imminent policies. Here, we formulate personalized sustainable agriculture as a Conditional Average Treatment Effect estimation task and use Causal Machine Learning for tackling it. Leveraging climate data, land use information and employing Double Machine Learning, we estimate the heterogeneous effect of sustainable practices on the field-level Soil Organic Carbon content in Lithuania. We thus provide a data-driven perspective for targeting sustainable practices and effectively expanding the global carbon sink.

Authors: Georgios Giannarakis (National Observatory of Athens); Vasileios Sitokonstantinou (National Observatory of Athens); Roxanne Suzette Lorilla (National Observatory of Athens); Charalampos Kontoes (National Observatory of Athens)

AAAI FSS 2022 Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N

Abstract: Comprehensive global cooperation is essential to limit global temperature increases while continuing economic development, e.g., reducing severe inequality or achieving long-term economic growth. Achieving long-term cooperation on climate change mitigation with n strategic agents poses a complex game-theoretic problem. For example, agents may negotiate and reach climate agreements, but there is no central authority to enforce adherence to those agreements. Hence, it is critical to design negotiation and agreement frameworks that foster cooperation, allow all agents to meet their individual policy objectives, and incentivize long-term adherence. This is an interdisciplinary challenge that calls for collaboration between researchers in machine learning, economics, climate science, law, policy, ethics, and other fields. In particular, we argue that machine learning is a critical tool to address the complexity of this domain. To facilitate this research, here we introduce RICE-N, a multi-region integrated assessment model that simulates the global climate and economy, and which can be used to design and evaluate the strategic outcomes for different negotiation and agreement frameworks. We also describe how to use multi-agent reinforcement learning to train rational agents using RICE-N. This framework underpins AI for Global Climate Cooperation, a working group collaboration and competition on climate negotiation and agreement design. Here, we invite the scientific community to design and evaluate their solutions using RICE-N, machine learning, economic intuition, and other domain knowledge. More information can be found on www.ai4climatecoop.org.

Authors: Tianyu Zhang (Université de Montréal, MILA); Andrew Williams (Université de Montréal, MILA); Soham Phade (Salesforce Research); Sunil Srinivasa (Salesforce Research); Yang Zhang (MILA); Prateek Gupta (MILA, University of Oxford, The Alan Turing Institute); Yoshua Bengio (Université de Montréal, MILA, CIFAR); Stephan Zheng (Salesforce Research)