AI Agents For Decision-Making in Climate Governance Using Policy Benchmarks (Proposals Track)
Shan Shan (Zhejiang University)
Abstract
Climate change governance requires navigating complex policy documents, including treaties, regulations, and socio-political frameworks. Understanding these texts is essential for evidence-based decision-making, but it remains challenging because of their complexity and domain specificity. This study explores the potential of AI agents to support policy reasoning and decision-making through structured evaluation on climate policy benchmarks, with a focus on dynamic governance scenarios. Drawing on global frameworks such as the UN Sustainable Development Goals (UNSDGs) and IPCC assessment pathways, it evaluates agents on datasets such as Climate-FEVER (factual claim verification), LegalBench (legal reasoning), and PolicyQA (policy question answering). Target tasks include treaty interpretation, socio-political analysis, adaptation policy reasoning, and scenario-based planning. The study introduces a hybrid evaluation framework that combines expert assessment with interdisciplinary feedback to systematically benchmark AI agents’ performance in climate governance and to identify their strengths, limitations, and potential for real-world support. The aim is to turn AI and climate governance from a tale of two systems into a tale of collaboration.