Extracting Structured Policy Information from Climate Action Plans (Proposals Track)

Tom Corringham (Scripps Institution of Oceanography); Nupoor Gandhi (Carnegie Mellon); Bryan Flores (Independent Researcher); Emma Strubell (Carnegie Mellon); Sireesh Gururaja (Carnegie Mellon); Tristan Romanov (Independent Researcher); Jacob Dunafon (Independent Researcher)

Natural Language Processing; Behavioral and Social Science; Cities & Urban Planning

Abstract

Most of the world’s climate action policies are planned and implemented at the local level, through city and regional climate action plans (CAPs). Assessing global progress in climate mitigation and adaptation, for example in forthcoming assessments such as the 2027 IPCC Special Report on Climate Change and Cities, requires systematic ways to track and analyze these plans. However, CAPs are dispersed across thousands of jurisdictions, vary widely in structure and format, and are often difficult to access. We propose a standard CAP ontology and a retrieval- and extraction-oriented pipeline that leverages recent advances in natural language processing (NLP) and information retrieval (IR) to transform CAPs into a structured, verifiable dataset of climate policies. As a case study, we focus on California, where more than 260 local governments have published one or more CAPs since 2006. We develop an annotated benchmark dataset of 17 San Diego County CAPs with over 1,800 extracted policies and associated attributes. Unlike prior efforts that rely on small annotated corpora or industry-specific disclosures, our system explicitly grounds every extracted element in its underlying PDF, ensuring transparency and reducing hallucination in the resulting dataset. Overcoming the dispersion, heterogeneity, and inaccessibility of CAPs will enable large-scale comparative analyses across jurisdictions worldwide, supporting policymakers, sustainability officers, and hazard managers, and accelerating climate adaptation and mitigation efforts.
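
To illustrate what grounding an extracted element in its underlying PDF could look like, the following is a minimal Python sketch and not the system described in the abstract. The abstract does not specify the ontology or the pipeline interfaces, so the `GroundedPolicy` record, its fields, the `is_grounded` check, and the sample page text are all hypothetical. The sketch assumes page text has already been extracted from the PDF, and it accepts an extracted policy only if its evidence span appears on the cited page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroundedPolicy:
    """Hypothetical record: one extracted policy plus its source evidence."""
    jurisdiction: str
    policy_text: str  # short normalized statement of the policy
    page: int         # 1-indexed page in the source PDF
    evidence: str     # span from the PDF that supports the extraction


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so PDF line breaks do not block matching."""
    return " ".join(text.split()).lower()


def is_grounded(policy: GroundedPolicy, pages: list[str]) -> bool:
    """Return True if the evidence span occurs on the cited page."""
    if not 1 <= policy.page <= len(pages):
        return False
    evidence = normalize(policy.evidence)
    return bool(evidence) and evidence in normalize(pages[policy.page - 1])


# Illustrative page text (invented for this sketch, not taken from a real CAP).
pages = [
    "Chapter 3: Strategies\nInstall 500 public EV charging\nstations by 2030.",
    "Chapter 4: Monitoring",
]

ok = GroundedPolicy(
    jurisdiction="Example City",
    policy_text="Expand EV charging",
    page=1,
    evidence="Install 500 public EV charging stations by 2030.",
)
bad = GroundedPolicy(
    jurisdiction="Example City",
    policy_text="Plant street trees",
    page=2,
    evidence="Plant 10,000 trees by 2035.",
)

# Keep only records whose evidence can be located in the source text.
verified = [p for p in (ok, bad) if is_grounded(p, pages)]
```

A check of this kind rejects any record whose evidence cannot be located in the source, which is one simple way an extraction pipeline could reduce hallucinated entries. A real system would likely need fuzzier matching to cope with OCR noise and hyphenation.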