CaML: Carbon Footprinting of Products with Zero-Shot Semantic Text Similarity (Papers Track)

Bharathan Balaji (Amazon); Venkata Sai Gargeya Vunnava (Amazon); Geoffrey Guest (Amazon); Jared Kramer (Amazon)

Natural Language Processing; Supply Chains


Estimating the embodied carbon in products is a key step towards understanding their impact and undertaking mitigation actions. Precise carbon attribution is challenging at scale, requiring both domain expertise and granular supply chain data. As a first-order approximation, standard reports use Economic Input-Output based Life Cycle Assessment (EIO-LCA), which estimates carbon emissions per dollar at the industry-sector level using transactions between different parts of the economy. For EIO-LCA, an expert needs to map each product to one of upwards of 1000 potential industry sectors. We present CaML, an algorithm that automates EIO-LCA through semantic text similarity matching, leveraging the text descriptions of the product and the industry sector. CaML outperforms the previous manually intensive method, yielding a mean absolute percentage error (MAPE) of 22% with no domain labels.
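
The matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: a bag-of-words cosine similarity stands in for whatever pretrained text-similarity model a zero-shot system would use, and the sector names, emission factors, and function names (`match_sector`, `estimate_emissions`, `mape`) are hypothetical placeholders chosen for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical sector descriptions mapped to emission factors (kg CO2e per dollar).
# These are illustrative placeholders, not values from the paper or any EIO-LCA dataset.
SECTORS = {
    "Soft drink and ice manufacturing": 0.5,
    "Footwear manufacturing": 0.3,
    "Computer and peripheral equipment manufacturing": 0.2,
}


def embed(text):
    """Toy text representation: lowercase word counts (stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_sector(product_text):
    """Return the sector whose description is most similar to the product text."""
    product_vec = embed(product_text)
    return max(SECTORS, key=lambda s: cosine(product_vec, embed(s)))


def estimate_emissions(product_text, price_usd):
    """EIO-LCA style estimate: matched sector's emissions per dollar times the price."""
    sector = match_sector(product_text)
    return sector, SECTORS[sector] * price_usd


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    print(estimate_emissions("Leather running shoes, men's footwear", 100.0))
    print(mape([100.0, 200.0], [90.0, 220.0]))
```

No labeled product-to-sector pairs are needed because the mapping relies only on the two text descriptions; the per-dollar factor of the matched sector then scales with the product price. A word-count representation only matches on shared words, so a real system would need a model that captures semantic similarity.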