ExioML: Eco-economic dataset for Machine Learning in Global Sectoral Sustainability (Papers Track)

Yanming Guo (University of Sydney)

Climate Finance & Economics


Environmentally Extended Multi-Regional Input-Output analysis is the predominant framework in Ecological Economics for analysing the environmental impact of economic activities. This paper introduces the novel ExioML dataset as the first Machine Learning benchmark dataset for sustainability analysis. We open-source the ExioML data and development toolkit to lower barriers and accelerate cooperation between Machine Learning and Ecological Economics research. We evaluate the usability of the proposed dataset on a crucial greenhouse gas emission regression task. We compare the performance of traditional shallow models against deep models, leveraging a diverse factor accounting table and incorporating multiple modalities of categorical and numerical features. Our findings reveal that deep and ensemble models achieve mean squared errors below 0.25 and can serve as a baseline for future machine learning research. Through ExioML, we aim to foster precise ML predictions and modelling to support climate action and sustainable investment decisions. The data and code are available at: https://github.com/Yvnminc/ExioML
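As a rough illustration of the evaluation setup described above, the following minimal sketch combines categorical and numerical features into one feature vector and scores predictions by mean squared error. The column names (`region`, `sector`, `value_added`, `employment`, `emissions`), the toy values, and the mean-value baseline are all assumptions made for illustration. They are not taken from the ExioML dataset or its toolkit.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 list over a fixed category order."""
    return [1.0 if value == c else 0.0 for c in categories]


def build_features(row, regions, sectors):
    """Concatenate one-hot categorical features with numerical features."""
    return (
        one_hot(row["region"], regions)
        + one_hot(row["sector"], sectors)
        + [row["value_added"], row["employment"]]
    )


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy rows; the field names and values are illustrative only.
rows = [
    {"region": "AU", "sector": "Electricity", "value_added": 1.2,
     "employment": 0.4, "emissions": 3.1},
    {"region": "AU", "sector": "Agriculture", "value_added": 0.8,
     "employment": 0.9, "emissions": 1.4},
    {"region": "DE", "sector": "Electricity", "value_added": 1.5,
     "employment": 0.5, "emissions": 2.7},
    {"region": "DE", "sector": "Agriculture", "value_added": 0.6,
     "employment": 0.7, "emissions": 1.0},
]

regions = sorted({r["region"] for r in rows})
sectors = sorted({r["sector"] for r in rows})

X = [build_features(r, regions, sectors) for r in rows]
y = [r["emissions"] for r in rows]

# Trivial baseline (an assumption, not a model from the paper):
# predict the mean target for every row.
baseline = sum(y) / len(y)
mse = mean_squared_error(y, [baseline] * len(y))
print(f"feature width: {len(X[0])}, baseline MSE: {mse:.4f}")
```

A shallow or deep regressor would take the place of the mean baseline. The feature construction and the metric stay the same, which is what makes results comparable across models.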