NADBenchmarks - a compilation of Benchmark Datasets for Machine Learning Tasks related to Natural Disasters

Adiba Proma (University of Rochester), Md Saiful Islam (University of Rochester), Stela Ciko (University of Rochester), Raiyan Abdul Baten (University of Rochester) and Ehsan Hoque (University of Rochester)

Disaster Management and Relief Data Mining

Abstract

Climate change has increased the intensity, frequency, and duration of extreme weather events and natural disasters across the world. While the growing volume of data on natural disasters widens the scope for machine learning (ML) in this field, progress has been relatively slow. One bottleneck is the lack of benchmark datasets that would allow ML researchers to quantify their progress against a standard metric. The objective of this short paper is to explore the state of benchmark datasets for ML tasks related to natural disasters, categorizing the datasets according to the disaster management cycle. We compile a list of existing benchmark datasets that have been introduced in the past five years. We propose a web platform where researchers can search for benchmark datasets in this domain, and we develop a preliminary version of such a platform using our compiled list. This paper is intended to help researchers find benchmark datasets on which to train their ML models, and to provide general directions on topics where they can contribute new benchmark datasets.