Advancing Multimodal Fact-Checking Against Climate Misinformation: A Benchmark Dataset and Comparison of Lightweight Models (Papers Track)
Omar El Baf (IRD); Quentin Senatore (IRD); Amira Mouakher (Université de Perpignan); Laure Berti-Equille (IRD)
Abstract
This paper proposes TIGER, a high-quality benchmark dataset for evaluating multimodal fact-checking models that verify the veracity of claims combining text and images on topics related to climate change. Unlike previous datasets, which are highly imbalanced, do not focus on climate misinformation, and are often unimodal, TIGER includes curated claims and scripts to augment the dataset with information extracted from IPCC reports, claims generated by ChatGPT, and related web-scraped images. We also propose M4FC, a set of lightweight MLP-based models with different textual and visual encoders, and compare them against other ML models. Our models outperform strong baselines such as Random Forests and Gradient Boosting by up to +1.5% in accuracy and +1.7% in F1-score on the TIGER dataset. The key advantage of MLP-based models for multimodal fact-checking is their simplicity and flexibility: they achieve competitive performance with lower computational cost and carbon footprint than heavier architectures.
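The abstract does not specify M4FC's exact architecture, so the following is only a minimal sketch of the general pattern it describes: a small MLP classifier over concatenated embeddings from separate textual and visual encoders. The encoder choices (a Sentence-BERT-style text encoder, a CLIP-style image encoder) and all dimensions below are illustrative assumptions, not the paper's specification.

```python
# Minimal sketch of a lightweight multimodal MLP classifier. Assumes the
# claim text and image have already been pre-encoded into fixed-size
# embedding vectors (e.g., 384-d text and 512-d image embeddings; these
# dimensions are illustrative assumptions, not from the paper).
import torch
import torch.nn as nn

class MultimodalMLP(nn.Module):
    def __init__(self, text_dim=384, image_dim=512, hidden_dim=256, num_classes=2):
        super().__init__()
        # Fuse the two modalities by concatenation, then classify with a
        # small two-layer MLP.
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, text_emb, image_emb):
        fused = torch.cat([text_emb, image_emb], dim=-1)
        return self.classifier(fused)

# Usage on a dummy batch of 8 claim/image embedding pairs.
model = MultimodalMLP()
logits = model(torch.randn(8, 384), torch.randn(8, 512))
print(logits.shape)  # torch.Size([8, 2])
```

Because the MLP operates only on precomputed embeddings, training and inference are cheap relative to fine-tuning the encoders themselves, which is consistent with the lower computational cost and carbon footprint the abstract highlights.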