SHRUG-FM: Reliability-Aware Foundation Models for Earth Observation (Papers Track)

Kai Cohrs (University of Valencia); Maria Gonzalez-Calabuig (University of Valencia); Vishal Nedungadi (Wageningen University and Research); Zuzanna Osika (Delft University of Technology); Ruben Cartuyvels (European Space Agency); Steffen Knoblauch (Heidelberg University); Joppe Massant (Ghent University); Shruti Nath (University of Oxford); Patrick Ebel (European Space Agency); Vasileios Sitokonstantinou (University of Valencia)

Computer Vision & Remote Sensing; Disaster Management and Relief; Earth Observation & Monitoring; Uncertainty Quantification & Robustness

Abstract

Geospatial foundation models (GFMs) for Earth observation often fail to perform reliably in environments underrepresented during pretraining. We introduce SHRUG-FM, a framework for reliability-aware prediction that integrates three complementary signals: out-of-distribution (OOD) detection in the input space, OOD detection in the embedding space, and task-specific predictive uncertainty. Applied to burn scar segmentation, SHRUG-FM shows that OOD scores correlate with lower performance in specific environmental conditions, while uncertainty-based flags help discard many poorly performing predictions. Linking these flags to land cover attributes from HydroATLAS shows that failures are not random but concentrated in certain geographies, such as low-elevation zones and large river basins, likely because these regions are underrepresented in the pretraining data. SHRUG-FM provides a pathway toward safer and more interpretable deployment of GFMs in climate-sensitive applications, helping bridge the gap between benchmark performance and real-world reliability.
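To make the idea of combining the three signals concrete, the following is a minimal sketch and not the authors' implementation. It assumes each sample already has a scalar input-space OOD score, a scalar embedding-space OOD score, and a per-pixel burn probability map. The function names, the use of mean binary entropy as the uncertainty measure, the thresholds, and the "flag if any signal fires" rule are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def predictive_entropy(prob_burned):
    """Mean per-pixel binary entropy for each sample (assumed uncertainty proxy).

    prob_burned: array of shape (n_samples, height, width) with values in [0, 1].
    """
    p = np.clip(np.asarray(prob_burned, dtype=float), 1e-7, 1.0 - 1e-7)
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    return entropy.mean(axis=(-2, -1))


def reliability_flags(input_ood, embed_ood, uncertainty,
                      input_thr, embed_thr, unc_thr):
    """Return per-sample boolean flags; True marks a prediction as unreliable.

    All thresholds are hypothetical and would need to be tuned on held-out data.
    """
    flags = {
        "input_ood": np.asarray(input_ood, dtype=float) > input_thr,
        "embed_ood": np.asarray(embed_ood, dtype=float) > embed_thr,
        "uncertain": np.asarray(uncertainty, dtype=float) > unc_thr,
    }
    # Assumed combination rule: abstain if any of the three signals fires.
    flags["any"] = flags["input_ood"] | flags["embed_ood"] | flags["uncertain"]
    return flags


# Toy usage with made-up scores for three samples.
example_flags = reliability_flags(
    input_ood=[0.1, 0.9, 0.2],
    embed_ood=[0.2, 0.1, 0.3],
    uncertainty=[0.05, 0.10, 0.65],
    input_thr=0.5, embed_thr=0.5, unc_thr=0.5,
)
```

Keeping the three flags separate, rather than only the combined one, is what would allow flagged samples to be cross-referenced against external attributes such as those in HydroATLAS to see where each type of failure concentrates.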