Ecosystems & Biodiversity

Agile Modeling for Bioacoustic Monitoring

Jenny Hamer, Rob Laber, and Tom Denton, NeurIPS 2023
Introduction to Camera Trap Recognition with Deep Learning

Zhongqi Miao, CCAI Summer School 2023

Blog Posts

Introducing The ForestBench Project

Lucas Czech, Björn Lütjens, and David Dao, November 28, 2023
Mapping Species From Crowdsourced Data Using Machine Learning

Elijah Cole and Oisin Mac Aodha, September 05, 2023
Using Machine Learning to Integrate Mangrove Restoration with Sustainable Aquaculture Intensification

Garrett Goto and Joshua Cortez, July 21, 2023
Scaling Climate Smart Shrimp in Southeast Asia using GIS and Computer Vision

Anica Araneta, June 16, 2022

Machine Learning in Robotics to Scale Climate Action

Serena Mou (Queensland University of Technology), Marius Wiggert (University of California Berkeley), November 11, 2022
Machine Learning for Oceans

Andrea Rivera-Sosa (The Coral Reef Alliance), Oliver Zielinski (DFKI/University of Oldenburg), April 07, 2022
Machine learning for monitoring biodiversity

Sara Beery, Dave Thau, January 21, 2022

Data Extraction and Modelling from Plant Trait Literature

Richard Reeve (University of Glasgow); Neil A. Brummitt (Natural History Museum); Claire L. Harris (Biomathematics and Statistics Scotland); Ana Claudia Araujo (Natural History Museum); Ben Scott (Natural History Museum); Christina Cobbold (University of Glasgow); Glenn Marion (Biomathematics & Statistics Scotland), 2023
Mitigating Climate Change Impacts on Biodiversity via Machine Learning Powered Assessment

Oisin Mac Aodha (University of Edinburgh); Scott Loarie (iNaturalist); Thomas Brooks (IUCN), 2022
Using Machine Learning and Earth Observation Data to Identify Aquaculture Sites with High Potential for Production Intensification and Mangrove Restoration in Southeast Asia

Jack Kittinger (Arizona State University); Dane Klinger (Conservation International); Emily Corwin (Conservation International); Issa Tingzon (Thinking Machines - Philippines); Renavell Flores (Thinking Machines - Philippines); Joshua Cortez (Thinking Machines - Philippines); Pia Faustino (Thinking Machines - Philippines), 2022
ForestBench: Equitable Benchmarks for Monitoring Verification and Reporting of Nature-Based Solutions with Machine Learning

Dava Newman (MIT); Moises Exposito-Alonso (Carnegie Institution for Science); Lucas Czech (Carnegie Institution for Science); David Dao (ETH Zurich); Björn Lütjens (MIT); Lauren Gillespie (Stanford University); Hilary Hao (Climate Reality Project); Andrew Cottam (Restor), 2022

NeurIPS 2023
- Keynote: Tanya Berger-Wolf
ICLR 2023
- Keynote: Bistra Dilkina (University of Southern California)
ICLR 2020
- April 26: Main Workshop
  - Ciira wa Maina: Climate Change - The price of "progress"? Exploring AI Solutions (Invited talk)
  - Dan Morris: Climate, biodiversity, and land: using ML to protect and restore ecosystems (Invited talk)
Summer School 2024
- Day 3 - AI for Biodiversity and Ecosystems - June 26, 2024
Summer School 2023
- Day 6 - AI for Biodiversity and Ecosystems - July 10, 2023

Venue	Title
ICLR 2025	Towards the Curation of Environment-related Knowledge Graphs: Fine-tuning General-domain Language Models for Biodiversity Named Entity Recognition (Papers Track) Abstract: The availability of climate data fuels timely science-based climate actions. Providing policymakers and regulators with easy-to-digest, structured climate data, e.g., in the form of a knowledge graph, is critical to mitigating the adverse effects of climate change on the natural environment. Natural language processing (NLP) applications that employ Named Entity Recognition (NER) systems can aid in uncovering information hidden in millions of textual documents. In this paper, we evaluated the NER performance of transformer-based Bidirectional Encoder Representations from Transformers (BERT) models that were pre-trained on general-domain data. We fine-tuned BERT-based models on the COPIOUS dataset for the specialist task of biodiversity NER. Our experiments showed that our DeBERTa NER model demonstrated the best performance, obtaining a micro-averaged F1-score of 84.18% based on entity-level evaluation. We employed our DeBERTa NER model in a biodiversity Information Extraction (IE) pipeline and applied it to the forestry compendium of the Centre for Agricultural and Biosciences International (CABI) Digital Library. We demonstrate that the pipeline enables the extraction of structured information on reproductive conditions and habitats of tree species. Authors: Geilah Tabanao (University of the Philippines Diliman); Andrew Miguel Pagdanganan (University of the Philippines Diliman); Riza Batista-Navarro (University of Manchester); Roselyn Gabud (University of the Philippines Diliman)
ICLR 2025	Climplicit: Climatic Implicit Embeddings for Global Ecological Tasks (Papers Track) Abstract: Deep learning on climatic data holds potential for macroecological applications. However, its adoption remains limited among scientists outside the deep learning community due to storage, compute, and technical expertise barriers. To address this, we introduce Climplicit, a spatio-temporal geolocation encoder pretrained to generate implicit climatic representations anywhere on Earth. By bypassing the need to download raw climatic rasters and train feature extractors, our model uses 3500x less disk space and significantly reduces computational needs for downstream tasks. We evaluate our Climplicit embeddings on biome classification, species distribution modeling, and plant trait regression. We find that single-layer probing of our Climplicit embeddings consistently performs better than or on par with training a model from scratch on downstream tasks and overall better than alternative geolocation encoding models. Authors: Johannes Dollinger (University of Zurich); Damien Robert (University of Zurich); Elena Plekhanova (Swiss Federal Research Institute WSL); Lukas Drees (University of Zurich); Jan Dirk Wegner (University of Zurich)
ICLR 2025	Lake Water Temperature Modeling Using Physics-Informed Neural Networks (Papers Track) Abstract: Assessing water quality in bodies of water is important in evaluating the effects of climate change and its anthropogenic impacts. Such assessments often require good models of key indices such as water temperature, pH, or oxygen levels. In this work, we investigate time series models for lake water temperatures at multiple depths and develop a physics-informed neural network based on Koopman embeddings and LSTM that is capable of forecasting water temperatures in the long term. Experimental results show that our model achieves good performance and significantly outperforms the conventional LSTM model for this time series forecasting problem. Authors: Trieu Vo (Florida International University); Cuong Nguyen (Durham University); Dongsheng Luo (Florida International University); Leonardo Bobadilla (Florida International University)
ICLR 2025	Large Language Models for Monitoring Dataset Mentions in Climate Research (Papers Track) Abstract: Effective climate change research relies on diverse datasets to inform mitigation and adaptation strategies and policies. However, the ways these datasets are cited, used, and distributed remain poorly understood. This paper presents a machine learning framework that automates the detection and classification of dataset mentions in climate research papers. Leveraging large language models (LLMs), we generate a weakly supervised dataset through zero-shot extraction, quality assessment via an LLM-as-a-Judge, and refinement by a reasoning agent. The Phi-3.5-mini instruct model is pre-fine-tuned on this dataset, followed by fine-tuning on a smaller manually annotated subset to specialize in extracting data mentions. At inference, a ModernBERT-based classifier filters for dataset mentions, optimizing computational efficiency. Evaluated on a held-out manually annotated sample, our fine-tuned model outperforms NuExtract-v1.5 and GLiNER-large-v2.1 in dataset extraction accuracy. As a framework for monitoring dataset mentions in research papers, this approach enhances transparency, identifies data gaps, and enables researchers, funders, and policymakers to improve data discoverability and usage, leading to more informed decision-making. Authors: Aivin Solatorio (The World Bank); Rafael Macalaba (The World Bank); James Liounis (The World Bank)
ICLR 2025	Heterogenous graph neural networks for species distribution modeling (Papers Track) Abstract: Species distribution models (SDMs) are necessary for measuring and predicting occurrences and habitat suitability of species and their relationship with environmental factors. We introduce a novel presence-only SDM with graph neural networks (GNN). In our model, species and locations are treated as two distinct node sets, and the learning task is predicting detection records as the edges that connect locations to species. Using GNN for SDM allows us to model fine-grained interactions between species and the environment. We evaluate the potential of this methodology on the six-region dataset compiled by the National Center for Ecological Analysis and Synthesis (NCEAS) for benchmarking SDMs. For each of the regions, the heterogeneous GNN model is comparable to or outperforms previously-benchmarked single-species SDMs as well as a feed-forward neural network baseline model. Authors: Lauren Harrell (Google Research); Christine Kaeser-Chen (Google DeepMind); Burcu Karagol Ayan (Google DeepMind); Keith Anderson (Google DeepMind); Michelangelo Conserva (Google Research); Elise Kleeman (Google Research); Maxim Neumann (Google DeepMind); Matt Overlan (Google DeepMind); Melissa Chapman (Google Research); Drew Purves (Google DeepMind)
ICLR 2025	SuoiAI: Building a Dataset for Aquatic Invertebrates in Vietnam (Proposals Track) Abstract: Understanding and monitoring aquatic biodiversity is critical for ecological health and conservation efforts. This paper proposes SuoiAI, an end-to-end pipeline for building a dataset of aquatic invertebrates in Vietnam and employing machine learning (ML) techniques for species classification. We outline the methods for data collection, annotation, and model training, focusing on reducing annotation effort through semi-supervised learning and leveraging state-of-the-art object detection and classification models. Our approach aims to overcome challenges such as data scarcity, fine-grained classification, and deployment in diverse environmental conditions. Authors: Minh Teu Vo Thanh (NUOC SOLUTIONS); Lakshay Sharma (Microsoft); Tuan Dinh (NUOC SOLUTIONS); Khuong Dinh (University of Oslo); Trang Nguyen (Bowdoin); Trung Phan (Fulbright University Vietnam); Minh Do (NUOC SOLUTIONS); Duong Vu (Royal Netherlands Academy of Arts and Sciences)
NeurIPS 2024	A Deep Learning Approach to the Automated Segmentation of Bird Vocalizations from Weakly Labeled Crowd-sourced Audio (Papers Track) Abstract: Ecologists interested in monitoring the effects caused by climate change are increasingly turning to passive acoustic monitoring, the practice of placing autonomous audio recording units in ecosystems to monitor species richness and occupancy via species calls. However, identifying species calls in large datasets by hand is an expensive task, leading to a reliance on machine learning models. Due to a lack of annotated datasets of soundscape recordings, these models are often trained on large databases of community-created focal recordings. A challenge of training on such data is that clips are given a "weak label," a single label that represents the whole clip. This includes segments that only have background noise but are labeled as calls in the training data, reducing model performance. Heuristic methods exist to convert clip-level labels to "strong" call-specific labels, where the label tightly bounds the temporal length of the call and better identifies bird vocalizations. Our work improves on the current weak-to-strong labeling method used on the training data for BirdNET, the current most popular model for audio species classification. We utilize an existing RNN-CNN hybrid, resulting in a precision improvement of 12% (going to 90% precision) against our new strongly hand-labeled dataset of Peruvian bird species. Authors: Jacob Ayers (Engineers for Exploration at UCSD); Sean Perry (University of California San Diego); Samantha Prestrelski (UC San Diego); Tianqi Zhang (Engineers for Exploration); Ludwig von Schoenfeldt (University of California San Diego); Mugen Blue (UC Merced); Gabriel Steinberg (Demining Research Community); Mathias Tobler (San Diego Zoo Wildlife Alliance); Ian Ingram (San Diego Zoo Wildlife Alliance); Curt Schurgers (UC San Diego); Ryan Kastner (University of California San Diego)
NeurIPS 2024	AI-Driven Predictive Modeling of PFAS Contamination in Aquatic Ecosystems: Exploring A Geospatial Approach (Papers Track) Abstract: Per- and polyfluoroalkyl substances (PFAS), a class of synthetic fluorinated compounds termed “forever chemicals”, have garnered significant attention due to their persistence, widespread environmental presence, bioaccumulative properties, and associated risks for human health. Their presence in aquatic ecosystems highlights the link between human activity and the hydrological cycle. They also disrupt aquatic life, interfere with gas exchange, and disturb the carbon cycle, contributing to greenhouse gas emissions and exacerbating climate change. Federal agencies, state governments and non-government research and public interest organizations have emphasized the need for documenting the sites and the extent of PFAS contamination. However, the time-consuming and expensive nature of data collection and analysis poses challenges. It hinders the rapid identification of locations at high risk of PFAS contamination, which may then require further sampling or remediation. To address this data limitation, our study leverages a novel geospatial dataset, machine learning models including frameworks such as Random Forest, IBM-NASA's Prithvi and UNet, and geospatial analysis to predict regions with high PFAS concentrations in surface water. Using fish data from the National Rivers and Streams Assessment (NRSA) dataset by the Environmental Protection Agency (EPA), our analysis suggests the potential value of machine learning based models for targeted deployment of sampling investigations and remediation efforts. Authors: Jowaria Khan (University of Michigan); David Andrews (Environmental Working Group); Kaley Beins (Environmental Working Group); Sydney Evans (Environmental Working Group); Alexa Friedman (Environmental Working Group); Elizabeth Bondi-Kelly (MIT)
NeurIPS 2024	Harnessing AI for Wildfire Defense: An approach to Predict and Mitigate Global Fire Risk (Papers Track) Abstract: Wildfires pose a critical threat to wildlife, economies, properties, and human lives globally, making accurate risk assessment essential for effective management and mitigation. This study introduces a novel machine learning-based approach utilizing a Convolutional Neural Network (CNN) to evaluate wildfire risks across diverse ecosystems. Leveraging a comprehensive dataset of remote-sensed variables (including topography, vegetation health indicators, and climatic conditions), our model operates at a spatial resolution of 1000 meters per pixel, providing enhanced precision in predicting wildfire occurrences. The CNN outperforms state-of-the-art models, achieving a fire detection ratio of 0.82 and a no-fire detection ratio of 0.87. The results demonstrate that most dataset variables are crucial for accurate risk assessment, although some are non-essential. By integrating data from regions around the globe, this study underscores the feasibility and effectiveness of implementing globally scalable wildfire prediction tools. Authors: Hassan Ashfaq (Ghulam Ishaq Khan Institute of Engineering Sciences and Technology)
NeurIPS 2024	Classification of Snow Depth Measurements for tracking plant phenological shifts in Alpine regions (Papers Track) Abstract: Ground-based snow depth measurements are often realized using ultrasonic or laser technologies, which by their nature measure the height of any underlying object, whether it is snow or vegetation in snow-free periods. We propose a machine learning approach to the automated classification of snow depth measurements into a snow cover class and a class corresponding to everything else, which takes into account both the temporal context and the dependencies between snow depth and other sensor measurements. Through a series of experiments we demonstrate that our approach simplifies the detection of seasonal snowmelt and corresponding onset of plant growth, which we used to assess climate-change related phenological shifts in otherwise rather poorly monitored high alpine regions. Authors: Jan Svoboda (WSL Institute for Snow and Avalanche Research SLF); Michael Zehnder (WSL Institute for Snow and Avalanche Research SLF); Marc Ruesch (WSL Institute for Snow and Avalanche Research SLF); David Liechti (WSL Institute for Snow and Avalanche Research SLF); Corinne Jones (Swiss Data Science Center); Michele Volpi (Swiss Data Science Center, ETH Zurich); Christian Rixen (WSL Institute for Snow and Avalanche Research SLF); Jürg Schweizer (WSL Institute for Snow and Avalanche Research SLF)
NeurIPS 2024	Adaptive Policy Regularization for Offline-to-Online Reinforcement Learning in HVAC Control (Papers Track) Abstract: Reinforcement learning (RL)-based control methods have been extensively studied to improve building heating, ventilation, and air conditioning (HVAC) efficiency. Data-driven approaches demonstrate better transferability and scalability, making them useful in real-world applications. Most prior works focus on online learning requiring simulators or models of environment dynamics. However, transferring thermal simulators between environments is inefficient. We build on recent works that employ offline training on static datasets from unknown policies. Because pure offline RL is constrained by the replay buffer's distribution, we propose using offline-to-online RL to enhance pre-trained offline models through online adaptation to distribution shifts. We show that direct online fine-tuning deteriorates performance on offline policies. To address this, we propose automatically tuning the actor's regularization during training to optimize the exploration-exploitation tradeoff. Specifically, we leverage simple moving averages of mean Q-values sampled throughout training. Simulation experiments demonstrate our method outperforms state-of-the-art approaches under various conditions, improving performance by 32.9% and enhancing pre-trained models' capabilities online. Authors: Hsin-Yu Liu (University of California San Diego); Bharathan Balaji (Amazon); Rajesh Gupta (UC San Diego); Dezhi Hong (Amazon)
NeurIPS 2024	DivShift: Exploring Domain-Specific Distribution Shift in Large-Scale, Volunteer-Collected Biodiversity Datasets (Papers Track) Abstract: Climate change is negatively impacting the world's biodiversity. To build automated systems to monitor these negative biodiversity impacts, large-scale, volunteer-collected datasets like iNaturalist are built from community-identified, natural imagery. However, such volunteer-based data are opportunistic and lack a structured sampling strategy, resulting in geographic, temporal, observation quality, and socioeconomic biases that stymie uptake of these models for downstream biodiversity monitoring tasks. Here we introduce DivShift North American West Coast (DivShift-NAWC), a curated dataset of almost 8 million iNaturalist plant images across the western coast of North America, for exploring the effects of these biases on deep learning model performance. We compare model performance across four known biases and observe that they indeed confound model performance. We suggest practical strategies for curating datasets to train deep learning models for monitoring climate change's impacts on the world's biodiversity. Authors: Elena Sierra (Stanford University); Lauren Gillespie (Stanford University); Salim Soltani (University of Freiburg); Moisés Expósito-Alonso (University of California, Berkeley); Teja Kattenborn (University of Freiburg)
NeurIPS 2024	Wildflower Monitoring with Expert-annotated Images and Flowering Phenology (Papers Track) Abstract: Understanding biodiversity trends is essential for preservation policy planning, and advanced computer vision solutions now enable large-scale automated monitoring for many biodiversity use cases. Wildflower monitoring, in particular, presents unique challenges. Visual similarities in shape and color may exist between different species, while flowers within a species may have significant visual differences. Moreover, flowers follow a growth cycle and look distinctly different over the year, while different species flower at different times of the year. With access to flowering phenology, more accurate predictions may be made. We propose a novel multi-modal wildflower monitoring task to better identify species, leveraging both expert-annotated wildflower images and flowering phenology estimates. Moreover, we benchmark several state-of-the-art models using two groups of common wildflower species that have high inter-class similarity, and show that this multi-modal approach significantly outperforms image-only baselines. With this work, we aim to encourage the development of standards for automated wildflower monitoring as a step towards bending the curve of biodiversity loss. The data and the code are publicly available at https://georgianagmanolache.github.io/wildflowerpower/ Authors: Georgiana Manolache (Fontys University of Applied Science); Gerard Schouten (Fontys University of Applied Sciences)
NeurIPS 2024	Light-weight geospatial model for global deforestation attribution (Papers Track) Abstract: Forests are in decline worldwide and it is critical to attribute forest cover loss to its causes. We gathered a curated global dataset of all forest loss drivers and developed a neural network model to recognize the main drivers of deforestation or forest degradation at 1-km scale. Using remote sensing satellite data together with ancillary biophysical and socioeconomic data, the model estimates the dominant drivers of forest loss from 2001 to 2022. Using a relatively light-weight geospatial model allowed us to train a single world-wide model. We generated a global map of drivers of forest loss that is being validated, and present the first insights such data can provide. Authors: Anton Raichuk (Google); Michelle Sims (WRI); Radost Stanimirova (WRI); Maxim Neumann (Google)
NeurIPS 2024	Scalable and interpretable deforestation detection in the Amazon rainforest (Papers Track) Abstract: Deforestation of the Amazon rainforest is a major contributor to climate change, as the rainforest is a crucial precipitation regulator, as well as a large natural carbon reserve. While there have been efforts to create real-time algorithms for deforestation detection, they are oftentimes not accurate or interpretable. We leverage multiple input signals, such as satellite imagery, time-series of deforestation indices and scalar measures, to create a single deep learning model that is both interpretable and accurate. We employ a novel dataset with millions of annotated images of the Brazilian Amazon to train our model, as well as class activation mappings to investigate the added value of interpretability in this context. Authors: Rodrigo Schuller (IMPA); Francisco Ganacim (IMPA); Paulo Orenstein (IMPA)
NeurIPS 2024	Composing Open-domain Vision with RAG for Ocean Monitoring and Conservation (Proposals Track) Abstract: Climate change's destruction of marine biodiversity is threatening communities and economies around the world which rely on healthy oceans for their livelihoods. The challenge of applying computer vision to niche, real-world domains such as ocean conservation lies in the dynamic and diverse environments where traditional top-down learning struggles with long-tailed distributions, generalization, and domain transfer. Scalable species identification for ocean monitoring is particularly difficult due to the need to adapt models to new environments and identify rare or unseen species. To overcome these limitations, we propose leveraging bottom-up, open-domain learning frameworks, specifically vision-language models (VLMs) combined with retrieval-augmented generation (RAG), as a resilient, scalable solution for image and video analysis in marine applications. We validate this approach through a preliminary application in classifying fish from video onboard fishing vessels, demonstrating impressive emergent retrieval and prediction capabilities without domain-specific training or knowledge of the task itself. Authors: Sepand Dyanatkar (OnDeck Fisheries AI); Angran Li (OnDeck Fisheries AI); Alexander Dungate (OnDeck Fisheries AI)
NeurIPS 2024	A Multimodal Causal Framework for Large-Scale Ecosystem Valuation: Application to Wetland Benefits for Flood Mitigation (Proposals Track) Abstract: Climate change is poised to alter wetland ecosystems through changes in temperature and precipitation patterns, compounding the already pronounced influence of human-driven wetland development. In this context, policymakers and environmental managers would benefit from accurate wetland valuations to guide their decision-making, as their choices regarding this critical natural resource directly impact flood mitigation efforts, biodiversity conservation, and economic activity. This paper introduces a novel multimodal causal framework for producing location-specific ecosystem valuations at a national scale to be used in cost-benefit policy analysis. It leverages recent advances in estimating heterogeneous treatment effects to flexibly determine how the expected impact of ecosystem-level changes (such as wetland loss via development) varies conditional on high-dimensional and multimodal measures that characterize the complex interactions between human and natural systems such as aerial satellite imagery, weather sequence data, land cover classifications, and water surface networks. From this effort, we aim to create a national database of location-specific wetland valuations in an approach that can be readily extended in estimating the effect of other interventions on ecosystems. We also plan to generate open-source feature embeddings for each U.S. wetland, embeddings that can be used to address other climate-related causal questions as well. Authors: Hannah Druckenmiller (Caltech); Georgia Gkioxari (Caltech); Connor Jerzak (University of Texas at Austin); SayedMorteza Malaekeh (University of Texas at Austin)
ICLR 2024	Mapping Land Naturalness from Sentinel-2 using Deep Contextual and Geographical Priors (Papers Track) Abstract: In recent decades, the causes and consequences of climate change have accelerated, affecting our planet on an unprecedented scale. This change is closely tied to the ways in which humans alter their surroundings. As our actions continue to impact natural areas, using satellite images to see and measure these effects has become crucial for understanding and fighting climate change. Aiming to map land naturalness on the continuum of modern human pressure, we develop a multi-modal supervised deep learning framework that addresses the unique challenges of satellite data and the task at hand. We incorporate contextual and geographical priors. These priors are represented by corresponding coordinate information and broader contextual information including and surrounding the immediate patch to be predicted. Our framework improves the model's predictive performance in mapping land naturalness from Sentinel-2 data, which is multi-spectral optical satellite imagery. Recognizing that our protective measures are only as effective as our grasp of the ecosystem, quantifying naturalness serves as a crucial step towards enhancing our environmental stewardship. Authors: Burak Ekim (University of the Bundeswehr); Michael Schmitt (University of the Bundeswehr Munich)
ICLR 2024	Towards Ecological Network Analysis with Gromov-Wasserstein Distances (Papers Track) Abstract: Climate change is driving the widespread redistribution of species with cascading effects on predators and their prey. Formally comparing ecological interaction networks is a critical step towards understanding the impact of climate change on ecosystem functioning, yet current methods for ecological network analysis are unable to do so. We propose using the Gromov-Wasserstein (GW) metric for quantifying dissimilarity between ecological networks. We demonstrate that the corresponding optimal transport plans of this distance can be interpreted as species functional alignment between food webs. Our results show that GW transport plans align species from different mammal communities consistent with ecological understanding. Furthermore, we illustrate extensions of the GW distance to notions of averages and factorization over ecological networks. Ultimately, we propose the foundation for a novel interpretable topological data analysis framework to inform future ecological research and conservation management. Authors: Kai M Hung (Rice University); Ann Finneran (Rice University); Alex Zalles (Rice University); Lydia Beaudrot (Rice University); Cesar Uribe (Rice University)
ICLR 2024	Bee Activity Prediction and Pattern Recognition in Environmental Data (Papers Track) Abstract: As a consequence of climate change, biodiversity is declining rapidly. Many species like insects, especially bees, suffer from changes in temperature and rainfall patterns. Applying machine learning for monitoring and predicting species' health and life conditions can help in understanding and improving biodiversity. In this work we use data collected from cameras and sensors mounted upon beehives together with other data sources like weather data, information extracted from satellite images and geographical information. We aim at predicting bees' health (measured as their activity) and analyzing influencing environmental conditions. We show that we are able to accurately predict bees' activity and understand their life conditions by using machine learning algorithms and explainable AI. Understanding these conditions can help to make recommendations on good locations for beehives. This work illustrates the potential of applying machine learning on sensor, satellite and weather data for monitoring and predicting species' health and hence shows the ability for adaptation to climate change and a more accurate species monitoring. Authors: Christine Preisach (University of Applied Sciences Karlsruhe); Marius Herrmann (Karlsruhe Institute of Technology)
ICLR 2024	Advancing Earth System Model Calibration: A Diffusion-Based Method (Papers Track) Honorable Mention Abstract: Understanding of climate impact on ecosystems globally requires site-specific model calibration. Here we introduce a novel diffusion-based uncertainty quantification (DBUQ) method for efficient model calibration. DBUQ is a score-based diffusion model that leverages Monte Carlo simulation to estimate the score function and evaluates a simple neural network to quickly generate samples for approximating parameter posterior distributions. DBUQ is stable, efficient, and can effectively calibrate the model given diverse observations, thereby enabling rapid and site-specific model calibration on a global scale. This capability significantly advances Earth system modeling and our understanding of climate impacts on Earth systems. We demonstrate DBUQ's capability in E3SM land model calibration at the Missouri Ozark AmeriFlux forest site. Both synthetic and real-data applications indicate that DBUQ produces accurate parameter posterior distributions similar to those generated by Markov Chain Monte Carlo sampling but with 30X less computing time. This efficiency marks a significant stride in model calibration, paving the way for more effective and timely climate impact analyses. Authors: Yanfang Liu (Oak Ridge National Laboratory); Dan Lu (Oak Ridge National Laboratory); Zezhong Zhang (Oak Ridge National Laboratory); Feng Bao (Florida State University); Guannan Zhang (Oak Ridge National Laboratory)
ICLR 2024	Analyzing the secondary wastewater-treatment process using Faster R-CNN and YOLOv5 object detection algorithms (Papers Track) Abstract and authors: (click to expand) Abstract: The activated sludge (AS) process is the most common type of secondary wastewater treatment, applied worldwide. Due to the complexity of microbial communities, imbalances between the different types of bacteria may occur and disturb the process, with pronounced economic and environmental consequences. Microscopic inspection of the morphology of flocs and microorganisms provides key information on AS properties and function. This is a time-consuming, highly skilled, and expensive process that is not readily available in all locations. Thus, most wastewater-treatment plants do not carry out this essential analysis, resulting in frequent operational faults. In this study, we develop a novel deep learning (DL) object detection algorithm to analyze and monitor the AS process based on a unique microscopic image database of flocs and microorganisms. Specifically, we applied YOLOv5 and Faster R-CNN algorithms as tools for segmentation and object detection to analyze the wastewater. The mean average precision (mAP) of the YOLOv5 was 0.67, outperforming the Faster R-CNN by 15%. Histogram equalization preprocessing of both bright-field and phase-contrast images significantly improved the results of the algorithm in all classes. In the case of YOLOv5, the mAP increased by 16.67%, to 0.77, where the AP of protozoa, filaments, and open floc classes outperformed the previous model by over 20%. These results demonstrate the potential of leveraging DL algorithms to enhance the analysis and monitoring of WWTPs in an affordable manner, consequently reducing environmental pollution caused by contaminated effluent. The fundamental challenge addressed herein has important global relevance, especially in an era in which the demand for high-quality wastewater reuse is expected to increase dramatically.
Authors: Offir Inbar (Tel-Aviv University); Moni Shahar (Tel Aviv University); Jacob Gidron (Tel-Aviv University); Ido Cohen (Tel-Aviv University); Dror Avisar (Tel-Aviv University)
ICLR 2024	Exploring Graph Neural Networks to Predict the Seagrasses Ecosystem State in the Italian Seas (Papers Track) Abstract and authors: (click to expand) Abstract: Marine coastal ecosystems (MCEs) play a critical role in climate change adaptation and human well-being. However, they face global threats from environmental pressures, both related to climate change (CC) and direct human impacts. Leveraging the increasing availability of geospatial data, this study explores Graph Neural Networks (GNNs) to assess cumulative impacts arising from human and CC related pressures on the Seagrass ecosystem in the Italian seas. Unlike traditional machine learning (ML) models with which they were compared in this study, GNNs incorporate the spatial component of data through graph structures. While experimental results demonstrate a modest performance improvement in GNNs, the study is constrained by limited data availability, preventing the exploration of the temporal component and physical laws representable through graph structures. Future efforts aim to collect higher-resolution spatial and temporal data, considering expressible environmental processes, to enhance model learning. Authors: Angelica Bianconi (University School for Advanced Studies (IUSS) Pavia & Ca’ Foscari University of Venice); Sebastiano Vascon (Ca' Foscari University of Venice & European Centre for Living Technology); Elisa Furlan (Centro Euro-Mediterraneo sui Cambiamenti Climatici (CMCC) & Ca' Foscari University of Venice); Andrea Critto (Centro Euro-Mediterraneo sui Cambiamenti Climatici (CMCC) & Ca' Foscari University of Venice)
ICLR 2024	Imbalance-aware Presence-only Loss Function for Species Distribution Modeling (Papers Track) Abstract and authors: (click to expand) Abstract: In the face of significant biodiversity decline, species distribution models (SDMs) are essential for understanding the impact of climate change on species habitats by connecting environmental conditions to species occurrences. Traditionally limited by a scarcity of species observations, these models have significantly improved in performance through the integration of larger datasets provided by citizen science initiatives. However, they still suffer from the strong class imbalance between species within these datasets, often resulting in the penalization of rare species--those most critical for conservation efforts. To tackle this issue, this study assesses the effectiveness of training deep learning models using a balanced presence-only loss function on various large citizen science-based datasets. We demonstrate that this imbalance-aware loss function outperforms traditional loss functions across various datasets and tasks, particularly in accurately modeling rare species with limited observations. Authors: Robin Zbinden (EPFL); Nina van Tiel (EPFL); Marc Rußwurm (Wageningen University); Devis Tuia (EPFL)
ICLR 2024	Deep Gaussian Processes and inversion for decision support in model-based climate change mitigation and adaptation problems (Papers Track) Abstract and authors: (click to expand) Abstract: To inform their decisions, policy makers often rely on models developed by researchers that are computationally intensive and complex and that frequently run on High Performance Computers (HPC). These decision-support models are not used directly by decision makers, and their results tend to be presented by experts as a limited number of potential scenarios that would result from a limited number of potential policy choices. Machine learning models such as Deep Gaussian Processes (DGPs) can be used to radically re-define how decision makers can use models by creating a ‘surrogate model’ or ‘emulator’ of the original model. Surrogate models can then be embedded into apps that decision makers can use to directly explore a vast array of policy options corresponding to potential target outcomes (model inversion). To illustrate the mechanism, we give an example application that is envisaged as part of the UK government’s Net Zero strategy. To achieve Net Zero CO2 emissions by 2050, the UK government is considering multiple options that include planting trees to capture carbon. However, the amount of CO2 captured by the trees depends on a large number of factors that include climate conditions, soil type, soil carbon, tree type, ... Depending on these factors the net balance of carbon removal after planting trees may not necessarily be positive. Hence, choosing the right place to plant the right tree is very important. A decision-helping model has been developed to tackle this problem. For a given policy input, the model outputs its impact in terms of CO2 sequestration, biodiversity and other ecosystem services.
We show how DGPs can be used to create a surrogate model of this original afforestation model and how these can be embedded into an R Shiny app that can then be directly used by decision makers. Authors: Bertrand Nortier (University of Exeter); Daniel Williamson (University of Exeter); Mattia Mancini (University of Exeter); Amy Binner (University of Exeter); Brett Day (University of Exeter); Ian Bateman (University of Exeter)
ICLR 2024	Global Vegetation Modeling With Pre-Trained Weather Transformers (Papers Track) Abstract and authors: (click to expand) Abstract: Accurate vegetation models can produce further insights into the complex interaction between vegetation activity and ecosystem processes. Previous research has established that long-term trends and short-term variability of temperature and precipitation affect vegetation activity. Motivated by the recent success of Transformer-based Deep Learning models for medium-range weather forecasting, we adapt the publicly available pre-trained FourCastNet to model vegetation activity while accounting for the short-term dynamics of climate variability. We investigate how the learned global representation of the atmosphere’s state can be transferred to model the normalized difference vegetation index (NDVI). Our model globally estimates vegetation activity at a resolution of 0.25° while relying only on meteorological data. We demonstrate that leveraging pre-trained weather models improves the NDVI estimates compared to learning an NDVI model from scratch. Additionally, we compare our results to other recent data-driven NDVI modeling approaches from machine learning and ecology literature. We further provide experimental evidence on how much data and training time is necessary to turn FourCastNet into an effective vegetation model. Code and models are available at https://github.com/LSX-UniWue/Global-Ecosystem-Modeling. Authors: Pascal Janetzky (University Wuerzburg); Florian Gallusser (Universität Würzburg); Simon Hentschel (Julius-Maximilians-Universität of Würzburg); Andreas Hotho (University of Wuerzburg); Anna Krause (Universität Würzburg, Department of Computer Science, Chair X Data Science)
ICLR 2024	Predicting Species Occurrence Patterns from Partial Observations (Papers Track) Abstract and authors: (click to expand) Abstract: To address the interlinked biodiversity and climate crises, we need an understanding of where species occur and how these patterns are changing. However, observational data on most species remains very limited, and the amount of data available varies greatly between taxonomic groups. We introduce the problem of predicting species occurrence patterns given (a) satellite imagery, and (b) known information on the occurrence of other species. To evaluate algorithms on this task, we introduce SatButterfly, a dataset of satellite images, environmental data and observational data for butterflies, which is designed to pair with the existing SatBird dataset of bird observational data. To address this task, we propose a general model, R-Tran, for predicting species occurrence patterns that enables the use of partial observational data wherever found. We find that R-Tran outperforms other methods in predicting species encounter rates with partial information both within a taxon (birds) and across taxa (birds and butterflies). Our approach opens new perspectives to leveraging insights from species with abundant data to other species with scarce data, by modelling the ecosystems in which they co-occur. Authors: Hager Radi Abdelwahed (Mila: Quebec AI Institute); Mélisande Teng (Mila, Université de Montréal); David Rolnick (MIT)
ICLR 2024	Towards Scalable Deep Species Distribution Modelling using Global Remote Sensing (Papers Track) Abstract and authors: (click to expand) Abstract: Destruction of natural habitats and anthropogenic climate change are threatening biodiversity globally. Addressing this loss necessitates enhanced monitoring techniques to assess the impact of environmental shifts and to guide policy-making efforts. Species distribution models are crucial tools that predict species locations by interpolating observed field data with environmental information. We develop an improved, scalable method for species distribution modelling by proposing a dataset pipeline that incorporates global remote sensing imagery, land use classification data, environmental variables, and observation data, and utilising this with convolutional neural network (CNN) models to predict species presence at higher spatial and temporal resolutions than well-established species distribution modelling methods. We apply our approach to modelling Protea species distributions in the Cape Floristic Region of South Africa, demonstrating its performance in a region of high biodiversity. We train two CNN models and compare their performance to Maxent, a popular conventional species distribution modelling method. We find that the CNN models trained with remote sensing data outperform Maxent, underscoring the potential of our method as an effective and scalable solution for modelling species distribution. Authors: Emily Morris (University of Cambridge); Anil Madhavapeddy (University of Cambridge); Sadiq Jaffer (University of Cambridge); David Coomes (University of Cambridge)
ICLR 2024	An Adaptive Hydropower Management Approach for Downstream Ecosystem Preservation (Proposals Track) Abstract and authors: (click to expand) Abstract: Hydropower plants play a pivotal role in advancing clean and sustainable energy production, contributing significantly to the global transition towards renewable energy sources. However, hydropower plants are currently perceived both positively as sources of renewable energy and negatively as disruptors of ecosystems. In this work, we highlight the overlooked potential of using hydropower plants as protectors of ecosystems by using adaptive ecological discharges. To advocate for this perspective, we propose using a neural network to predict the minimum ecological discharge value at each desired time. Additionally, we present a novel framework that seamlessly integrates it into hydropower management software, taking advantage of the well-established approach of using traditional constrained optimisation algorithms. This novel approach not only protects ecosystems from climate change but also potentially increases electricity production. Authors: Cecília Coelho (University of Minho); Ming Jin (Virginia Tech); M. Fernanda P. Costa (Dep. Mathematics, University of Minho); Luís L. Ferrás (University of Porto)
NeurIPS 2023	A machine learning pipeline for automated insect monitoring (Papers Track) Abstract and authors: (click to expand) Abstract: Climate change and other anthropogenic factors have led to a catastrophic decline in insects, endangering both biodiversity and the ecosystem services on which human society depends. Data on insect abundance, however, remains woefully inadequate. Camera traps, conventionally used for monitoring terrestrial vertebrates, are now being modified for insects, especially moths. We describe a complete, open-source machine learning-based software pipeline for automated monitoring of moths via camera traps, including object detection, moth/non-moth classification, fine-grained identification of moth species, and tracking individuals. We believe that our tools, which are already in use across three continents, represent the future of massively scalable data collection in entomology. Authors: Aditya Jain (Mila); Fagner Cunha (Federal University of Amazonas); Michael Bunsen (Mila, eButterfly); Léonard Pasi (EPFL); Anna Viklund (Daresay); Maxim Larrivee (Montreal Insectarium); David Rolnick (McGill University, Mila)
NeurIPS 2023	Understanding Insect Range Shifts with Out-of-Distribution Detection (Proposals Track) Abstract and authors: (click to expand) Abstract: Climate change is inducing significant range shifts in insects and other organisms. Large-scale temporal data on populations and distributions are essential for quantifying the effects of climate change on biodiversity and ecosystem services, providing valuable insights for both conservation and pest management. With images from camera traps, we aim to use Mahalanobis distance-based confidence scores to automatically detect new moth species in a region. We intend to make out-of-distribution detection interpretable by identifying morphological characteristics of different species using Grad-CAM. We hope this algorithm will be a useful tool for entomologists to study range shifts and inform climate change adaptation. Authors: Yuyan Chen (McGill University, Mila); David Rolnick (McGill University, Mila)
NeurIPS 2023	Agile Modeling for Bioacoustic Monitoring (Tutorials Track) Abstract and authors: (click to expand) Abstract: Bird, insect, and other wild animal populations are rapidly declining, highlighting the need for better monitoring, understanding, and protection of Earth’s remaining wild places. However, direct monitoring of biodiversity is difficult. Passive Acoustic Monitoring (PAM) enables detection of the vocalizing species in an ecosystem, many of which can be difficult or impossible to detect by satellite or camera trap. Large-scale PAM deployments using low-cost devices allow measuring changes over time and responses to environmental changes, and targeted deployments can discover and monitor endangered or invasive species. Machine learning methods are needed to analyze the thousands or even millions of hours of audio produced by large-scale deployments. But there are a massive number of potential signals to target for bioacoustic measurement, and many of the most interesting lack training data. Many rare species are difficult to observe. Detecting specific call-types and juvenile calls can give further insight into behavior and population health, but almost no structured datasets exist for these use-cases. No single classifier can address all of these needs, so practitioners regularly need to create new classifiers to address novel problems. Soundscape annotation efforts are very expensive, and machine learning experts are scarce, creating a bottleneck on analysis. We aim to eliminate the bottleneck by providing an efficient, self-contained active learning workflow for biologists. In this tutorial, we present an integrated workflow for analyzing large unlabeled bioacoustic datasets, adapting new agile modeling techniques to audio. Our goal is to allow experts to create a new high quality classifier for a novel class with under one hour of effort. 
We achieve this by leveraging transfer learning from high-quality bioacoustic models, vector search over audio databases, and lightweight Python notebook UX. The workflow can begin from a single example, proceeds through an efficient active learning loop, and finally applies the produced classifier to a large mass of unlabeled data to produce insights for ecologists and land managers. Authors: Tom Denton (Google); Jenny Hamer (Google Research); Rob Laber (Google)
ICLR 2023	Exploring the potential of neural networks for Species Distribution Modeling (Papers Track) Abstract and authors: (click to expand) Abstract: Species distribution models (SDMs) relate species occurrence data with environmental variables and are used to understand and predict species distributions across landscapes. While some machine learning models have been adopted by the SDM community, recent advances in neural networks may have untapped potential in this field. In this work, we compare the performance of multi-layer perceptron (MLP) neural networks to well-established SDM methods on a benchmark dataset spanning 225 species in six geographical regions. We also compare the performance of MLPs trained separately for each species to an equivalent model trained on a set of species and performing multi-label classification. Our results show that MLP models achieve comparable results to state-of-the-art SDM methods, such as MaxEnt. We also find that multi-species MLPs perform slightly better than single-species MLPs. This study indicates that neural networks, along with all their convenient and valuable characteristics, are worth considering for SDMs. Authors: Robin Zbinden (EPFL); Nina van Tiel (EPFL); Benjamin Kellenberger (Yale University); Lloyd H Hughes (EPFL); Devis Tuia (EPFL)
ICLR 2023	Understanding forest resilience to drought with Shapley values (Proposals Track) Abstract and authors: (click to expand) Abstract: Increases in drought frequency, intensity, and duration due to climate change are threatening forests around the world. Climate-driven tree mortality is associated with devastating ecological and societal consequences, including the loss of carbon sequestration, habitat provisioning, and water filtration services. A spatially fine-grained understanding of the site characteristics making forests more resilient to drought is still lacking. Furthermore, the complexity of drought effects on forests, which can be cumulative and delayed, demands investigation of the most appropriate drought indices. In this study, we aim to gain a better understanding of the temporal and spatial drivers of drought-induced changes in forest vitality using Shapley values, which allow for the relevance of predictors to be quantified locally. A better understanding of the contribution of meteorological and environmental factors to trees’ response to drought can support forest managers aiming to make forests more climate-resilient. Authors: Stenka Vulova (Technische Universität Berlin); Alby Duarte Rocha (Technische Universität Berlin); Akpona Okujeni (Humboldt-Universität zu Berlin); Johannes Vogel (Freie Universität Berlin); Michael Förster (Technische Universität Berlin); Patrick Hostert (Humboldt-Universität zu Berlin); Birgit Kleinschmit (Technische Universität Berlin)
ICLR 2023	Bird Distribution Modelling using Remote Sensing and Citizen Science data (Papers Track) Overall Best Paper Abstract and authors: (click to expand) Abstract: Climate change is a major driver of biodiversity loss, changing the geographic range and abundance of many species. However, there remain significant knowledge gaps about the distribution of species, due principally to the amount of effort and expertise required for traditional field monitoring. We propose an approach leveraging computer vision to improve species distribution modelling, combining the wide availability of remote sensing data with sparse on-ground citizen science data. We introduce a novel task and dataset for mapping US bird species to their habitats by predicting species encounter rates from satellite images, along with baseline models which demonstrate the power of our approach. Our methods open up possibilities for scalably modelling ecosystem properties worldwide. Authors: Mélisande Teng (Mila, Université de Montréal); Amna Elmustafa (African Institute for Mathematical Science); Benjamin Akera (McGill University); Hugo Larochelle (UdeS); David Rolnick (McGill University, Mila)
NeurIPS 2022	Optimizing toward efficiency for SAR image ship detection (Papers Track) Abstract and authors: (click to expand) Abstract: The detection and prevention of illegal fishing is critical to maintaining a healthy and functional ecosystem. Recent research on ship detection in satellite imagery has focused exclusively on performance improvements, disregarding detection efficiency. However, the speed and compute cost of vessel detection are essential for a timely intervention to prevent illegal fishing. Therefore, we investigated optimization methods that lower detection time and cost with minimal performance loss. We trained an object detection model based on a convolutional neural network (CNN) using a dataset of satellite images. Then, we designed two efficiency optimizations that can be applied to the base CNN or any other base model. The optimizations consist of a fast, cheap classification model and a statistical algorithm. The integration of the optimizations with the object detection model leads to a trade-off between speed and performance. We studied the trade-off using metrics that give different weight to execution time and performance. We show that by using a classification model the average precision of the detection model can be approximated to 99.5% in 44% of the time or to 92.7% in 25% of the time. Authors: Arthur Van Meerbeeck (KULeuven); Ruben Cartuyvels (KULeuven); Jordy Van Landeghem (KULeuven); Sien Moens (KU Leuven)
NeurIPS 2022	Estimating Chicago’s tree cover and canopy height using multi-spectral satellite imagery (Papers Track) Abstract and authors: (click to expand) Abstract: Information on urban tree canopies is fundamental to mitigating climate change as well as improving quality of life. Urban tree planting initiatives face a lack of up-to-date data about the horizontal and vertical dimensions of the tree canopy in cities. We present a pipeline that utilizes LiDAR data as ground-truth and then trains a multi-task machine learning model to generate reliable estimates of tree cover and canopy height in urban areas using multi-source multi-spectral satellite imagery for the case study of Chicago. Authors: John Francis (University College London)
NeurIPS 2022	Identifying Compound Climate Drivers of Forest Mortality with β-VAE (Papers Track) Abstract and authors: (click to expand) Abstract: Climate change is expected to lead to higher rates of forest mortality. Forest mortality is a complex phenomenon driven by the interaction of multiple climatic variables at multiple temporal scales, further modulated by the current state of the forest (e.g. age, stem diameter, and leaf area index). Identifying the compound climate drivers of forest mortality would greatly improve understanding and projections of future forest mortality risk. Observation data are, however, limited in accuracy and sample size, particularly in regard to forest state variables and mortality events. In contrast, simulations with state-of-the-art forest models enable the exploration of novel machine learning techniques for associating forest mortality with driving climate conditions. Here we simulate 160,000 years of beech, pine and spruce forest dynamics with the forest model FORMIND. We then apply β-VAE to learn disentangled latent representations of weather conditions and identify those that are most likely to cause high forest mortality. The learned model successfully identifies three characteristic climate representations that can be interpreted as different compound drivers of forest mortality. Authors: Mohit Anand (Helmholtz Centre for Environmental Research - UFZ); Lily-belle Sweet (Helmholtz Centre for Environmental Research - UFZ); Gustau Camps-Valls (Universitat de València); Jakob Zscheischler (Helmholtz Centre for Environmental Research - UFZ)
NeurIPS 2022	Learning to forecast vegetation greenness at fine resolution over Africa with ConvLSTMs (Papers Track) Abstract and authors: (click to expand) Abstract: Forecasting the state of vegetation in response to climate and weather events is a major challenge. Its implementation will prove crucial in predicting crop yield, forest damage, or more generally the impact on ecosystem services relevant for socio-economic functioning, which if absent can lead to humanitarian disasters. Vegetation status depends on weather and environmental conditions that modulate complex ecological processes taking place at several timescales. Interactions between vegetation and different environmental drivers produce both instantaneous and time-lagged responses, often showing an emerging spatial context at landscape and regional scales. We formulate the land surface forecasting task as a strongly guided video prediction task where the objective is to forecast the vegetation developing at very fine resolution using topography and weather variables to guide the prediction. We use a Convolutional LSTM (ConvLSTM) architecture to address this task and predict changes in the vegetation state in Africa using Sentinel-2 satellite NDVI, having ERA5 weather reanalysis, SMAP satellite measurements, and topography (DEM of SRTMv4.1) as variables to guide the prediction. Our results highlight how ConvLSTM models can not only forecast the seasonal evolution of NDVI at high resolution, but also the differential impacts of weather anomalies over the baselines. The model is able to predict different vegetation types, even those with very high NDVI variability during target length.
Authors: Claire Robin (Biogeochemical Integration, Max-Planck-Institute for Biogeochemistry, Jena, Germany); Christian Requena-Mesa (Computer Vision Group, Friedrich Schiller University Jena; DLR Institute of Data Science, Jena; Max Planck Institute for Biogeochemistry, Jena); Vitus Benson (Max-Planck-Institute for Biogeochemistry); Jeran Poehls (Max-Planck-Institute for Biogeochemistry); Lazaro Alonzo (Max-Planck-Institute for Biogeochemistry); Nuno Carvalhais (Max-Planck-Institute for Biogeochemistry); Markus Reichstein (Max Planck Institute for Biogeochemistry, Jena; Michael Stifel Center Jena for Data-Driven and Simulation Science, Jena)
NeurIPS 2022	ForestBench: Equitable Benchmarks for Monitoring, Reporting, and Verification of Nature-Based Solutions with Machine Learning (Proposals Track) Abstract and authors: (click to expand) Abstract: Restoring ecosystems and reducing deforestation are necessary tools to mitigate the anthropogenic climate crisis. Current measurements of forest carbon stock can be inaccurate, in particular for underrepresented and small-scale forests in the Global South, hindering transparency and accountability in the Monitoring, Reporting, and Verification (MRV) of these ecosystems. There is thus a need for high-quality datasets to properly validate ML-based solutions. To this end, we present ForestBench, which aims to collect and curate geographically-balanced gold-standard datasets of small-scale forest plots in the Global South, by collecting ground-level measurements and visual drone imagery of individual trees. These equitable validation datasets for ML-based MRV of nature-based solutions shall enable assessing the progress of ML models for estimating above-ground biomass, ground cover, and tree species diversity. Authors: Lucas Czech (Carnegie Institution for Science); Björn Lütjens (MIT); David Dao (ETH Zurich)
NeurIPS 2022	Personalizing Sustainable Agriculture with Causal Machine Learning (Proposals Track) Best Paper: Proposals Abstract and authors: (click to expand) Abstract: To fight climate change and accommodate the increasing population, global crop production has to be strengthened. To achieve the "sustainable intensification" of agriculture, transforming it from carbon emitter to carbon sink is a priority, and understanding the environmental impact of agricultural management practices is a fundamental prerequisite to that. At the same time, the global agricultural landscape is deeply heterogeneous, with differences in climate, soil, and land use inducing variations in how agricultural systems respond to farmer actions. The "personalization" of sustainable agriculture with the provision of locally adapted management advice is thus a necessary condition for the efficient uplift of green metrics, and an integral development in imminent policies. Here, we formulate personalized sustainable agriculture as a Conditional Average Treatment Effect estimation task and use Causal Machine Learning for tackling it. Leveraging climate data, land use information and employing Double Machine Learning, we estimate the heterogeneous effect of sustainable practices on the field-level Soil Organic Carbon content in Lithuania. We thus provide a data-driven perspective for targeting sustainable practices and effectively expanding the global carbon sink. Authors: Georgios Giannarakis (National Observatory of Athens); Vasileios Sitokonstantinou (National Observatory of Athens); Roxanne Suzette Lorilla (National Observatory of Athens); Charalampos Kontoes (National Observatory of Athens)
AAAI FSS 2022	Towards Generating Large Synthetic Phytoplankton Datasets for Efficient Monitoring of Harmful Algal Blooms Abstract and authors: (click to expand) Abstract: Climate change is increasing the frequency and severity of harmful algal blooms (HABs), which cause significant fish deaths in aquaculture farms. This contributes to ocean pollution and greenhouse gas (GHG) emissions since dead fish are either dumped into the ocean or taken to landfills, which in turn negatively impacts the climate. Currently, the standard method to enumerate harmful algae and other phytoplankton is to manually observe and count them under a microscope. This is a time-consuming, tedious and error-prone process, resulting in compromised management decisions by farmers. Hence, automating this process for quick and accurate HAB monitoring is extremely helpful. However, this requires large and diverse datasets of phytoplankton images, and such datasets are hard to produce quickly. In this work, we explore the feasibility of generating novel high-resolution photorealistic synthetic phytoplankton images, containing multiple species in the same image, given a small dataset of real images. To this end, we employ Generative Adversarial Networks (GANs) to generate synthetic images. We evaluate three different GAN architectures: ProjectedGAN, FastGAN, and StyleGANv2 using standard image quality metrics. We empirically show the generation of high-fidelity synthetic phytoplankton images using a training dataset of only 961 real images. Thus, this work demonstrates the ability of GANs to create large synthetic datasets of phytoplankton from small training datasets, accomplishing a key step towards sustainable systematic monitoring of harmful algal blooms. Authors: Nitpreet Bamra (University of Waterloo), Vikram Voleti (Mila, University of Montreal), Alexander Wong (University of Waterloo) and Jason Deglint (University of Waterloo)
AAAI FSS 2022	Discovering Transition Pathways Towards Coviability with Machine Learning Abstract and authors: (click to expand) Abstract: This paper presents our ongoing French-Brazilian collaborative project which aims at: (1) establishing a diagnosis of socio-ecological coviability for several sites of interest in Nordeste, the North-East region of Brazil (in the states of Paraiba, Ceara, Pernambuco, and Rio Grande do Norte known for their biodiversity hotspots and vulnerabilities to climate change) using advanced data science techniques for multisource and multimodal data fusion and (2) finding transition pathways towards coviability equilibrium using machine learning techniques. Data collected in the field by scientists, ecologists, local actors combined with volunteered information, pictures from smart-phones, and data available on-line from satellite imagery, social media, surveys, etc. can be used to compute various coviability indicators of interest for the local actors. These indicators are useful to characterize and monitor the socio-ecological coviability status along various dimensions of anthropization, human welfare, ecological and biodiversity balance, and ecosystem intactness and vulnerabilities. Authors: Laure Berti-Equille (IRD) and Rafael Raimundo (UFPB)
NeurIPS 2021	Predicting Critical Biogeochemistry of the Southern Ocean for Climate Monitoring (Papers Track) Abstract and authors: (click to expand) Abstract: The Biogeochemical-Argo (BGC-Argo) program is building a network of globally distributed, sensor-equipped robotic profiling floats, improving our understanding of the climate system and how it is changing. These floats, however, are limited in the number of variables measured. In this study, we train neural networks to predict silicate and phosphate values in the Southern Ocean from temperature, pressure, salinity, oxygen, nitrate, and location and apply these models to earth system model (ESM) and BGC-Argo data to expand the utility of this ocean observation network. We train our neural networks on observations from the Global Ocean Ship-Based Hydrographic Investigations Program (GO-SHIP) and use dropout regularization to provide uncertainty bounds around our predicted values. Our neural network significantly improves upon linear regression but shows variable levels of uncertainty across the ranges of predicted variables. We explore the generalization of our estimators to test data outside our training distribution from both ESM and BGC-Argo data. Our use of out-of-distribution test data to examine shifts in biogeochemical parameters and calculate uncertainty bounds around estimates advances the state-of-the-art in oceanographic data and climate monitoring. We make our data and code publicly available. Authors: Ellen Park (MIT); Jae Deok Kim (MIT-WHOI); Nadege Aoki (MIT); Yumeng Cao (MIT); Yamin Arefeen (Massachusetts Institute of Technology); Matthew Beveridge (Massachusetts Institute of Technology); David P Nicholson (Woods Hole Oceanographic Institution); Iddo Drori (MIT)
NeurIPS 2021	A data integration pipeline towards reliable monitoring of phytoplankton and early detection of harmful algal blooms (Papers Track) Abstract and authors: (click to expand) Abstract: Climate change is making oceans warmer and more acidic. Under these conditions, phytoplankton can produce harmful algal blooms which cause rapid oxygen depletion and consequent death of marine plants and animals. Some species are even capable of releasing toxic substances endangering water quality and human health. Monitoring of phytoplankton and early detection of harmful algal blooms is essential for protection of marine flora and fauna. Recent technological advances have enabled in-situ plankton image capture in real-time at low cost. However, available phytoplankton image databases have several limitations that prevent the practical usage of artificial intelligence models. We present a pipeline for integration of heterogeneous phytoplankton image datasets from around the world into a unified database that can ultimately serve as a benchmark dataset for phytoplankton research and therefore act as an important tool in building versatile machine learning models for climate adaptation planning. A machine learning model for early detection of harmful algal blooms is part of ongoing work. Authors: Bruna Guterres (Universidade Federal do Rio Grande - FURG); Sara Khalid (University of Oxford); Marcelo Pias (Federal University of Rio Grande); Silvia Botelho (Federal University of Rio Grande)
NeurIPS 2021	Data Driven Study of Estuary Hypoxia (Papers Track) Abstract and authors: (click to expand) Abstract: This paper presents a data driven study of dissolved oxygen time series collected in Atlantic Canada. The main motivation of the presented work was to evaluate whether machine learning techniques could help to understand and anticipate hypoxic episodes in nutrient-impacted estuaries, a phenomenon that is exacerbated by the increasing temperatures expected to arise due to changes in climate. A major constraint was to limit ourselves to the use of dissolved oxygen time series only. Our preliminary findings show that recurrent neural networks, and in particular LSTM, may be suitable to predict short horizon levels while traditional results could benefit in longer range hypoxia prevention. Authors: Md Monwer Hussain (University of New Brunswick); Guillaume Durand (National Research Council Canada); Michael Coffin (Department of Fisheries and Oceans Canada); Julio J Valdés (National Research Council Canada); Luke Poirier (Department of Fisheries and Oceans Canada)
NeurIPS 2021	High-resolution rainfall-runoff modeling using graph neural network (Papers Track) Abstract and authors: (click to expand) Abstract: Time-series modeling has shown great promise in recent studies using the latest deep learning algorithms such as LSTM (Long Short-Term Memory). These studies primarily focused on watershed-scale rainfall-runoff modeling or streamflow forecasting, but the majority of them only considered a single watershed as a unit. Although this simplification is very effective, it does not take into account spatial information, which could result in significant errors in large watersheds. Several studies investigated the use of GNN (Graph Neural Networks) for data integration by decomposing a large watershed into multiple sub-watersheds, but each sub-watershed is still treated as a whole, and the geoinformation contained within the watershed is not fully utilized. In this paper, we propose the GNRRM (Graph Neural Rainfall-Runoff Model), a novel deep learning model that makes full use of spatial information from high-resolution precipitation data, including flow direction and geographic information. When compared to baseline models, GNRRM has less over-fitting and significantly improves model performance. Our findings support the importance of hydrological data in deep learning-based rainfall-runoff modeling, and we encourage researchers to include more domain knowledge in their models. Authors: Zhongrun Xiang (University of Iowa); Ibrahim Demir (The University of Iowa)
NeurIPS 2021	Resolving Super Fine-Resolution SIF via Coarsely-Supervised U-Net Regression (Papers Track) Abstract and authors: (click to expand) Abstract: Climate change presents challenges to crop productivity, such as increasing the likelihood of heat stress and drought. Solar-Induced Chlorophyll Fluorescence (SIF) is a powerful way to monitor how crop productivity and photosynthesis are affected by changing climatic conditions. However, satellite SIF observations are only available at a coarse spatial resolution (e.g. 3-5km) in most places, making it difficult to determine how individual crop types or farms are doing. This poses a challenging coarsely-supervised regression task; at training time, we only have access to SIF labels at a coarse resolution (3 km), yet we want to predict SIF at a very fine spatial resolution (30 meters), a 100x increase. We do have some fine-resolution input features (such as Landsat reflectance) that are correlated with SIF, but the nature of the correlation is unknown. To address this, we propose Coarsely-Supervised Regression U-Net (CSR-U-Net), a novel approach to train a U-Net for this coarse supervision setting. CSR-U-Net takes in a fine-resolution input image, and outputs a SIF prediction for each pixel; the average of the pixel predictions is trained to equal the true coarse-resolution SIF for the entire image. Even though this is a very weak form of supervision, CSR-U-Net can still learn to predict accurately, due to its inherent localization abilities, plus additional enhancements that facilitate the incorporation of scientific prior knowledge. CSR-U-Net can resolve fine-grained variations in SIF more accurately than existing averaging-based approaches, which ignore fine-resolution spatial variation during training. CSR-U-Net could also be useful for a wide range of "downscaling" problems in climate science, such as increasing the resolution of global climate models. Authors: Joshua Fan (Cornell University); Di Chen (Cornell University); Jiaming Wen (Cornell University); Ying Sun (Cornell University); Carla P Gomes (Cornell University)
NeurIPS 2021	A hybrid convolutional neural network/active contour approach to segmenting dead trees in aerial imagery (Papers Track) Abstract and authors: (click to expand) Abstract: The stability and ability of an ecosystem to withstand climate change is directly linked to its biodiversity. Dead trees are a key indicator of overall forest health, housing one-third of forest ecosystem biodiversity, and constitute 8% of the global carbon stocks. They are decomposed by several natural factors, e.g. climate, insects and fungi. Accurate detection and modeling of dead wood mass is paramount to understanding forest ecology, the carbon cycle and decomposers. We present a novel method to construct precise shape contours of dead trees from aerial photographs by combining established convolutional neural networks with a novel active contour model in an energy minimization framework. Our approach yields superior performance accuracy over state-of-the-art in terms of precision, recall, and intersection over union of detected dead trees. This improved performance is essential to meet emerging challenges caused by climate change (and other man-made perturbations to the systems), particularly to monitor and estimate carbon stock decay rates, monitor forest health and biodiversity, and the overall effects of dead wood on and from climate change. Authors: Jacquelyn Shelton (Hong Kong Polytechnic University); Przemyslaw Polewski (TomTom Location Technology Germany GmbH); Wei Yao (The Hong Kong Polytechnic University); Marco Heurich (Bavarian Forest National Park)
NeurIPS 2021	Reducing the Barriers of Acquiring Ground-truth from Biodiversity Rich Audio Datasets Using Intelligent Sampling Techniques (Papers Track) Abstract and authors: (click to expand) Abstract: The potential of passive acoustic monitoring (PAM) as a method to reveal the consequences of climate change on the biodiversity that makes up natural soundscapes can be undermined by the discrepancy between the low barrier of entry to acquire large field audio datasets and the higher barrier of acquiring reliable species level training, validation, and test subsets from the field audio. These subsets from a deployment are often required to verify any machine learning models used to assist researchers in understanding the local biodiversity, especially as many models convey promising results from various sources that may not translate to the collected field audio. Labeling such datasets is a resource intensive process due to the lack of experts capable of identifying bioacoustics at a species level as well as the overwhelming size of many PAM audiosets. To address this challenge, we have tested different sampling techniques on an audio dataset collected over a two-week long August audio array deployment on the Scripps Coastal Reserve (SCR) Biodiversity Trail in La Jolla, California. These sampling techniques involve creating four subsets using stratified random sampling, limiting samples to the daily bird vocalization peaks, and using a hybrid convolutional neural network (CNN) and recurrent neural network (RNN) trained for bird presence/absence audio classification. We found that a stratified random sample baseline only achieved a bird presence rate of 44%, in contrast with a sample that randomly selected clips with high hybrid CNN-RNN predictions that were collected during bird activity peaks at dawn and dusk, yielding a bird presence rate of 95%. The significantly higher bird presence rate demonstrates how intelligent, machine learning-assisted selection of audio data can significantly reduce the amount of time that domain experts listen to audio without vocalizations of interest while building a ground truth for machine learning models. Authors: Jacob G Ayers (UC San Diego); Sean Perry (UC San Diego); Vaibhav Tiwari (UC San Diego); Mugen Blue (Cal Poly San Luis Obispo); Nishant Balaji (UC San Diego); Curt Schurgers (UC San Diego); Ryan Kastner (University of California San Diego); Mathias Tobler (San Diego Zoo Wildlife Alliance); Ian Ingram (San Diego Zoo Wildlife Alliance)
NeurIPS 2021	Two-phase training mitigates class imbalance for camera trap image classification with CNNs (Papers Track) Abstract and authors: (click to expand) Abstract: By leveraging deep learning to automatically classify camera trap images, ecologists can monitor biodiversity conservation efforts and the effects of climate change on ecosystems more efficiently. Due to the imbalanced class-distribution of camera trap datasets, current models are biased towards the majority classes. As a result, they obtain good performance for a few majority classes but poor performance for many minority classes. We used two-phase training to increase the performance for these minority classes. We trained, next to a baseline model, four models that each implemented a different version of two-phase training on a subset of the highly imbalanced Snapshot Serengeti dataset. Our results suggest that two-phase training can improve performance for many minority classes, with limited loss in performance for the other classes. We find that two-phase training based on majority undersampling increases class-specific F1-scores up to 3.0%. We also find that two-phase training outperforms using only oversampling or undersampling by 6.1% in F1-score on average. Finally, we find that a combination of over- and undersampling leads to a better performance than using them individually. Authors: Farjad Malik (KU Leuven); Simon Wouters (KU Leuven); Ruben Cartuyvels (KULeuven); Erfan Ghadery (KU Leuven); Sien Moens (KU Leuven)
NeurIPS 2021	Machine learning-enabled model-data integration for predicting subsurface water storage (Proposals Track) Abstract and authors: (click to expand) Abstract: Subsurface water storage (SWS) is a key variable of the climate system and a storage component for precipitation and radiation anomalies, inducing persistence in the climate system. It plays a critical role in climate-change projections and can mitigate the impacts of climate change on ecosystems. However, because the underground is difficult to access, the hydrologic properties and dynamics of SWS are poorly known. Direct observations of SWS are limited, and accurate incorporation of SWS dynamics into Earth system land models remains challenging. We propose a machine learning-enabled model-data integration framework to improve SWS prediction at local to CONUS scales in a changing climate by leveraging all the available observation and simulation resources, as well as to inform model development and guide observation collection. Accurate prediction will enable optimal decisions on water management and land use and improve the ecosystem's resilience to climate change. Authors: Dan Lu (Oak Ridge National Laboratory); Eric Pierce (Oak Ridge National Laboratory); Shih-Chieh Kao (Oak Ridge National Laboratory); David Womble (Oak Ridge National Laboratory); Li Li (Pennsylvania State University); Daniella Rempe (The University of Texas at Austin)
NeurIPS 2021	Toward Foundation Models for Earth Monitoring: Proposal for a Climate Change Benchmark (Proposals Track) Abstract and authors: (click to expand) Abstract: Recent progress in self-supervision shows that pre-training large neural networks on vast amounts of unsupervised data can lead to impressive increases in generalisation for downstream tasks. Such models, recently coined as foundation models, have been transformational to the field of natural language processing. While similar models have also been trained on large corpuses of images, they are not well suited for remote sensing data. To stimulate the development of foundation models for Earth monitoring, we propose to develop a new benchmark comprised of a variety of downstream tasks related to climate change. We believe that this can lead to substantial improvements in many existing applications and facilitate the development of new applications. This proposal is also a call for collaboration with the aim of developing a better evaluation process to mitigate potential downsides of foundation models for Earth monitoring. Authors: Alexandre Lacoste (ServiceNow); Evan D Sherwin (Stanford University, Energy and Resources Engineering); Hannah R Kerner (University of Maryland); Hamed Alemohammad (Radiant Earth Foundation); Björn Lütjens (MIT); Jeremy A Irvin (Stanford); David Dao (ETH Zurich); Alex Chang (Service Now); Mehmet Gunturkun (Element Ai); Alexandre Drouin (ServiceNow); Pau Rodriguez (Element AI); David Vazquez (ServiceNow)
ICML 2021	Forest Terrain Identification using Semantic Segmentation on UAV Images (Papers Track) Abstract and authors: (click to expand) Abstract: Beavers' habitat is known to alter the terrain, providing biodiversity in the area, and recently their lifestyle has been linked to climatic changes through a reduction of greenhouse gas levels in the region. To analyse the impact of beavers' habitat on the region, it is therefore necessary to estimate the terrain alterations caused by beaver actions. Furthermore, such terrain analysis can also play an important role in domains like wildlife ecology, deforestation, land-cover estimations, and geological mapping. Deep learning models are known to provide better estimates on automatic feature identification and classification of a terrain. However, such models require significant training data. Pre-existing terrain datasets (both real and synthetic) like CityScapes, PASCAL, UAVID, etc., are mostly concentrated on urban areas and include roads, pathways, buildings, etc. Such datasets are therefore unsuitable for forest terrain analysis. This paper contributes a finely labelled novel dataset of forest imagery around beavers' habitat, captured from a high-resolution camera on an aerial drone. The dataset consists of 100 such images labelled and classified based on 9 different classes. Furthermore, a baseline is established on this dataset using state-of-the-art semantic segmentation models based on performance metrics including Intersection Over Union (IoU), Overall Accuracy (OA), and F1 score. Authors: Muhammad Umar (Anglia Ruskin University); Lakshmi Babu Saheer (Anglia Ruskin University); Javad Zarrin (Anglia Ruskin University)
ICML 2021	Challenges in Applying Audio Classification Models to Datasets Containing Crucial Biodiversity Information (Papers Track) Abstract and authors: (click to expand) Abstract: The acoustic signature of a natural soundscape can reveal consequences of climate change on biodiversity. Hardware costs, human labor time, and expertise dedicated to labeling audio are impediments to conducting acoustic surveys across a representative portion of an ecosystem. These barriers are quickly eroding away with the advent of low-cost, easy to use, open source hardware and the expansion of the machine learning field providing pre-trained neural networks to test on retrieved acoustic data. One consistent challenge in passive acoustic monitoring (PAM) is a lack of reliability from neural networks on audio recordings collected in the field that contain crucial biodiversity information, even when those networks show promising results on publicly available training and test sets. To demonstrate this challenge, we tested a hybrid recurrent neural network (RNN) and convolutional neural network (CNN) binary classifier trained for bird presence/absence on two Peruvian bird audiosets. The RNN achieved an area under the receiver operating characteristic curve (AUROC) of 95% on a dataset collected from Xeno-canto and Google's AudioSet ontology, in contrast to 65% across a stratified random sample of field recordings collected from the Madre de Dios region of the Peruvian Amazon. In an attempt to alleviate this discrepancy, we applied various audio data augmentation techniques in the network's training process, which led to an AUROC of 77% across the field recordings. Authors: Jacob G Ayers (UC San Diego); Yaman Jandali (University of California, San Diego); Yoo-Jin Hwang (Harvey Mudd College); Erika Joun (University of California, San Diego); Gabriel Steinberg (Binghamton University); Mathias Tobler (San Diego Zoo Wildlife Alliance); Ian Ingram (San Diego Zoo Wildlife Alliance); Ryan Kastner (University of California San Diego); Curt Schurgers (University of California San Diego)
ICML 2021	Modeling Bird Migration by Disaggregating Population Level Observations (Papers Track) Abstract and authors: (click to expand) Abstract: Birds are shifting migratory routes and timing in response to climate change, but modeling migration to better understand these changes is difficult. Some recent work leverages fluid dynamics models, but this requires individual flight speed and directional data which may not be readily available. We developed an alternate modeling method which only requires population level positional data and use it to model migration routes of the American Woodcock (Scolopax minor). We use our model to sample simulated bird trajectories and compare them to real trajectories in order to evaluate the model. Authors: Miguel Fuentes (University of Massachusetts, Amherst); Benjamin Van Doren (Cornell University); Daniel Sheldon (University of Massachusetts, Amherst)
ICML 2021	Forecasting Sea Ice Concentrations using Attention-based Ensemble LSTM (Papers Track) Abstract and authors: (click to expand) Abstract: Accurately forecasting Arctic sea ice from sub-seasonal to seasonal scales has been a major scientific effort with fundamental challenges at play. In addition to physics-based earth system models, researchers have been applying multiple statistical and machine learning models for sea ice forecasting. Looking at the potential of data-driven sea ice forecasting, we propose an attention-based Long Short Term Memory (LSTM) ensemble method to predict monthly sea ice extent up to 1 month ahead. Using daily and monthly satellite retrieved sea ice data from NSIDC and atmospheric and oceanic variables from ERA5 reanalysis product for 39 years, we show that our multi-temporal ensemble method outperforms several baseline and recently proposed deep learning models. This will substantially improve our ability in predicting future Arctic sea ice changes, which is fundamental for forecasting transporting routes, resource development, coastal erosion, threats to Arctic coastal communities and wildlife. Authors: Sahara Ali (University of Maryland, Baltimore County); Yiyi Huang (University of Maryland, Baltimore County); Xin Huang (University of Maryland, Baltimore County); Jianwu Wang (University of Maryland, Baltimore County)
ICML 2021	Preserving the integrity of the Canadian northern ecosystems through insights provided by reinforcement learning-based Arctic fox movement models (Proposals Track) Abstract and authors: (click to expand) Abstract: Realistic modeling of the movement of the Arctic fox, one of the main predators of the circumpolar world, is crucial to understand the processes governing the distribution of the Canadian Arctic biodiversity. Current methods, however, are unable to adequately account for complex behaviors as well as intra- and interspecific relationships. We propose to harness the potential of reinforcement learning to develop innovative models that will address these shortcomings and provide the backbone to predict how vertebrate communities may be affected by environmental changes in the Arctic, an essential step towards the elaboration of rational conservation actions. Authors: Catherine Villeneuve (Université Laval); Frédéric Dulude-De Broin (Université Laval); Pierre Legagneux (Université Laval); Dominique Berteaux (Université du Québec à Rimouski); Audrey Durand (Université Laval)
ICML 2021	On the Role of Spatial Clustering Algorithms in Building Species Distribution Models from Community Science Data (Proposals Track) Best Paper: Proposals Abstract and authors: (click to expand) Abstract: This paper discusses opportunities for developments in spatial clustering methods to help leverage broad scale community science data for building species distribution models (SDMs). SDMs are tools that inform the science and policy needed to mitigate the impacts of climate change on biodiversity. Community science data span spatial and temporal scales unachievable by expert surveys alone, but they lack the structure imposed in smaller scale studies to allow adjustments for observational biases. Spatial clustering approaches can construct the necessary structure after surveys have occurred, but more work is needed to ensure that they are effective for this purpose. In this proposal, we describe the role of spatial clustering for realizing the potential of large biodiversity datasets, how existing methods approach this problem, and ideas for future work. Authors: Mark Roth (Oregon State University); Tyler Hallman (Swiss Ornithological Institute); W. Douglas Robinson (Oregon State University); Rebecca Hutchinson (Oregon State University)
ICML 2021	Leveraging Domain Adaptation for Low-Resource Geospatial Machine Learning (Proposals Track) Abstract and authors: (click to expand) Abstract: Machine learning in remote sensing has matured alongside a proliferation in availability and resolution of geospatial imagery, but its utility is bottlenecked by the need for labeled data. What's more, many labeled geospatial datasets are specific to certain regions, instruments, or extreme weather events. We investigate the application of modern domain-adaptation to multiple proposed geospatial benchmarks, uncovering unique challenges and proposing solutions to them. Authors: John M Lynch (NC State University); Sam Wookey (Masterful AI)
NeurIPS 2020	Spatio-Temporal Learning for Feature Extraction in Time-Series Images (Papers Track) Abstract and authors: (click to expand) Abstract: Earth observation programs have provided highly useful information in global climate change research over the past few decades and greatly promoted its development, especially through providing biological, physical, and chemical parameters on a global scale. Programs such as Landsat, Sentinel, SPOT, and Pleiades can be used to acquire huge volumes of medium to high resolution images every day. In this work, we organize these data in time series and exploit both the temporal and spatial information they provide to generate accurate and up-to-date land cover maps that can be used to monitor vulnerable areas threatened by the ongoing climatic and anthropogenic global changes. For this purpose, we combine a fully convolutional neural network with a convolutional long short-term memory. Implementation details of the proposed spatio-temporal neural network architecture are described. Examples are provided for the monitoring of roads and mangrove forests on the West African coast. Authors: Gael Kamdem De Teyou (Huawei)
NeurIPS 2020	Mangrove Ecosystem Detection using Mixed-Resolution Imagery with a Hybrid-Convolutional Neural Network (Papers Track) Abstract and authors: (click to expand) Abstract: Mangrove forests are rich in biodiversity and are a large contributor to carbon sequestration critical in the fight against climate change. However, they are currently under threat from anthropogenic activities, so monitoring their health, extent, and productivity is vital to our ability to protect these important ecosystems. Traditionally, lower resolution satellite imagery or high resolution unmanned air vehicle (UAV) imagery has been used independently to monitor mangrove extent, both offering helpful features to predict mangrove extent. To take advantage of both of these data sources, we propose the use of a hybrid neural network, which combines a Convolutional Neural Network (CNN) feature extractor with a Multilayer-Perceptron (MLP), to accurately detect mangrove areas using both medium resolution satellite and high resolution drone imagery. We present a comparison of our novel Hybrid CNN with algorithms previously applied to mangrove image classification on a data set we collected of dwarf mangroves from consumer UAVs in Baja California Sur, Mexico, and show a 95% intersection over union (IOU) score for mangrove image classification, outperforming all our baselines. Authors: Dillon Hicks (Engineers for Exploration); Ryan Kastner (University of California San Diego); Curt Schurgers (University of California San Diego); Astrid Hsu (University of California San Diego); Octavio Aburto (University of California San Diego)
NeurIPS 2020	Movement Tracks for the Automatic Detection of Fish Behavior in Videos (Papers Track) Abstract and authors: (click to expand) Abstract: Global warming is predicted to profoundly impact ocean ecosystems. Fish behavior is an important indicator of changes in such marine environments. Thus, the automatic identification of key fish behavior in videos represents a much needed tool for marine researchers, enabling them to study climate change-related phenomena. We offer a dataset of sablefish (Anoplopoma fimbria) startle behaviors in underwater videos, and investigate the use of deep learning (DL) methods for behavior detection on it. Our proposed detection system identifies fish instances using DL-based frameworks, determines trajectory tracks, derives novel behavior-specific features, and employs Long Short-Term Memory (LSTM) networks to identify startle behavior in sablefish. Its performance is studied by comparing it with a state-of-the-art DL-based video event detector. Authors: Declan GD McIntosh (University Of Victoria); Tunai Porto Marques (University of Victoria); Alexandra Branzan Albu (University of Victoria); Rodney Rountree (University of Victoria); Fabio De Leo Cabrera (Ocean Networks Canada)
NeurIPS 2020	Counting Cows: Tracking Illegal Cattle Ranching From High-Resolution Satellite Imagery (Papers Track) Abstract and authors: (click to expand) Abstract: Cattle farming is responsible for 8.8% of greenhouse gas emissions worldwide. In addition to the methane emitted due to their digestive process, the growing need for grazing areas is an important driver of deforestation. While some regulations are in place for preserving the Amazon against deforestation, these are being flouted in various ways. Hence the need to scale and automate the monitoring of cattle ranching activities. Through a partnership with Anonymous under review, we explore the feasibility of tracking and counting cattle at the continental scale from satellite imagery. With a license from Maxar Technologies, we obtained satellite imagery of the Amazon at 40cm resolution, and compiled a dataset of 903 images containing a total of 28498 cattle. Our experiments show promising results and highlight important directions for the next steps on both counting algorithms and the data collection processes for solving such challenges. Authors: Issam Hadj Laradji (Element AI); Pau Rodriguez (Element AI); Alfredo Kalaitzis (University of Oxford); David Vazquez (Element AI); Ross Young (Element AI); Ed Davey (Global Witness); Alexandre Lacoste (Element AI)
NeurIPS 2020	EarthNet2021: A novel large-scale dataset and challenge for forecasting localized climate impacts (Papers Track) Abstract and authors: (click to expand) Abstract: Climate change is global, yet its concrete impacts can strongly vary between different locations in the same region. Seasonal weather forecasts currently operate at the mesoscale (> 1 km). For more targeted mitigation and adaptation, modelling impacts to < 100 m is needed. Yet, the relationship between driving variables and Earth’s surface at such local scales remains unresolved by current physical models. Large Earth observation datasets now enable us to create machine learning models capable of translating coarse weather information into high-resolution Earth surface forecasts encompassing localized climate impacts. Here, we define high-resolution Earth surface forecasting as video prediction of satellite imagery conditional on mesoscale weather forecasts. Video prediction has been tackled with deep learning models. Developing such models requires analysis-ready datasets. We introduce EarthNet2021, a new, curated dataset containing target spatio-temporal Sentinel 2 satellite imagery at 20 m resolution, matched with high-resolution topography and mesoscale (1.28 km) weather variables. With over 32000 samples it is suitable for training deep neural networks. Comparing multiple Earth surface forecasts is not trivial. Hence, we define the EarthNetScore, a novel ranking criterion for models forecasting Earth surface reflectance. For model intercomparison we frame EarthNet2021 as a challenge with four tracks based on different test sets. These allow evaluation of model validity and robustness as well as model applicability to extreme events and the complete annual vegetation cycle. 
In addition to forecasting directly observable weather impacts through satellite-derived vegetation indices, capable Earth surface models will enable downstream applications such as crop yield prediction, forest health assessments, coastline management, or biodiversity monitoring. Find data, code, and how to participate at www.earthnet.tech . Authors: Christian Requena-Mesa (Computer Vision Group, Friedrich Schiller University Jena; DLR Institute of Data Science, Jena; Max Planck Institute for Biogeochemistry, Jena); Vitus Benson (Max-Planck-Institute for Biogeochemistry); Jakob Runge (Institute of Data Science, German Aerospace Center (DLR)); Joachim Denzler (Computer Vision Group, Friedrich Schiller University Jena, Germany); Markus Reichstein (Max Planck Institute for Biogeochemistry, Jena; Michael Stifel Center Jena for Data-Driven and Simulation Science, Jena)
NeurIPS 2020	VConstruct: Filling Gaps in Chl-a Data Using a Variational Autoencoder (Papers Track) Abstract and authors: (click to expand) Abstract: Remote sensing of Chlorophyll-a is vital in monitoring climate change. Chlorophyll-a measurements give us an idea of the algae concentrations in the ocean, which lets us monitor ocean health. However, a common problem is that the satellites used to gather the data are often obstructed by clouds and other artifacts. This means that time series data from satellites can suffer from spatial data loss. There are a number of algorithms that are able to reconstruct the missing parts of these images to varying degrees of accuracy, with Data INterpolating Empirical Orthogonal Functions (DINEOF) being the current standard. However, DINEOF is slow, suffers from accuracy loss in temporally homogeneous waters, is reliant on temporal data, and is only able to generate a single potential reconstruction. We propose a machine learning approach to reconstruction of Chlorophyll-a data using a Variational Autoencoder (VAE). Our accuracy results to date are competitive with but slightly less accurate than DINEOF. We show the benefits of our method including vastly decreased computation time and ability to generate multiple potential reconstructions. Lastly, we outline our planned improvements and future work. Authors: Matthew Ehrler (University of Victoria); Neil Ernst (University of Victoria)
NeurIPS 2020	A Comparison of Data-Driven Models for Predicting Stream Water Temperature (Papers Track) Abstract and authors: (click to expand) Abstract: Changes to the Earth's climate are expected to negatively impact water resources in the future. It is important to have accurate modelling of river flow and water quality to make optimal decisions for water management. Machine learning and deep learning models have become promising methods for making such hydrological predictions. Using these models, however, requires careful consideration both of data constraints and of model complexity for a given problem. Here, we use machine learning (ML) models to predict monthly stream water temperature records at three monitoring locations in the Northwestern United States with long-term datasets, using meteorological data as predictors. We fit three ML models: a Multiple Linear Regression, a Random Forest Regression, and a Support Vector Regression, and compare them against two baseline models: a persistence model and historical model. We show that all three ML models are reasonably able to predict mean monthly stream temperatures with root mean-squared errors (RMSE) ranging from 0.63-0.91 degrees Celsius. Of the three ML models, Support Vector Regression performs the best with an error of 0.63-0.75 degrees Celsius. However, all models perform poorly on extreme values of water temperature. We identify the need for machine learning approaches for predicting extreme values for variables such as water temperature, since it has significant implications for stream ecosystems and biota. Authors: Helen Weierbach (Lawrence Berkeley); Aranildo Lima (Aquatic Informatics); Danielle Christianson (Lawrence Berkeley National Lab); Boris Faybishenko (Lawrence Berkeley National Lab); Val Hendrix (Lawrence Berkeley National Lab); Charuleka Varadharajan (Lawrence Berkeley National Lab)
NeurIPS 2020	Automated Salmonid Counting in Sonar Data (Papers Track) Abstract and authors: (click to expand) Abstract: The prosperity of salmonids is crucial for several ecological and economic functions. Accurately counting spawning salmonids during their seasonal migration is essential in monitoring threatened populations, assessing the efficacy of recovery strategies, guiding fishing season regulations, and supporting the management of commercial and recreational fisheries. While several different methods exist for counting river fish, they all rely heavily on human involvement, introducing a hefty financial and time burden. In this paper we present an automated fish counting method that utilizes data captured from ARIS sonar cameras to detect and track salmonids migrating in rivers. Our results show that our fully automated system has a 19.3% per-clip error when compared to human counting performance. There is room to improve, but our system can already decrease the amount of time field biologists and fishery managers need to spend manually watching ARIS clips. Authors: Peter Kulits (Caltech); Angelina Pan (Caltech); Sara M Beery (Caltech); Erik Young (Trout Unlimited); Pietro Perona (California Institute of Technology); Grant Van Horn (Cornell University)
NeurIPS 2020	Machine Learning for Glacier Monitoring in the Hindu Kush Himalaya (Papers Track) Abstract and authors: (click to expand) Abstract: Glacier mapping is key to ecological monitoring in the Hindu Kush Himalaya region. Climate change poses a risk to individuals whose livelihoods depend on the health of glacier ecosystems. In this work, we present a machine learning based approach to support ecological monitoring, with a focus on glaciers. Our approach is based on semi-automated mapping from satellite images. We utilize readily available remote sensing data to create a model to identify and outline both clean ice and debris-covered glaciers from satellite imagery. We also release data and develop a web tool that allows experts to visualize and correct model predictions, with the ultimate aim of accelerating the glacier mapping process. Authors: Shimaa Baraka (Mila); Benjamin Akera (Makerere University); Bibek Aryal (The University of Texas at El Paso); Tenzing Sherpa (International Centre for Integrated Mountain Development); Finu Shrestha (International Centre for Integrated Mountain Development); Anthony Ortiz (Microsoft); Kris Sankaran (University of Wisconsin-Madison); Juan M Lavista Ferres (Microsoft); Mir A Matin (International Center for Integrated Mountain Development); Yoshua Bengio (Mila)
NeurIPS 2020	Investigating two super-resolution methods for downscaling precipitation: ESRGAN and CAR (Papers Track) Abstract and authors: (click to expand) Abstract: In an effort to provide optimal inputs to downstream modeling systems (e.g., a hydrodynamics model that simulates the water circulation of a lake), we hereby strive to enhance the resolution of precipitation fields from a weather model by up to 9x. We test two super-resolution models: the enhanced super-resolution generative adversarial networks (ESRGAN) proposed in 2017, and the content adaptive resampler (CAR) proposed in 2020. Both models outperform simple bicubic interpolation, with the ESRGAN exceeding expectations for accuracy. We make several proposals for extending the work to ensure it can be a useful tool for quantifying the impact of climate change on local ecosystems while removing reliance on energy-intensive, high-resolution weather model simulations. Authors: Campbell Watson (IBM); Chulin Wang (Northwestern University); Tim Lynar (University of New South Wales); Komminist Weldemariam (IBM Research)
NeurIPS 2020	Spatiotemporal Features Improve Fine-Grained Butterfly Image Classification (Papers Track) Abstract and authors: (click to expand) Abstract: Understanding the changing distributions of butterflies gives insight into the impacts of climate change across ecosystems and is a prerequisite for conservation efforts. eButterfly is a citizen science website created to allow people to track the butterfly species around them and use these observations to contribute to research. However, correctly identifying butterfly species is a challenging task for non-specialists and currently requires the involvement of entomologists to verify the labels of novice users on the website. We have developed a computer vision model to label butterfly images from eButterfly automatically, decreasing the need for human experts. We employ a model that incorporates geographic and temporal information of where and when the image was taken, in addition to the image itself. We show that we can successfully apply this spatiotemporal model for fine-grained image recognition, significantly improving the accuracy of our classification model compared to a baseline image recognition system trained on the same dataset. Authors: Marta Skreta (University of Toronto); Sasha Luccioni (Mila); David Rolnick (McGill University, Mila)
NeurIPS 2020	Towards Data-Driven Physics-Informed Global Precipitation Forecasting from Satellite Imagery (Papers Track) Abstract and authors: (click to expand) Abstract: Under the effects of global warming, extreme events such as floods and droughts are increasing in frequency and intensity. This trend directly affects communities and makes widening access to accurate precipitation forecasting systems for disaster preparedness all the more urgent. Nowadays, weather forecasting relies on numerical models necessitating massive computing resources that most developing countries cannot afford. Machine learning approaches are still in their infancy but already show promise for democratizing weather predictions, by leveraging any data source and requiring less compute. In this work, we propose a methodology for data-driven and physics-aware global precipitation forecasting from satellite imagery. To fully take advantage of the available data, we design the system as three elements: 1. The atmospheric state is estimated from recent satellite data. 2. The atmospheric state is propagated forward in time. 3. The atmospheric state is used to derive the precipitation intensity within a nearby time interval. In particular, our use of stochastic methods for forecasting the atmospheric state represents a novel application in this domain. Authors: Valentina Zantedeschi (GE Global Research); Daniele De Martini (University of Oxford); Catherine Tong (University of Oxford); Christian A Schroeder de Witt (University of Oxford); Piotr Bilinski (University of Warsaw / University of Oxford); Alfredo Kalaitzis (University of Oxford); Matthew Chantry (University of Oxford); Duncan Watson-Parris (University of Oxford)
NeurIPS 2020	Hyperspectral Remote Sensing of Aquatic Microbes to Support Water Resource Management (Proposals Track) Abstract and authors: (click to expand) Abstract: Harmful algal blooms in drinking water supply and at recreational sites endanger human health. Excessive algal growth can result in low oxygen environments, making them uninhabitable for fish and other aquatic life. Harmful algae and algal blooms are predicted to increase in frequency and extent due to the warming climate, but microbial dynamics remain difficult to predict. Existing satellite remote sensing monitoring technologies are ill-equipped to discriminate harmful algae, while models do not adequately capture the complex controls on algal populations. This proposal explores the potential for Bayesian neural networks to detect phytoplankton pigments from hyperspectral remote sensing reflectance retrievals. Once developed, such a model could enable hyperspectral remote sensing retrievals to support decision making in water resource management as more advanced ocean color satellites are launched in the coming decade. While uncertainty quantification motivates the proposed use of Bayesian models, the interpretation of these uncertainties in an operational context must be carefully considered. Authors: Grace E Kim (Booz Allen Hamilton); Evan Poworoznek (NASA GSFC); Susanne Craig (NASA GSFC)
NeurIPS 2020	Artificial Intelligence, Machine Learning and Modeling for Understanding the Oceans and Climate Change (Proposals Track) Abstract and authors: (click to expand) Abstract: These changes will have a drastic impact on almost all forms of life in the ocean, with further consequences for food security and ecosystem services in coastal and inland communities. Despite these impacts, scientific data and infrastructures are still lacking to understand and quantify the consequences of these perturbations on the marine ecosystem. Understanding this phenomenon is not only an urgent but also a scientifically demanding task. Consequently, it is a problem that must be addressed with a scientific cohort approach, where multi-disciplinary teams collaborate to bring the best of different scientific areas. In this proposal paper, we describe our newly launched four-year project focused on developing new artificial intelligence, machine learning, and mathematical modeling tools to contribute to the understanding of the structure, functioning, and underlying mechanisms and dynamics of the global ocean symbiome and its relation with climate change. These actions should enable the understanding of our oceans and help predict and mitigate the consequences of climate change. Authors: Nayat Sánchez Pi (Inria); Luis Martí (Inria); André Abreu (Fondation Tara Océans); Olivier Bernard (Inria); Colomban de Vargas (CNRS); Damien Eveillard (Univ. Nantes); Alejandro Maass (CMM, U. Chile); Pablo Marquet (PUC); Jacques Sainte-Marie (Inria); Julien Salomin (Inria); Marc Schoenauer (INRIA); Michele Sebag (LRI, CNRS, France)
ICLR 2020	SolarNet: A Deep Learning Framework to Map Solar Plants In China From Satellite Imagery (Papers Track) Abstract and authors: (click to expand) Abstract: Renewable energy such as solar power is critical to fighting ever more serious climate change, and how to effectively detect renewable energy installations has become an important issue for governments. In this paper, we propose a deep learning framework named SolarNet, which is designed to perform semantic segmentation on large-scale satellite imagery data to detect solar farms. SolarNet has successfully mapped 439 solar farms in China, covering nearly 2000 square kilometers, equivalent to the size of the whole of Shenzhen city or two and a half times that of New York City. To the best of our knowledge, this is the first time that deep learning has been used to reveal the locations and sizes of solar farms in China, which could provide insights for solar power companies, climate finance and markets. Authors: Xin Hou (WeBank); Biao Wang (WeBank); Wanqi Hu (WeBank); Lei Yin (WeBank); Anbu Huang (WeBank); Haishan Wu (WeBank)
ICLR 2020	A CONTINUAL LEARNING APPROACH FOR LOCAL LEVEL ENVIRONMENTAL MONITORING IN LOW-RESOURCE SETTINGS (Papers Track) Abstract and authors: (click to expand) Abstract: An increasingly important dimension in the quest for mitigation and monitoring of environmental change is the role of citizens. The crowd-based monitoring of local level anthropogenic alterations is essential towards measurable changes in different contributing factors to climate change. With the proliferation of mobile technologies here in the African continent, it is useful to have machine learning based models that are deployed on mobile devices and that can learn continually from streams of data over extended time, possibly pertaining to different tasks of interest. In this paper, we demonstrate the localisation of deforestation indicators using lightweight models and extend to incorporate data about wildfires and smoke detection. The idea is to show the need and potential of continual learning approaches towards building robust models to track local environmental alterations. Authors: Arijit Patra (University of Oxford)
ICLR 2020	SMArtCast: Predicting soil moisture interpolations into the future using Earth observation data in a deep learning framework (Papers Track) Abstract and authors: (click to expand) Abstract: Soil moisture is a critical component of crop health, and monitoring it can enable further actions for increasing yield or preventing catastrophic die off. As climate change increases the likelihood of extreme weather events and reduces the predictability of weather, non-optimal soil moistures for crops may become more likely. In this work, we use a series of LSTM architectures to analyze measurements of soil moisture and vegetation indices derived from satellite imagery. The system learns to predict the future values of these measurements. These spatially sparse values and indices are used as input features to an interpolation method that infers spatially dense moisture maps at multiple depths for a future time point. This has the potential to provide advance warning for soil moistures that may be inhospitable to crops across an area with limited monitoring capacity. Authors: Conrad J Foley (Deep Planet); Sagar Vaze (deepplanet.ai); Mohamed El Amine Seddiq (Deep Planet); Aleksei Unagaev (Deep Planet); Natalia Efremova (University of Oxford)
ICLR 2020	TrueBranch: Metric Learning-based Verification of Forest Conservation Projects (Proposals Track) Best Proposal Award Abstract and authors: (click to expand) Abstract: International stakeholders increasingly invest in offsetting carbon emissions, for example, via issuing Payments for Ecosystem Services (PES) to forest conservation projects. Issuing trusted payments requires a transparent monitoring, reporting, and verification (MRV) process of the ecosystem services (e.g., carbon stored in forests). The current MRV process, however, is either too expensive (on-ground inspection of forest) or inaccurate (satellite). Recent works propose low-cost and accurate MRV via automatically determining forest carbon from drone imagery, collected by the landowners. The automation of MRV, however, opens up the possibility that landowners report untruthful drone imagery. To be robust against untruthful reporting, we propose TrueBranch, a metric learning-based algorithm that verifies the truthfulness of drone imagery from forest conservation projects. TrueBranch aims to detect untruthfully reported drone imagery by matching it with public satellite imagery. Preliminary results suggest that nominal distance metrics are not sufficient to reliably detect untruthfully reported imagery. TrueBranch leverages a method from metric learning to create a feature embedding in which truthfully and untruthfully collected imagery is easily distinguishable by distance thresholding. Authors: Simona Santamaria (ETH Zurich); David Dao (ETH Zurich); Björn Lütjens (MIT); Ce Zhang (ETH)
ICLR 2020	Using ML to close the vocabulary gap in the context of environment and climate change in Chichewa (Proposals Track) Abstract and authors: (click to expand) Abstract: In the west, alienation from nature and deteriorating opportunities to experience it have led educators to incorporate educational programs in schools, to bring pupils in contact with nature and to enhance their understanding of issues related to the environment and its protection. In Africa, and in Malawi, where most people engage in agriculture and spend most of their time in the 'outdoors', alienation from nature is happening too, although in different ways. A large portion of the indigenous vocabulary and knowledge remains unknown or is slowly disappearing, and there is a need to build a glossary of terms regarding environment and climate change in the vernacular to improve the dialog regarding climate change and environmental protection. We believe that ML has a role to play in closing the ‘vocabulary gap’ of terms and concepts regarding the environment and climate change that exists in Chichewa and other Malawian languages by helping to create a visual dictionary of key terms used to describe the environment and explain the issues involved in climate change and their meaning. Chichewa is a descriptive language; one English term may be translated using several words. Thus, the task is not to detect just literal translations, but also translations by means of ‘descriptions’ and illustrations, and thus to extract correspondence between terms and definitions and to measure how appropriate a term is to convey the meaning intended. As part of this project, ML can be used to identify ‘loanword patterns’, which may be useful in understanding the transmission of cultural items. Authors: Amelia Taylor (University of Malawi, The Polytechnic)
NeurIPS 2019	Natural Language Generation for Operations and Maintenance in Wind Turbines (Papers Track) Abstract and authors: (click to expand) Abstract: Wind energy is one of the fastest-growing sustainable energy sources in the world but relies crucially on efficient and effective operations and maintenance to generate sufficient amounts of energy and reduce downtime of wind turbines and associated costs. Machine learning has been applied to predict faults in wind turbines, but these predictions have not been supported by suggestions on how to avert and fix occurring errors. In this paper, we present a data-to-text generation system utilising transformers to produce event descriptions of turbine faults from SCADA data capturing the operational status of turbines, and proposing maintenance strategies. Experiments show that our model learns reasonable feature representations that correspond to expert judgements. We anticipate that in making a contribution to the reliability of wind energy, we can encourage more organisations to switch to sustainable energy sources and help combat climate change. Authors: Joyjit Chatterjee (University of Hull); Nina Dethlefs (University of Hull)
NeurIPS 2019	Human-Machine Collaboration for Fast Land Cover Mapping (Papers Track) Abstract and authors: (click to expand) Abstract: We propose incorporating human labelers in a model fine-tuning system that provides immediate user feedback. In our framework, human labelers can interactively query model predictions on unlabeled data, choose which data to label, and see the resulting effect on the model's predictions. This bi-directional feedback loop allows humans to learn how the model responds to new data. Our hypothesis is that this rich feedback allows human labelers to create mental models that enable them to better choose which biases to introduce to the model. We implement this framework for fine-tuning high-resolution land cover segmentation models and evaluate it against traditional active learning based approaches. More specifically, we fine-tune a deep neural network -- trained to segment high-resolution aerial imagery into different land cover classes in Maryland, USA -- to a new spatial area in New York, USA. We find that the tight loop turns the algorithm and the human operator into a hybrid system that can produce land cover maps of large areas more efficiently than the traditional workflows. Authors: Caleb Robinson (Georgia Institute of Technology); Anthony Ortiz (University of Texas at El Paso); Nikolay Malkin (Yale University); Blake Elias (Microsoft); Andi Peng (Microsoft); Dan Morris (Microsoft); Bistra Dilkina (University of Southern California); Nebojsa Jojic (Microsoft Research)
NeurIPS 2019	VideoGasNet: Deep Learning for Natural Gas Methane Leak Classification Using An Infrared Camera (Papers Track) Abstract and authors: (click to expand) Abstract: Mitigating methane leakage from the natural gas system has become an increasing concern for climate change. Efficacious methane leak detection and classification can make the mitigation process more efficient and cost effective. Optical gas imaging is widely used for the purpose of leak detection, but it cannot directly provide detection results and leak sizes. Few studies have examined the possibility of leak classification using videos taken by the infrared camera (IR), an optical gas imaging device. In this study, we consider the leak classification problem as a video classification problem and investigate the application of deep learning techniques in methane leak detection. Firstly, we collected the first methane leak video dataset - GasVid, which has ~1 M frames of labeled videos of methane leaks from different leaking equipment, covering a wide range of leak sizes (5.3-2051.6 g CH4/h) and imaging distances (4.6-15.6 m). Secondly, we studied three deep learning algorithms: a 2D Convolutional Neural Network (CNN) model, a 3D CNN, and the Convolutional Long Short Term Memory (ConvLSTM). We find that the 3D CNN is the most outstanding and robust architecture, which we name VideoGasNet. The leak-non-leak detection accuracy can reach 100%, and the highest small-medium-large classification accuracy is 78.2% with our 3D CNN network. In summary, VideoGasNet greatly extends the capabilities of IR camera-based leak monitoring systems from leak detection only to automated leak classification with high accuracy and fast processing speed, significantly improving mitigation efficiency. Authors: Jingfan Wang (Stanford University)
NeurIPS 2019	A Deep Learning-based Framework for the Detection of Schools of Herring in Echograms (Papers Track) Abstract and authors: (click to expand) Abstract: Tracking the abundance of underwater species is crucial for understanding the effects of climate change on marine ecosystems. Biologists typically monitor underwater sites with echosounders and visualize data as 2D images (echograms); they interpret these data manually or semi-automatically, which is time-consuming and prone to inconsistencies. This paper proposes a deep learning framework for the automatic detection of schools of herring from echograms. Experiments demonstrated that our approach outperforms a traditional machine learning algorithm that uses hand-crafted features. Our framework could easily be expanded to detect more species of interest to sustainable fisheries. Authors: Alireza Rezvanifar (University of Victoria); Tunai Porto Marques (University of Victoria); Melissa Cote (University of Victoria); Alexandra Branzan Albu (University of Victoria); Alex Slonimer (ASL Environmental Sciences); Thomas Tolhurst (ASL Environmental Sciences); Kaan Ersahin (ASL Environmental Sciences); Todd Mudge (ASL Environmental Sciences); Stephane Gauthier (Fisheries and Oceans Canada)
NeurIPS 2019	Emulating Numeric Hydroclimate Models with Physics-Informed cGANs (Papers Track) Honorable Mention Abstract and authors: (click to expand) Abstract: Process-based numerical simulations, including those for climate modeling applications, are compute and resource intensive, requiring extensive customization and hand-engineering for encoding governing equations and other domain knowledge. On the other hand, modern deep learning employs a significantly simpler and more efficient computational workflow, and has shown impressive results across a myriad of applications in the computational sciences. In this work, we investigate the potential of deep generative learning models, specifically conditional Generative Adversarial Networks (cGANs), to simulate the output of a physics-based model of the spatial distribution of the water content of mountain snowpack - the snow water equivalent (SWE). We show preliminary results indicating that the cGAN model is able to learn diverse mappings between meteorological forcings and SWE output. Thus physics-based cGANs provide a means for fast and accurate SWE modeling that can have significant impact in a variety of applications (e.g., hydropower forecasting, agriculture, and water supply management). In climate science, the snowpack and SWE are seen as some of the best indicative variables for investigating climate change and its impact. The massive speedups, diverse sampling, and sensitivity/saliency modelling that cGANs can bring to SWE estimation will be extremely important to investigating variables linked to climate change as well as predicting and forecasting the potential effects of climate change to come. Authors: Ashray Manepalli (terrafuse); Adrian Albert (terrafuse, inc.); Alan Rhoades (Lawrence Berkeley National Lab); Daniel Feldman (Lawrence Berkeley National Lab)
NeurIPS 2019	FutureArctic - beyond Computational Ecology (Proposals Track) Abstract and authors: (click to expand) Abstract: This paper presents the Future Arctic initiative, a multi-disciplinary training network where machine learning researchers and ecologists cooperatively study both long- and short-term responses to future climate in Iceland. Authors: Steven Latre (UAntwerpen); Dimitri Papadimitriou (UAntwerpen); Ivan Janssens (UAntwerpen); Eric Struyf (UAntwerpen); Erik Verbruggen (UAntwerpen); Ivika Ostonen (UT); Josep Penuelas (UAB); Boris Rewald (RootEcology); Andreas Richter (University of Vienna); Michael Bahn (University of Innsbruck)
NeurIPS 2019	Autonomous Sensing and Scientific Machine Learning for Monitoring Greenhouse Gas Emissions (Proposals Track) Abstract and authors: (click to expand) Abstract: Greenhouse gas emissions are a key driver of climate change. In order to develop and tune climate models, measurements of natural and anthropogenic phenomena are necessary. Traditional methods (i.e., physical sample collection and ex situ analysis) tend to be sample sparse and low resolution, whereas global remote sensing methods tend to miss small- and mid-scale dynamic phenomena. In situ instrumentation carried by a robotic platform is suited to study greenhouse gas emissions at unprecedented spatial and temporal resolution. However, collecting scientifically rich datasets of dynamic or transient emission events requires accurate and flexible models of gas emission dynamics. Motivated by applications in seasonal Arctic thawing and volcanic outgassing, we propose the use of scientific machine learning, in which traditional scientific models (in the form of ODEs/PDEs) are combined with machine learning techniques (generally neural networks) to better incorporate data into a structured, interpretable model. Our technical contributions will primarily involve developing these hybrid models and leveraging model uncertainty estimates during sensor planning to collect data that efficiently improves gas emission models in small-data domains. Authors: Genevieve Flaspohler (MIT); Victoria Preston (MIT); Nicholas Roy (MIT); John Fisher (MIT); Adam Soule (Woods Hole Oceanographic Institution); Anna Michel (Woods Hole Oceanographic Institution)
ICML 2019	Policy Search with Non-uniform State Representations for Environmental Sampling (Research Track) Abstract: Surveying fragile ecosystems like coral reefs is important to monitor the effects of climate change. We present an adaptive sampling technique that generates efficient trajectories covering hotspots in the region of interest at a high rate. A key feature of our sampling algorithm is the ability to generate action plans for any new hotspot distribution using the parameters learned on other similar-looking distributions. Authors: Sandeep Manjanna (McGill University); Herke van Hoof (University of Amsterdam); Gregory Dudek (McGill University)
ICML 2019	Mapping land use and land cover changes faster and at scale with deep learning on the cloud (Research Track) Abstract: Policymakers rely on Land Use and Land Cover (LULC) maps for evaluation and planning. They use these maps to plan climate-smart agriculture policy, improve housing resilience (to earthquakes or other natural disasters), and understand how to grow commerce in small communities. A number of institutions have created global land use maps from historic satellite imagery. However, these maps can be outdated and are often inaccurate, particularly in their representation of developing countries. We worked with the European Space Agency (ESA) to develop a LULC deep learning workflow on the cloud that can ingest Sentinel-2 optical imagery for large-scale LULC change detection. It’s an end-to-end workflow that sits on top of two comprehensive tools, SentinelHub and eo-learn, which seamlessly link earth observation data with machine learning libraries. It takes in the labeled LULC and associated AOI as shapefiles and sets up a task to fetch cloud-free, time-series imagery stacks within the user-defined time interval. It then pairs each satellite imagery tile with its labeled LULC mask for supervised deep learning model training on the cloud. Once a well-performing model is trained, it can be exported as a TensorFlow/PyTorch serving Docker image to work with our cloud-based model inference pipeline. The inference pipeline can automatically scale with the number of images to be processed. Changes in land use are heavily influenced by human activities (e.g. agriculture, deforestation, human settlement expansion) and have been a major source of greenhouse gas emissions. Sustainable forest and land management practices vary from region to region, which means having flexible, scalable tools will be critical. With these tools, we can empower analysts, engineers, and decision-makers to see where contributions to climate-smart agricultural, forestry and urban resilience programs can be made. Authors: Zhuangfang Yi (Development Seed); Drew Bollinger (Development Seed); Devis Peressutti (Sinergise)
ICML 2019	Deep Learning for Wildlife Conservation and Restoration Efforts (Deployed Track) Abstract: Climate change and environmental degradation are causing species extinction worldwide. Automatic wildlife sensing is an urgent requirement to track biodiversity losses on Earth. Recent improvements in machine learning can accelerate the development of large-scale monitoring systems that would help track conservation outcomes and target efforts. In this paper, we present one such system we developed. 'Tidzam' is a Deep Learning framework for wildlife detection, identification, and geolocalization, designed for the Tidmarsh Wildlife Sanctuary, the site of the largest freshwater wetland restoration in Massachusetts. Authors: Clement Duhart (MIT Media Lab)
ICML 2019	Reinforcement Learning for Sustainable Agriculture (Ideas Track) Abstract: The growing population and the changing climate will push modern agriculture to its limits in an increasing number of regions on earth. Establishing next-generation sustainable food supply systems will mean producing more food on less arable land, while keeping the environmental impact to a minimum. Modern machine learning methods have achieved super-human performance on a variety of tasks, simply learning from the outcomes of their actions. We propose a path towards more sustainable agriculture, considering plant development an optimization problem with respect to certain parameters, such as yield and environmental impact, which can be optimized in an automated way. Specifically, we propose to use reinforcement learning to autonomously explore and learn ways of influencing the development of certain types of plants, controlling environmental parameters, such as irrigation or nutrient supply, and receiving sensory feedback, such as camera images, humidity, and moisture measurements. The trained system will thus be able to provide instructions for optimal treatment of a local population of plants, based on non-invasive measurements, such as imaging. Authors: Jonathan Binas (Mila, Montreal); Leonie Luginbuehl (Department of Plant Sciences, University of Cambridge); Yoshua Bengio (Mila)
ICML 2019	Harness the Power of Artificial intelligence and -Omics to Identify Soil Microbial Functions in Climate Change Projection (Ideas Track) Abstract: Contemporary Earth system models (ESMs) omit one of the significant drivers of the terrestrial carbon cycle: soil microbial communities. Soil microbial communities not only directly emit greenhouse gases into the atmosphere through respiration, but also release diverse enzymes that catalyze the decomposition of soil organic matter and determine nutrient availability for aboveground vegetation. Soil microbial communities therefore control terrestrial carbon dynamics and their feedbacks to climate. Currently, inadequate representation of soil microbial communities in ESMs introduces significant uncertainty into terrestrial carbon-climate feedbacks. Mitigating this uncertainty requires identifying the functions, diversity, and environmental adaptation of soil microbial communities under global climate change. The revolution in -omics technology allows high-throughput quantification of diverse soil enzymes, enabling large-scale studies of microbial functions in climate change. Such studies may lead to revolutionary solutions for predicting microbial-mediated climate-carbon feedbacks at the global scale based on gene-level environmental adaptation strategies of the microbial community. A key initial step in this direction is to identify the biogeography and environmental adaptation of soil enzyme functions based on the massive amount of data generated by -omics technologies. Here we propose to take this step. Artificial intelligence is a powerful, ideal tool for this leap forward. Our project is to integrate artificial intelligence technologies and global -omics data to represent climate controls on microbial enzyme functions and to map the biogeography of soil enzyme functional groups at the global scale. The outcome of this study will allow us to improve the representation of microbial function in earth system modeling and mitigate uncertainty in current climate projections. Authors: Yang Song (Oak Ridge National Lab); Dali Wang (Oak Ridge National Lab); Melanie Mayes (Oak Ridge National Lab)
