Ecosystems & Biodiversity

Workshop Papers

Venue Title
NeurIPS 2023 A machine learning pipeline for automated insect monitoring (Papers Track)

Abstract: Climate change and other anthropogenic factors have led to a catastrophic decline in insects, endangering both biodiversity and the ecosystem services on which human society depends. Data on insect abundance, however, remains woefully inadequate. Camera traps, conventionally used for monitoring terrestrial vertebrates, are now being modified for insects, especially moths. We describe a complete, open-source machine learning-based software pipeline for automated monitoring of moths via camera traps, including object detection, moth/non-moth classification, fine-grained identification of moth species, and tracking individuals. We believe that our tools, which are already in use across three continents, represent the future of massively scalable data collection in entomology.

Authors: Aditya Jain (Mila); Fagner Cunha (Federal University of Amazonas); Michael Bunsen (Mila, eButterfly); Léonard Pasi (EPFL); Anna Viklund (Daresay); Maxim Larrivee (Montreal Insectarium); David Rolnick (McGill University, Mila)

NeurIPS 2023 Understanding Insect Range Shifts with Out-of-Distribution Detection (Proposals Track)

Abstract: Climate change is inducing significant range shifts in insects and other organisms. Large-scale temporal data on populations and distributions are essential for quantifying the effects of climate change on biodiversity and ecosystem services, providing valuable insights for both conservation and pest management. With images from camera traps, we aim to use Mahalanobis distance-based confidence scores to automatically detect new moth species in a region. We intend to make out-of-distribution detection interpretable by identifying morphological characteristics of different species using Grad-CAM. We hope this algorithm will be a useful tool for entomologists to study range shifts and inform climate change adaptation.

Authors: Yuyan Chen (McGill University, Mila); David Rolnick (McGill University, Mila)
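The Mahalanobis distance-based confidence score mentioned in the abstract above can be illustrated with a minimal sketch. It assumes class-conditional Gaussians with a shared covariance fitted on feature vectors of known species; the function names, the synthetic 2-D features, and the ridge term are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Estimate per-class means and a shared (tied) precision matrix."""
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([features[labels == c] - means[c] for c in classes])
    cov = centered.T @ centered / len(features)
    # Small ridge term keeps the covariance invertible (illustrative choice).
    precision = np.linalg.inv(cov + 1e-6 * np.eye(features.shape[1]))
    return means, precision

def mahalanobis_confidence(x, means, precision):
    """Negative squared Mahalanobis distance to the closest class mean.

    Lower scores suggest the sample is far from every known class.
    """
    dists = [(x - mu) @ precision @ (x - mu) for mu in means.values()]
    return -min(dists)

# Synthetic stand-in for embeddings of two known species (hypothetical data).
rng = np.random.default_rng(0)
known = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(5, 1, (200, 2))])
labels = np.array([0] * 200 + [1] * 200)

means, precision = fit_class_gaussians(known, labels)
in_dist_score = mahalanobis_confidence(np.array([0.1, -0.2]), means, precision)
out_dist_score = mahalanobis_confidence(np.array([20.0, -20.0]), means, precision)
```

A sample whose score falls below a threshold chosen on held-out data would be flagged as a candidate new species in this sketch.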

NeurIPS 2023 Agile Modeling for Bioacoustic Monitoring (Tutorials Track)

Abstract: Bird, insect, and other wild animal populations are rapidly declining, highlighting the need for better monitoring, understanding, and protection of Earth’s remaining wild places. However, direct monitoring of biodiversity is difficult. Passive Acoustic Monitoring (PAM) enables detection of the vocalizing species in an ecosystem, many of which can be difficult or impossible to detect by satellite or camera trap. Large-scale PAM deployments using low-cost devices allow measuring changes over time and responses to environmental changes, and targeted deployments can discover and monitor endangered or invasive species. Machine learning methods are needed to analyze the thousands or even millions of hours of audio produced by large-scale deployments. But there are a massive number of potential signals to target for bioacoustic measurement, and many of the most interesting lack training data. Many rare species are difficult to observe. Detecting specific call-types and juvenile calls can give further insight into behavior and population health, but almost no structured datasets exist for these use-cases. No single classifier can address all of these needs, so practitioners regularly need to create new classifiers to address novel problems. Soundscape annotation efforts are very expensive, and machine learning experts are scarce, creating a bottleneck on analysis. We aim to eliminate the bottleneck by providing an efficient, self-contained active learning workflow for biologists. In this tutorial, we present an integrated workflow for analyzing large unlabeled bioacoustic datasets, adapting new agile modeling techniques to audio. Our goal is to allow experts to create a new high quality classifier for a novel class with under one hour of effort. We achieve this by leveraging transfer learning from high-quality bioacoustic models, vector search over audio databases, and lightweight Python notebook UX. 
The workflow can begin from a single example, proceeds through an efficient active learning loop, and finally applies the produced classifier to a large mass of unlabeled data to produce insights for ecologists and land managers.

Authors: Tom Denton (Google); Jenny Hamer (Google Research); Rob Laber (Google)

ICLR 2023 Exploring the potential of neural networks for Species Distribution Modeling (Papers Track)

Abstract: Species distribution models (SDMs) relate species occurrence data with environmental variables and are used to understand and predict species distributions across landscapes. While some machine learning models have been adopted by the SDM community, recent advances in neural networks may have untapped potential in this field. In this work, we compare the performance of multi-layer perceptron (MLP) neural networks to well-established SDM methods on a benchmark dataset spanning 225 species in six geographical regions. We also compare the performance of MLPs trained separately for each species to an equivalent model trained on a set of species and performing multi-label classification. Our results show that MLP models achieve comparable results to state-of-the-art SDM methods, such as MaxEnt. We also find that multi-species MLPs perform slightly better than single-species MLPs. This study indicates that neural networks, along with all their convenient and valuable characteristics, are worth considering for SDMs.

Authors: Robin Zbinden (EPFL); Nina van Tiel (EPFL); Benjamin Kellenberger (Yale University); Lloyd H Hughes (EPFL); Devis Tuia (EPFL)

ICLR 2023 Understanding forest resilience to drought with Shapley values (Proposals Track)

Abstract: Increases in drought frequency, intensity, and duration due to climate change are threatening forests around the world. Climate-driven tree mortality is associated with devastating ecological and societal consequences, including the loss of carbon sequestration, habitat provisioning, and water filtration services. A spatially fine-grained understanding of the site characteristics making forests more resilient to drought is still lacking. Furthermore, the complexity of drought effects on forests, which can be cumulative and delayed, demands investigation of the most appropriate drought indices. In this study, we aim to gain a better understanding of the temporal and spatial drivers of drought-induced changes in forest vitality using Shapley values, which allow for the relevance of predictors to be quantified locally. A better understanding of the contribution of meteorological and environmental factors to trees’ response to drought can support forest managers aiming to make forests more climate-resilient.

Authors: Stenka Vulova (Technische Universität Berlin); Alby Duarte Rocha (Technische Universität Berlin); Akpona Okujeni (Humboldt-Universität zu Berlin); Johannes Vogel (Freie Universität Berlin); Michael Förster (Technische Universität Berlin); Patrick Hostert (Humboldt-Universität zu Berlin); Birgit Kleinschmit (Technische Universität Berlin)

ICLR 2023 Bird Distribution Modelling using Remote Sensing and Citizen Science data (Papers Track) Overall Best Paper

Abstract: Climate change is a major driver of biodiversity loss, changing the geographic range and abundance of many species. However, there remain significant knowledge gaps about the distribution of species, due principally to the amount of effort and expertise required for traditional field monitoring. We propose an approach leveraging computer vision to improve species distribution modelling, combining the wide availability of remote sensing data with sparse on-ground citizen science data. We introduce a novel task and dataset for mapping US bird species to their habitats by predicting species encounter rates from satellite images, along with baseline models which demonstrate the power of our approach. Our methods open up possibilities for scalably modelling ecosystem properties worldwide.

Authors: Mélisande Teng (Mila, Université de Montréal); Amna Elmustafa (African Institute for Mathematical Science); Benjamin Akera (McGill University); Hugo Larochelle (UdeS); David Rolnick (McGill University, Mila)

NeurIPS 2022 Optimizing toward efficiency for SAR image ship detection (Papers Track)

Abstract: The detection and prevention of illegal fishing is critical to maintaining a healthy and functional ecosystem. Recent research on ship detection in satellite imagery has focused exclusively on performance improvements, disregarding detection efficiency. However, the speed and compute cost of vessel detection are essential for a timely intervention to prevent illegal fishing. Therefore, we investigated optimization methods that lower detection time and cost with minimal performance loss. We trained an object detection model based on a convolutional neural network (CNN) using a dataset of satellite images. Then, we designed two efficiency optimizations that can be applied to the base CNN or any other base model. The optimizations consist of a fast, cheap classification model and a statistical algorithm. The integration of the optimizations with the object detection model leads to a trade-off between speed and performance. We studied the trade-off using metrics that give different weight to execution time and performance. We show that by using a classification model the average precision of the detection model can be approximated to 99.5% in 44% of the time or to 92.7% in 25% of the time.

Authors: Arthur Van Meerbeeck (KU Leuven); Ruben Cartuyvels (KU Leuven); Jordy Van Landeghem (KU Leuven); Sien Moens (KU Leuven)

NeurIPS 2022 Estimating Chicago’s tree cover and canopy height using multi-spectral satellite imagery (Papers Track)

Abstract: Information on urban tree canopies is fundamental to mitigating climate change as well as improving quality of life. Urban tree planting initiatives face a lack of up-to-date data about the horizontal and vertical dimensions of the tree canopy in cities. We present a pipeline that utilizes LiDAR data as ground-truth and then trains a multi-task machine learning model to generate reliable estimates of tree cover and canopy height in urban areas using multi-source multi-spectral satellite imagery for the case study of Chicago.

Authors: John Francis (University College London)

NeurIPS 2022 Identifying Compound Climate Drivers of Forest Mortality with β-VAE (Papers Track)

Abstract: Climate change is expected to lead to higher rates of forest mortality. Forest mortality is a complex phenomenon driven by the interaction of multiple climatic variables at multiple temporal scales, further modulated by the current state of the forest (e.g. age, stem diameter, and leaf area index). Identifying the compound climate drivers of forest mortality would greatly improve understanding and projections of future forest mortality risk. Observation data are, however, limited in accuracy and sample size, particularly in regard to forest state variables and mortality events. In contrast, simulations with state-of-the-art forest models enable the exploration of novel machine learning techniques for associating forest mortality with driving climate conditions. Here we simulate 160,000 years of beech, pine and spruce forest dynamics with the forest model FORMIND. We then apply β-VAE to learn disentangled latent representations of weather conditions and identify those that are most likely to cause high forest mortality. The learned model successfully identifies three characteristic climate representations that can be interpreted as different compound drivers of forest mortality.

Authors: Mohit Anand (Helmholtz Centre for Environmental Research - UFZ); Lily-belle Sweet (Helmholtz Centre for Environmental Research - UFZ); Gustau Camps-Valls (Universitat de València); Jakob Zscheischler (Helmholtz Centre for Environmental Research - UFZ)

NeurIPS 2022 Learning to forecast vegetation greenness at fine resolution over Africa with ConvLSTMs (Papers Track)

Abstract: Forecasting the state of vegetation in response to climate and weather events is a major challenge. Its implementation will prove crucial in predicting crop yield, forest damage, or more generally the impact on ecosystem services relevant for socio-economic functioning, which if absent can lead to humanitarian disasters. Vegetation status depends on weather and environmental conditions that modulate complex ecological processes taking place at several timescales. Interactions between vegetation and different environmental drivers express responses at instantaneous but also time-lagged effects, often showing an emerging spatial context at landscape and regional scales. We formulate the land surface forecasting task as a strongly guided video prediction task where the objective is to forecast the vegetation developing at very fine resolution using topography and weather variables to guide the prediction. We use a Convolutional LSTM (ConvLSTM) architecture to address this task and predict changes in the vegetation state in Africa using Sentinel-2 satellite NDVI, having ERA5 weather reanalysis, SMAP satellite measurements, and topography (DEM of SRTMv4.1) as variables to guide the prediction. Our results highlight how ConvLSTM models can not only forecast the seasonal evolution of NDVI at high resolution, but also the differential impacts of weather anomalies over the baselines. The model is able to predict different vegetation types, even those with very high NDVI variability during target length.

Authors: Claire Robin (Biogeochemical Integration, Max-Planck-Institute for Biogeochemistry, Jena, Germany); Christian Requena-Mesa (Computer Vision Group, Friedrich Schiller University Jena; DLR Institute of Data Science, Jena; Max Planck Institute for Biogeochemistry, Jena); Vitus Benson (Max-Planck-Institute for Biogeochemistry); Jeran Poehls (Max-Planck-Institute for Biogeochemistry); Lazaro Alonzo (Max-Planck-Institute for Biogeochemistry); Nuno Carvalhais (Max-Planck-Institute for Biogeochemistry); Markus Reichstein (Max Planck Institute for Biogeochemistry, Jena; Michael Stifel Center Jena for Data-Driven and Simulation Science, Jena)

NeurIPS 2022 ForestBench: Equitable Benchmarks for Monitoring, Reporting, and Verification of Nature-Based Solutions with Machine Learning (Proposals Track)

Abstract: Restoring ecosystems and reducing deforestation are necessary tools to mitigate the anthropogenic climate crisis. Current measurements of forest carbon stock can be inaccurate, in particular for underrepresented and small-scale forests in the Global South, hindering transparency and accountability in the Monitoring, Reporting, and Verification (MRV) of these ecosystems. There is thus need for high quality datasets to properly validate ML-based solutions. To this end, we present ForestBench, which aims to collect and curate geographically-balanced gold-standard datasets of small-scale forest plots in the Global South, by collecting ground-level measurements and visual drone imagery of individual trees. These equitable validation datasets for ML-based MRV of nature-based solutions shall enable assessing the progress of ML models for estimating above-ground biomass, ground cover, and tree species diversity.

Authors: Lucas Czech (Carnegie Institution for Science); Björn Lütjens (MIT); David Dao (ETH Zurich)

NeurIPS 2022 Personalizing Sustainable Agriculture with Causal Machine Learning (Proposals Track) Best Paper: Proposals

Abstract: To fight climate change and accommodate the increasing population, global crop production has to be strengthened. To achieve the "sustainable intensification" of agriculture, transforming it from carbon emitter to carbon sink is a priority, and understanding the environmental impact of agricultural management practices is a fundamental prerequisite to that. At the same time, the global agricultural landscape is deeply heterogeneous, with differences in climate, soil, and land use inducing variations in how agricultural systems respond to farmer actions. The "personalization" of sustainable agriculture with the provision of locally adapted management advice is thus a necessary condition for the efficient uplift of green metrics, and an integral development in imminent policies. Here, we formulate personalized sustainable agriculture as a Conditional Average Treatment Effect estimation task and use Causal Machine Learning for tackling it. Leveraging climate data, land use information and employing Double Machine Learning, we estimate the heterogeneous effect of sustainable practices on the field-level Soil Organic Carbon content in Lithuania. We thus provide a data-driven perspective for targeting sustainable practices and effectively expanding the global carbon sink.

Authors: Georgios Giannarakis (National Observatory of Athens); Vasileios Sitokonstantinou (National Observatory of Athens); Roxanne Suzette Lorilla (National Observatory of Athens); Charalampos Kontoes (National Observatory of Athens)

AAAI FSS 2022 Towards Generating Large Synthetic Phytoplankton Datasets for Efficient Monitoring of Harmful Algal Blooms

Abstract: Climate change is increasing the frequency and severity of harmful algal blooms (HABs), which cause significant fish deaths in aquaculture farms. This contributes to ocean pollution and greenhouse gas (GHG) emissions since dead fish are either dumped into the ocean or taken to landfills, which in turn negatively impacts the climate. Currently, the standard method to enumerate harmful algae and other phytoplankton is to manually observe and count them under a microscope. This is a time-consuming, tedious and error-prone process, resulting in compromised management decisions by farmers. Hence, automating this process for quick and accurate HAB monitoring is extremely helpful. However, this requires large and diverse datasets of phytoplankton images, and such datasets are hard to produce quickly. In this work, we explore the feasibility of generating novel high-resolution photorealistic synthetic phytoplankton images, containing multiple species in the same image, given a small dataset of real images. To this end, we employ Generative Adversarial Networks (GANs) to generate synthetic images. We evaluate three different GAN architectures: ProjectedGAN, FastGAN, and StyleGANv2 using standard image quality metrics. We empirically show the generation of high-fidelity synthetic phytoplankton images using a training dataset of only 961 real images. Thus, this work demonstrates the ability of GANs to create large synthetic datasets of phytoplankton from small training datasets, accomplishing a key step towards sustainable systematic monitoring of harmful algal blooms.

Authors: Nitpreet Bamra (University of Waterloo); Vikram Voleti (Mila, University of Montreal); Alexander Wong (University of Waterloo); Jason Deglint (University of Waterloo)

AAAI FSS 2022 Discovering Transition Pathways Towards Coviability with Machine Learning

Abstract: This paper presents our ongoing French-Brazilian collaborative project which aims at: (1) establishing a diagnosis of socio-ecological coviability for several sites of interest in Nordeste, the North-East region of Brazil (in the states of Paraiba, Ceara, Pernambuco, and Rio Grande do Norte known for their biodiversity hotspots and vulnerabilities to climate change) using advanced data science techniques for multisource and multimodal data fusion and (2) finding transition pathways towards coviability equilibrium using machine learning techniques. Data collected in the field by scientists, ecologists, local actors combined with volunteered information, pictures from smart-phones, and data available on-line from satellite imagery, social media, surveys, etc. can be used to compute various coviability indicators of interest for the local actors. These indicators are useful to characterize and monitor the socio-ecological coviability status along various dimensions of anthropization, human welfare, ecological and biodiversity balance, and ecosystem intactness and vulnerabilities.

Authors: Laure Berti-Equille (IRD); Rafael Raimundo (UFPB)

NeurIPS 2021 Predicting Critical Biogeochemistry of the Southern Ocean for Climate Monitoring (Papers Track)

Abstract: The Biogeochemical-Argo (BGC-Argo) program is building a network of globally distributed, sensor-equipped robotic profiling floats, improving our understanding of the climate system and how it is changing. These floats, however, are limited in the number of variables measured. In this study, we train neural networks to predict silicate and phosphate values in the Southern Ocean from temperature, pressure, salinity, oxygen, nitrate, and location and apply these models to earth system model (ESM) and BGC-Argo data to expand the utility of this ocean observation network. We trained our neural networks on observations from the Global Ocean Ship-Based Hydrographic Investigations Program (GO-SHIP) and use dropout regularization to provide uncertainty bounds around our predicted values. Our neural network significantly improves upon linear regression but shows variable levels of uncertainty across the ranges of predicted variables. We explore the generalization of our estimators to test data outside our training distribution from both ESM and BGC-Argo data. Our use of out-of-distribution test data to examine shifts in biogeochemical parameters and calculate uncertainty bounds around estimates advance the state-of-the-art in oceanographic data and climate monitoring. We make our data and code publicly available.

Authors: Ellen Park (MIT); Jae Deok Kim (MIT-WHOI); Nadege Aoki (MIT); Yumeng Cao (MIT); Yamin Arefeen (Massachusetts Institute of Technology); Matthew Beveridge (Massachusetts Institute of Technology); David P Nicholson (Woods Hole Oceanographic Institution); Iddo Drori (MIT)
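The abstract above mentions using dropout to provide uncertainty bounds around predictions. One common way to do this is to keep dropout active at inference and summarize many stochastic forward passes. The sketch below assumes that approach; the tiny random-weight network, the six hypothetical input features, and all names are illustrative and are not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical untrained weights: 6 input features -> 32 hidden units -> 1 output.
W1 = rng.normal(size=(6, 32))
W2 = rng.normal(size=(32, 1))

def stochastic_forward(x, drop_rate=0.2):
    """One forward pass with dropout left on, so repeated calls differ."""
    hidden = np.maximum(x @ W1, 0.0)              # ReLU activation
    keep = rng.random(hidden.shape) >= drop_rate  # random dropout mask
    hidden = hidden * keep / (1.0 - drop_rate)    # inverted-dropout scaling
    return hidden @ W2

def predict_with_uncertainty(x, n_passes=100):
    """Mean and standard deviation over repeated stochastic passes."""
    samples = np.stack([stochastic_forward(x) for _ in range(n_passes)])
    return samples.mean(axis=0), samples.std(axis=0)

x = rng.normal(size=(4, 6))  # four hypothetical input samples
mean, std = predict_with_uncertainty(x)
```

In this sketch, an interval such as the mean plus or minus two standard deviations would serve as a rough uncertainty bound for each prediction.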

NeurIPS 2021 A data integration pipeline towards reliable monitoring of phytoplankton and early detection of harmful algal blooms (Papers Track)

Abstract: Climate change is making oceans warmer and more acidic. Under these conditions phytoplankton can produce harmful algal blooms which cause rapid oxygen depletion and consequent death of marine plants and animals. Some species are even capable of releasing toxic substances endangering water quality and human health. Monitoring of phytoplankton and early detection of harmful algal blooms is essential for protection of marine flora and fauna. Recent technological advances have enabled in-situ plankton image capture in real-time at low cost. However, available phytoplankton image databases have several limitations that prevent the practical usage of artificial intelligence models. We present a pipeline for integration of heterogeneous phytoplankton image datasets from around the world into a unified database that can ultimately serve as a benchmark dataset for phytoplankton research and therefore act as an important tool in building versatile machine learning models for climate adaptation planning. A machine learning model for early detection of harmful algal blooms is part of ongoing work.

Authors: Bruna Guterres (Universidade Federal do Rio Grande - FURG); Sara Khalid (University of Oxford); Marcelo Pias (Federal University of Rio Grande); Silvia Botelho (Federal University of Rio Grande)

NeurIPS 2021 Data Driven Study of Estuary Hypoxia (Papers Track)

Abstract: This paper presents a data-driven study of dissolved oxygen time series collected in Atlantic Canada. The main motivation of the presented work was to evaluate whether machine learning techniques could help to understand and anticipate hypoxic episodes in nutrient-impacted estuaries, a phenomenon that is exacerbated by the increasing temperatures expected to arise due to changes in climate. A major constraint was to limit ourselves to the use of dissolved oxygen time series only. Our preliminary findings show that recurrent neural networks, and in particular LSTM, may be suitable to predict short horizon levels while traditional results could benefit in longer range hypoxia prevention.

Authors: Md Monwer Hussain (University of New-Brunswick); Guillaume Durand (National Research Council Canada); Michael Coffin (Department of Fisheries and Oceans Canada); Julio J Valdés (National Research Council Canada); Luke Poirier (Department of Fisheries and Oceans Canada)

NeurIPS 2021 High-resolution rainfall-runoff modeling using graph neural network (Papers Track)

Abstract: Time-series modeling has shown great promise in recent studies using the latest deep learning algorithms such as LSTM (Long Short-Term Memory). These studies primarily focused on watershed-scale rainfall-runoff modeling or streamflow forecasting, but the majority of them only considered a single watershed as a unit. Although this simplification is very effective, it does not take into account spatial information, which could result in significant errors in large watersheds. Several studies investigated the use of GNN (Graph Neural Networks) for data integration by decomposing a large watershed into multiple sub-watersheds, but each sub-watershed is still treated as a whole, and the geoinformation contained within the watershed is not fully utilized. In this paper, we propose the GNRRM (Graph Neural Rainfall-Runoff Model), a novel deep learning model that makes full use of spatial information from high-resolution precipitation data, including flow direction and geographic information. When compared to baseline models, GNRRM has less over-fitting and significantly improves model performance. Our findings support the importance of hydrological data in deep learning-based rainfall-runoff modeling, and we encourage researchers to include more domain knowledge in their models.

Authors: Zhongrun Xiang (University of Iowa); Ibrahim Demir (The University of Iowa)

NeurIPS 2021 Resolving Super Fine-Resolution SIF via Coarsely-Supervised U-Net Regression (Papers Track)

Abstract: Climate change presents challenges to crop productivity, such as increasing the likelihood of heat stress and drought. Solar-Induced Chlorophyll Fluorescence (SIF) is a powerful way to monitor how crop productivity and photosynthesis are affected by changing climatic conditions. However, satellite SIF observations are only available at a coarse spatial resolution (e.g. 3-5 km) in most places, making it difficult to determine how individual crop types or farms are doing. This poses a challenging coarsely-supervised regression task; at training time, we only have access to SIF labels at a coarse resolution (3 km), yet we want to predict SIF at a very fine spatial resolution (30 meters), a 100x increase. We do have some fine-resolution input features (such as Landsat reflectance) that are correlated with SIF, but the nature of the correlation is unknown. To address this, we propose Coarsely-Supervised Regression U-Net (CSR-U-Net), a novel approach to train a U-Net for this coarse supervision setting. CSR-U-Net takes in a fine-resolution input image, and outputs a SIF prediction for each pixel; the average of the pixel predictions is trained to equal the true coarse-resolution SIF for the entire image. Even though this is a very weak form of supervision, CSR-U-Net can still learn to predict accurately, due to its inherent localization abilities, plus additional enhancements that facilitate the incorporation of scientific prior knowledge. CSR-U-Net can resolve fine-grained variations in SIF more accurately than existing averaging-based approaches, which ignore fine-resolution spatial variation during training. CSR-U-Net could also be useful for a wide range of "downscaling" problems in climate science, such as increasing the resolution of global climate models.

Authors: Joshua Fan (Cornell University); Di Chen (Cornell University); Jiaming Wen (Cornell University); Ying Sun (Cornell University); Carla P Gomes (Cornell University)
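The coarse supervision described in the abstract above, where the average of the per-pixel predictions is trained to match the tile-level label, can be illustrated with a small loss function. This is a minimal sketch: the squared-error form, the function name, and the toy values are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def coarse_supervised_loss(pixel_preds, coarse_label):
    """Squared error between the mean pixel prediction and the coarse label.

    pixel_preds: 2-D array of fine-resolution predictions for one tile.
    coarse_label: single coarse-resolution value observed for the whole tile.
    """
    tile_mean = pixel_preds.mean()
    return (tile_mean - coarse_label) ** 2

# Toy 2x2 tile of fine-resolution predictions (hypothetical values).
pixel_preds = np.array([[0.2, 0.4],
                        [0.6, 0.8]])

loss_match = coarse_supervised_loss(pixel_preds, 0.5)  # tile mean equals the label
loss_off = coarse_supervised_loss(pixel_preds, 0.9)    # tile mean is 0.4 below the label
```

Because only the tile mean is constrained, many different pixel maps give zero loss, which is why the abstract calls this a very weak form of supervision.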

NeurIPS 2021 A hybrid convolutional neural network/active contour approach to segmenting dead trees in aerial imagery (Papers Track)

Abstract: The stability and ability of an ecosystem to withstand climate change is directly linked to its biodiversity. Dead trees are a key indicator of overall forest health, housing one-third of forest ecosystem biodiversity, and constitute 8% of the global carbon stocks. They are decomposed by several natural factors, e.g. climate, insects and fungi. Accurate detection and modeling of dead wood mass is paramount to understanding forest ecology, the carbon cycle and decomposers. We present a novel method to construct precise shape contours of dead trees from aerial photographs by combining established convolutional neural networks with a novel active contour model in an energy minimization framework. Our approach yields superior performance accuracy over state-of-the-art in terms of precision, recall, and intersection over union of detected dead trees. This improved performance is essential to meet emerging challenges caused by climate change (and other man-made perturbations to the systems), particularly to monitor and estimate carbon stock decay rates, monitor forest health and biodiversity, and the overall effects of dead wood on and from climate change.

Authors: Jacquelyn Shelton (Hong Kong Polytechnic University); Przemyslaw Polewski (TomTom Location Technology Germany GmbH); Wei Yao (The Hong Kong Polytechnic University); Marco Heurich (Bavarian Forest National Park)

NeurIPS 2021 Reducing the Barriers of Acquiring Ground-truth from Biodiversity Rich Audio Datasets Using Intelligent Sampling Techniques (Papers Track)

Abstract: The potential of passive acoustic monitoring (PAM) as a method to reveal the consequences of climate change on the biodiversity that make up natural soundscapes can be undermined by the discrepancy between the low barrier of entry to acquire large field audio datasets and the higher barrier of acquiring reliable species level training, validation, and test subsets from the field audio. These subsets from a deployment are often required to verify any machine learning models used to assist researchers in understanding the local biodiversity. Especially as many models convey promising results from various sources that may not translate to the collected field audio. Labeling such datasets is a resource intensive process due to the lack of experts capable of identifying bioacoustics at a species level as well as the overwhelming size of many PAM audiosets. To address this challenge, we have tested different sampling techniques on an audio dataset collected over a two-week long August audio array deployment on the Scripps Coastal Reserve (SCR) Biodiversity Trail in La Jolla, California. These sampling techniques involve creating four subsets using stratified random sampling, limiting samples to the daily bird vocalization peaks, and using a hybrid convolutional neural network (CNN) and recurrent neural network (RNN) trained for bird presence/absence audio classification. We found that a stratified random sample baseline only achieved a bird presence rate of 44% in contrast with a sample that randomly selected clips with high hybrid CNN-RNN predictions that were collected during bird activity peaks at dawn and dusk yielding a bird presence rate of 95%. The significantly higher bird presence rate demonstrates how intelligent, machine learning-assisted selection of audio data can significantly reduce the amount of time that domain experts listen to audio without vocalizations of interest while building a ground truth for machine learning models.

Authors: Jacob G Ayers (UC San Diego); Sean Perry (UC San Diego); Vaibhav Tiwari (UC San Diego); Mugen Blue (Cal Poly San Luis Obispo); Nishant Balaji (UC San Diego); Curt Schurgers (UC San Diego); Ryan Kastner (University of California San Diego); Mathias Tobler (San Diego Zoo Wildlife Alliance); Ian Ingram (San Diego Zoo Wildlife Alliance)

NeurIPS 2021 Two-phase training mitigates class imbalance for camera trap image classification with CNNs (Papers Track)

Abstract: By leveraging deep learning to automatically classify camera trap images, ecologists can monitor biodiversity conservation efforts and the effects of climate change on ecosystems more efficiently. Due to the imbalanced class distribution of camera trap datasets, current models are biased towards the majority classes. As a result, they obtain good performance for a few majority classes but poor performance for many minority classes. We used two-phase training to increase the performance for these minority classes. We trained, next to a baseline model, four models that implemented different versions of two-phase training on a subset of the highly imbalanced Snapshot Serengeti dataset. Our results suggest that two-phase training can improve performance for many minority classes, with limited loss in performance for the other classes. We find that two-phase training based on majority undersampling increases class-specific F1-scores up to 3.0%. We also find that two-phase training outperforms using only oversampling or undersampling by 6.1% in F1-score on average. Finally, we find that a combination of over- and undersampling leads to a better performance than using them individually.

Authors: Farjad Malik (KU Leuven); Simon Wouters (KU Leuven); Ruben Cartuyvels (KU Leuven); Erfan Ghadery (KU Leuven); Sien Moens (KU Leuven)

NeurIPS 2021 Machine learning-enabled model-data integration for predicting subsurface water storage (Proposals Track)

Abstract: Subsurface water storage (SWS) is a key variable of the climate system and a storage component for precipitation and radiation anomalies, inducing persistence in the climate system. It plays a critical role in climate-change projections and can mitigate the impacts of climate change on ecosystems. However, because the underground is difficult to access, the hydrologic properties and dynamics of SWS are poorly known. Direct observations of SWS are limited, and accurate incorporation of SWS dynamics into Earth system land models remains challenging. We propose a machine learning-enabled model-data integration framework to improve SWS prediction at local to CONUS scales in a changing climate by leveraging all the available observation and simulation resources, as well as to inform model development and guide observation collection. Accurate prediction will enable optimal decisions on water management and land use and improve the ecosystem's resilience to climate change.

Authors: Dan Lu (Oak Ridge National Laboratory); Eric Pierce (Oak Ridge National Laboratory); Shih-Chieh Kao (Oak Ridge National Laboratory); David Womble (Oak Ridge National Laboratory); Li Li (Pennsylvania State University); Daniella Rempe (The University of Texas at Austin)

NeurIPS 2021 Toward Foundation Models for Earth Monitoring: Proposal for a Climate Change Benchmark (Proposals Track)

Abstract: Recent progress in self-supervision shows that pre-training large neural networks on vast amounts of unsupervised data can lead to impressive increases in generalisation for downstream tasks. Such models, recently coined as foundation models, have been transformational to the field of natural language processing. While similar models have also been trained on large corpuses of images, they are not well suited for remote sensing data. To stimulate the development of foundation models for Earth monitoring, we propose to develop a new benchmark comprised of a variety of downstream tasks related to climate change. We believe that this can lead to substantial improvements in many existing applications and facilitate the development of new applications. This proposal is also a call for collaboration with the aim of developing a better evaluation process to mitigate potential downsides of foundation models for Earth monitoring.

Authors: Alexandre Lacoste (ServiceNow); Evan D Sherwin (Stanford University, Energy and Resources Engineering); Hannah R Kerner (University of Maryland); Hamed Alemohammad (Radiant Earth Foundation); Björn Lütjens (MIT); Jeremy A Irvin (Stanford); David Dao (ETH Zurich); Alex Chang (ServiceNow); Mehmet Gunturkun (Element AI); Alexandre Drouin (ServiceNow); Pau Rodriguez (Element AI); David Vazquez (ServiceNow)

ICML 2021 Forest Terrain Identification using Semantic Segmentation on UAV Images (Papers Track)

Abstract: Beavers’ habitat is known to alter the terrain, providing biodiversity in the area, and their lifestyle has recently been linked to climatic changes through reduced greenhouse gas levels in the region. To analyse the impact of beavers’ habitat on the region, it is, therefore, necessary to estimate the terrain alterations caused by beaver actions. Furthermore, such terrain analysis can also play an important role in domains like wildlife ecology, deforestation, land-cover estimations, and geological mapping. Deep learning models are known to provide better estimates on automatic feature identification and classification of a terrain. However, such models require significant training data. Pre-existing terrain datasets (both real and synthetic) like CityScapes, PASCAL, UAVID, etc., are mostly concentrated on urban areas and include roads, pathways, buildings, etc. Such datasets, therefore, are unsuitable for forest terrain analysis. This paper contributes by providing a finely labelled novel dataset of forest imagery around beavers’ habitat, captured from a high-resolution camera on an aerial drone. The dataset consists of 100 such images labelled and classified based on 9 different classes. Furthermore, a baseline is established on this dataset using state-of-the-art semantic segmentation models based on performance metrics including Intersection Over Union (IoU), Overall Accuracy (OA), and F1 score.

Authors: Muhammad Umar (Anglia Ruskin University); Lakshmi Babu Saheer (Anglia Ruskin University); Javad Zarrin (Anglia Ruskin University)

ICML 2021 Challenges in Applying Audio Classification Models to Datasets Containing Crucial Biodiversity Information (Papers Track)

Abstract: The acoustic signature of a natural soundscape can reveal consequences of climate change on biodiversity. Hardware costs, human labor time, and expertise dedicated to labeling audio are impediments to conducting acoustic surveys across a representative portion of an ecosystem. These barriers are quickly eroding away with the advent of low-cost, easy-to-use, open source hardware and the expansion of the machine learning field providing pre-trained neural networks to test on retrieved acoustic data. One consistent challenge in passive acoustic monitoring (PAM) is that neural networks that show promising results on publicly available training and test sets lack reliability on audio recordings collected in the field that contain crucial biodiversity information. To demonstrate this challenge, we tested a hybrid recurrent neural network (RNN) and convolutional neural network (CNN) binary classifier trained for bird presence/absence on two Peruvian bird audiosets. The RNN achieved an area under the receiver operating characteristic curve (AUROC) of 95% on a dataset collected from Xeno-canto and Google’s AudioSet ontology in contrast to 65% across a stratified random sample of field recordings collected from the Madre de Dios region of the Peruvian Amazon. In an attempt to alleviate this discrepancy, we applied various audio data augmentation techniques in the network’s training process which led to an AUROC of 77% across the field recordings.

Authors: Jacob G Ayers (UC San Diego); Yaman Jandali (University of California, San Diego); Yoo-Jin Hwang (Harvey Mudd College); Erika Joun (University of California, San Diego); Gabriel Steinberg (Binghamton University); Mathias Tobler (San Diego Zoo Wildlife Alliance); Ian Ingram (San Diego Zoo Wildlife Alliance); Ryan Kastner (University of California San Diego); Curt Schurgers (University of California San Diego)

ICML 2021 Modeling Bird Migration by Disaggregating Population Level Observations (Papers Track)

Abstract: Birds are shifting migratory routes and timing in response to climate change, but modeling migration to better understand these changes is difficult. Some recent work leverages fluid dynamics models, but this requires individual flight speed and directional data which may not be readily available. We developed an alternate modeling method which only requires population-level positional data and used it to model migration routes of the American Woodcock (Scolopax minor). We use our model to sample simulated bird trajectories and compare them to real trajectories in order to evaluate the model.

Authors: Miguel Fuentes (University of Massachusetts, Amherst); Benjamin Van Doren (Cornell University); Daniel Sheldon (University of Massachusetts, Amherst)

ICML 2021 Forecasting Sea Ice Concentrations using Attention-based Ensemble LSTM (Papers Track)

Abstract: Accurately forecasting Arctic sea ice from sub-seasonal to seasonal scales has been a major scientific effort with fundamental challenges at play. In addition to physics-based earth system models, researchers have been applying multiple statistical and machine learning models for sea ice forecasting. Looking at the potential of data-driven sea ice forecasting, we propose an attention-based Long Short Term Memory (LSTM) ensemble method to predict monthly sea ice extent up to 1 month ahead. Using daily and monthly satellite retrieved sea ice data from NSIDC and atmospheric and oceanic variables from ERA5 reanalysis product for 39 years, we show that our multi-temporal ensemble method outperforms several baseline and recently proposed deep learning models. This will substantially improve our ability in predicting future Arctic sea ice changes, which is fundamental for forecasting transporting routes, resource development, coastal erosion, threats to Arctic coastal communities and wildlife.

Authors: Sahara Ali (University of Maryland, Baltimore County); Yiyi Huang (University of Maryland, Baltimore County); Xin Huang (University of Maryland, Baltimore County); Jianwu Wang (University of Maryland, Baltimore County)

ICML 2021 Preserving the integrity of the Canadian northern ecosystems through insights provided by reinforcement learning-based Arctic fox movement models (Proposals Track)

Abstract: Realistic modeling of the movement of the Arctic fox, one of the main predators of the circumpolar world, is crucial to understand the processes governing the distribution of the Canadian Arctic biodiversity. Current methods, however, are unable to adequately account for complex behaviors as well as intra- and interspecific relationships. We propose to harness the potential of reinforcement learning to develop innovative models that will address these shortcomings and provide the backbone to predict how vertebrate communities may be affected by environmental changes in the Arctic, an essential step towards the elaboration of rational conservation actions.

Authors: Catherine Villeneuve (Université Laval); Frédéric Dulude-De Broin (Université Laval); Pierre Legagneux (Université Laval); Dominique Berteaux (Université du Québec à Rimouski); Audrey Durand (Université Laval)

ICML 2021 On the Role of Spatial Clustering Algorithms in Building Species Distribution Models from Community Science Data (Proposals Track) Best Paper: Proposals

Abstract: This paper discusses opportunities for developments in spatial clustering methods to help leverage broad scale community science data for building species distribution models (SDMs). SDMs are tools that inform the science and policy needed to mitigate the impacts of climate change on biodiversity. Community science data span spatial and temporal scales unachievable by expert surveys alone, but they lack the structure imposed in smaller scale studies to allow adjustments for observational biases. Spatial clustering approaches can construct the necessary structure after surveys have occurred, but more work is needed to ensure that they are effective for this purpose. In this proposal, we describe the role of spatial clustering for realizing the potential of large biodiversity datasets, how existing methods approach this problem, and ideas for future work.

Authors: Mark Roth (Oregon State University); Tyler Hallman (Swiss Ornithological Institute); W. Douglas Robinson (Oregon State University); Rebecca Hutchinson (Oregon State University)

ICML 2021 Leveraging Domain Adaptation for Low-Resource Geospatial Machine Learning (Proposals Track)

Abstract: Machine learning in remote sensing has matured alongside a proliferation in availability and resolution of geospatial imagery, but its utility is bottlenecked by the need for labeled data. What's more, many labeled geospatial datasets are specific to certain regions, instruments, or extreme weather events. We investigate the application of modern domain-adaptation to multiple proposed geospatial benchmarks, uncovering unique challenges and proposing solutions to them.

Authors: John M Lynch (NC State University); Sam Wookey (Masterful AI)

NeurIPS 2020 Spatio-Temporal Learning for Feature Extraction in Time-Series Images (Papers Track)

Abstract: Earth observation programs have provided highly useful information in global climate change research over the past few decades and greatly promoted its development, especially through providing biological, physical, and chemical parameters on a global scale. Programs such as Landsat, Sentinel, SPOT, and Pleiades can be used to acquire huge volumes of medium to high resolution images every day. In this work, we organize these data in time series and we exploit both the temporal and spatial information they provide to generate accurate and up-to-date land cover maps that can be used to monitor vulnerable areas threatened by the ongoing climatic and anthropogenic global changes. For this purpose, we combine a fully convolutional neural network with a convolutional long short-term memory. Implementation details of the proposed spatio-temporal neural network architecture are described. Examples are provided for the monitoring of roads and mangrove forests on the West African coast.

Authors: Gael Kamdem De Teyou (Huawei)

NeurIPS 2020 Mangrove Ecosystem Detection using Mixed-Resolution Imagery with a Hybrid-Convolutional Neural Network (Papers Track)

Abstract: Mangrove forests are rich in biodiversity and are a large contributor to carbon sequestration critical in the fight against climate change. However, they are currently under threat from anthropogenic activities, so monitoring their health, extent, and productivity is vital to our ability to protect these important ecosystems. Traditionally, lower resolution satellite imagery or high resolution unmanned aerial vehicle (UAV) imagery has been used independently to monitor mangrove extent, both offering helpful features to predict mangrove extent. To take advantage of both of these data sources, we propose the use of a hybrid neural network, which combines a Convolutional Neural Network (CNN) feature extractor with a Multilayer-Perceptron (MLP), to accurately detect mangrove areas using both medium resolution satellite and high resolution drone imagery. We present a comparison of our novel Hybrid CNN with algorithms previously applied to mangrove image classification on a data set we collected of dwarf mangroves from consumer UAVs in Baja California Sur, Mexico, and show a 95% intersection over union (IOU) score for mangrove image classification, outperforming all our baselines.

Authors: Dillon Hicks (Engineers for Exploration); Ryan Kastner (University of California San Diego); Curt Schurgers (University of California San Diego); Astrid Hsu (University of California San Diego); Octavio Aburto (University of California San Diego)

NeurIPS 2020 Movement Tracks for the Automatic Detection of Fish Behavior in Videos (Papers Track)

Abstract: Global warming is predicted to profoundly impact ocean ecosystems. Fish behavior is an important indicator of changes in such marine environments. Thus, the automatic identification of key fish behavior in videos represents a much needed tool for marine researchers, enabling them to study climate change-related phenomena. We offer a dataset of sablefish (Anoplopoma fimbria) startle behaviors in underwater videos, and investigate the use of deep learning (DL) methods for behavior detection on it. Our proposed detection system identifies fish instances using DL-based frameworks, determines trajectory tracks, derives novel behavior-specific features, and employs Long Short-Term Memory (LSTM) networks to identify startle behavior in sablefish. Its performance is studied by comparing it with a state-of-the-art DL-based video event detector.

Authors: Declan GD McIntosh (University of Victoria); Tunai Porto Marques (University of Victoria); Alexandra Branzan Albu (University of Victoria); Rodney Rountree (University of Victoria); Fabio De Leo Cabrera (Ocean Networks Canada)

NeurIPS 2020 Counting Cows: Tracking Illegal Cattle Ranching From High-Resolution Satellite Imagery (Papers Track)

Abstract: Cattle farming is responsible for 8.8% of greenhouse gas emissions worldwide. In addition to the methane emitted due to their digestive process, the growing need for grazing areas is an important driver of deforestation. While some regulations are in place for preserving the Amazon against deforestation, these are being flouted in various ways. Hence the need to scale and automate the monitoring of cattle ranching activities. Through a partnership with an organization (anonymous under review), we explore the feasibility of tracking and counting cattle at the continental scale from satellite imagery. With a license from Maxar Technologies, we obtained satellite imagery of the Amazon at 40cm resolution, and compiled a dataset of 903 images containing a total of 28498 cattle. Our experiments show promising results and highlight important directions for the next steps on both counting algorithms and the data collection processes for solving such challenges.

Authors: Issam Hadj Laradji (Element AI); Pau Rodriguez (Element AI); Alfredo Kalaitzis (University of Oxford); David Vazquez (Element AI); Ross Young (Element AI); Ed Davey (Global Witness); Alexandre Lacoste (Element AI)

NeurIPS 2020 EarthNet2021: A novel large-scale dataset and challenge for forecasting localized climate impacts (Papers Track)

Abstract: Climate change is global, yet its concrete impacts can strongly vary between different locations in the same region. Seasonal weather forecasts currently operate at the mesoscale (> 1 km). For more targeted mitigation and adaptation, modelling impacts to < 100 m is needed. Yet, the relationship between driving variables and Earth’s surface at such local scales remains unresolved by current physical models. Large Earth observation datasets now enable us to create machine learning models capable of translating coarse weather information into high-resolution Earth surface forecasts encompassing localized climate impacts. Here, we define high-resolution Earth surface forecasting as video prediction of satellite imagery conditional on mesoscale weather forecasts. Video prediction has been tackled with deep learning models. Developing such models requires analysis-ready datasets. We introduce EarthNet2021, a new, curated dataset containing target spatio-temporal Sentinel 2 satellite imagery at 20 m resolution, matched with high-resolution topography and mesoscale (1.28 km) weather variables. With over 32000 samples it is suitable for training deep neural networks. Comparing multiple Earth surface forecasts is not trivial. Hence, we define the EarthNetScore, a novel ranking criterion for models forecasting Earth surface reflectance. For model intercomparison we frame EarthNet2021 as a challenge with four tracks based on different test sets. These allow evaluation of model validity and robustness as well as model applicability to extreme events and the complete annual vegetation cycle. In addition to forecasting directly observable weather impacts through satellite-derived vegetation indices, capable Earth surface models will enable downstream applications such as crop yield prediction, forest health assessments, coastline management, or biodiversity monitoring. Find data, code, and how to participate at www.earthnet.tech .

Authors: Christian Requena-Mesa (Computer Vision Group, Friedrich Schiller University Jena; DLR Institute of Data Science, Jena; Max Planck Institute for Biogeochemistry, Jena); Vitus Benson (Max Planck Institute for Biogeochemistry); Jakob Runge (Institute of Data Science, German Aerospace Center (DLR)); Joachim Denzler (Computer Vision Group, Friedrich Schiller University Jena, Germany); Markus Reichstein (Max Planck Institute for Biogeochemistry, Jena; Michael Stifel Center Jena for Data-Driven and Simulation Science, Jena)

NeurIPS 2020 VConstruct: Filling Gaps in Chl-a Data Using a Variational Autoencoder (Papers Track)

Abstract: Remote sensing of Chlorophyll-a is vital in monitoring climate change. Chlorophyll-a measurements give us an idea of the algae concentrations in the ocean, which lets us monitor ocean health. However, a common problem is that the satellites used to gather the data are often obstructed by clouds and other artifacts. This means that time series data from satellites can suffer from spatial data loss. There are a number of algorithms that are able to reconstruct the missing parts of these images to varying degrees of accuracy, with Data INterpolating Empirical Orthogonal Functions (DINEOF) being the current standard. However, DINEOF is slow, suffers from accuracy loss in temporally homogenous waters, is reliant on temporal data, and is only able to generate a single potential reconstruction. We propose a machine learning approach to reconstruction of Chlorophyll-a data using a Variational Autoencoder (VAE). Our accuracy results to date are competitive with but slightly less accurate than DINEOF. We show the benefits of our method including vastly decreased computation time and ability to generate multiple potential reconstructions. Lastly, we outline our planned improvements and future work.

Authors: Matthew Ehrler (University of Victoria); Neil Ernst (University of Victoria)

NeurIPS 2020 A Comparison of Data-Driven Models for Predicting Stream Water Temperature (Papers Track)

Abstract: Changes to the Earth's climate are expected to negatively impact water resources in the future. It is important to have accurate modelling of river flow and water quality to make optimal decisions for water management. Machine learning and deep learning models have become promising methods for making such hydrological predictions. Using these models, however, requires careful consideration both of data constraints and of model complexity for a given problem. Here, we use machine learning (ML) models to predict monthly stream water temperature records at three monitoring locations in the Northwestern United States with long-term datasets, using meteorological data as predictors. We fit three ML models: a Multiple Linear Regression, a Random Forest Regression, and a Support Vector Regression, and compare them against two baseline models: a persistence model and historical model. We show that all three ML models are reasonably able to predict mean monthly stream temperatures with root mean-squared errors (RMSE) ranging from 0.63-0.91 degrees Celsius. Of the three ML models, Support Vector Regression performs the best with an error of 0.63-0.75 degrees Celsius. However, all models perform poorly on extreme values of water temperature. We identify the need for machine learning approaches for predicting extreme values for variables such as water temperature, since it has significant implications for stream ecosystems and biota.

Authors: Helen Weierbach (Lawrence Berkeley); Aranildo Lima (Aquatic Informatics); Danielle Christianson (Lawrence Berkeley National Lab); Boris Faybishenko (Lawrence Berkeley National Lab); Val Hendrix (Lawrence Berkeley National Lab); Charuleka Varadharajan (Lawrence Berkeley National Lab)

NeurIPS 2020 Automated Salmonid Counting in Sonar Data (Papers Track)

Abstract: The prosperity of salmonids is crucial for several ecological and economic functions. Accurately counting spawning salmonids during their seasonal migration is essential in monitoring threatened populations, assessing the efficacy of recovery strategies, guiding fishing season regulations, and supporting the management of commercial and recreational fisheries. While several different methods exist for counting river fish, they all rely heavily on human involvement, introducing a hefty financial and time burden. In this paper we present an automated fish counting method that utilizes data captured from ARIS sonar cameras to detect and track salmonids migrating in rivers. Our results show that our fully automated system has a 19.3% per-clip error when compared to human counting performance. There is room to improve, but our system can already decrease the amount of time field biologists and fishery managers need to spend manually watching ARIS clips.

Authors: Peter Kulits (Caltech); Angelina Pan (Caltech); Sara M Beery (Caltech); Erik Young (Trout Unlimited); Pietro Perona (California Institute of Technology); Grant Van Horn (Cornell University)

NeurIPS 2020 Machine Learning for Glacier Monitoring in the Hindu Kush Himalaya (Papers Track)

Abstract: Glacier mapping is key to ecological monitoring in the Hindu Kush Himalaya region. Climate change poses a risk to individuals whose livelihoods depend on the health of glacier ecosystems. In this work, we present a machine learning based approach to support ecological monitoring, with a focus on glaciers. Our approach is based on semi-automated mapping from satellite images. We utilize readily available remote sensing data to create a model to identify and outline both clean ice and debris-covered glaciers from satellite imagery. We also release data and develop a web tool that allows experts to visualize and correct model predictions, with the ultimate aim of accelerating the glacier mapping process.

Authors: Shimaa Baraka (Mila); Benjamin Akera (Makerere University); Bibek Aryal (The University of Texas at El Paso); Tenzing Sherpa (International Centre for Integrated Mountain Development); Finu Shrestha (International Centre for Integrated Mountain Development); Anthony Ortiz (Microsoft); Kris Sankaran (University of Wisconsin-Madison); Juan M Lavista Ferres (Microsoft); Mir A Matin (International Centre for Integrated Mountain Development); Yoshua Bengio (Mila)

NeurIPS 2020 Investigating two super-resolution methods for downscaling precipitation: ESRGAN and CAR (Papers Track)

Abstract: In an effort to provide optimal inputs to downstream modeling systems (e.g., a hydrodynamics model that simulates the water circulation of a lake), we hereby strive to enhance resolution of precipitation fields from a weather model by up to 9x. We test two super-resolution models: the enhanced super-resolution generative adversarial networks (ESRGAN) proposed in 2017, and the content adaptive resampler (CAR) proposed in 2020. Both models outperform simple bicubic interpolation, with the ESRGAN exceeding expectations for accuracy. We make several proposals for extending the work to ensure it can be a useful tool for quantifying the impact of climate change on local ecosystems while removing reliance on energy-intensive, high-resolution weather model simulations.

Authors: Campbell Watson (IBM); Chulin Wang (Northwestern University); Tim Lynar (University of New South Wales); Komminist Weldemariam (IBM Research)

NeurIPS 2020 Spatiotemporal Features Improve Fine-Grained Butterfly Image Classification (Papers Track)

Abstract: Understanding the changing distributions of butterflies gives insight into the impacts of climate change across ecosystems and is a prerequisite for conservation efforts. eButterfly is a citizen science website created to allow people to track the butterfly species around them and use these observations to contribute to research. However, correctly identifying butterfly species is a challenging task for non-specialists and currently requires the involvement of entomologists to verify the labels of novice users on the website. We have developed a computer vision model to label butterfly images from eButterfly automatically, decreasing the need for human experts. We employ a model that incorporates geographic and temporal information of where and when the image was taken, in addition to the image itself. We show that we can successfully apply this spatiotemporal model for fine-grained image recognition, significantly improving the accuracy of our classification model compared to a baseline image recognition system trained on the same dataset.

Authors: Marta Skreta (University of Toronto); Sasha Luccioni (Mila); David Rolnick (McGill University, Mila)

NeurIPS 2020 Towards Data-Driven Physics-Informed Global Precipitation Forecasting from Satellite Imagery (Papers Track)

Abstract: Under the effects of global warming, extreme events such as floods and droughts are increasing in frequency and intensity. This trend directly affects communities and makes it all the more urgent to widen access to accurate precipitation forecasting systems for disaster preparedness. Nowadays, weather forecasting relies on numerical models necessitating massive computing resources that most developing countries cannot afford. Machine learning approaches are still in their infancy but already show promise for democratizing weather predictions, by leveraging any data source and requiring less compute. In this work, we propose a methodology for data-driven and physics-aware global precipitation forecasting from satellite imagery. To fully take advantage of the available data, we design the system as three elements: 1. The atmospheric state is estimated from recent satellite data. 2. The atmospheric state is propagated forward in time. 3. The atmospheric state is used to derive the precipitation intensity within a nearby time interval. In particular, our use of stochastic methods for forecasting the atmospheric state represents a novel application in this domain.

Authors: Valentina Zantedeschi (GE Global Research); Daniele De Martini (University of Oxford); Catherine Tong (University of Oxford); Christian A Schroeder de Witt (University of Oxford); Piotr Bilinski (University of Warsaw / University of Oxford); Alfredo Kalaitzis (University of Oxford); Matthew Chantry (University of Oxford); Duncan Watson-Parris (University of Oxford)

NeurIPS 2020 Hyperspectral Remote Sensing of Aquatic Microbes to Support Water Resource Management (Proposals Track)

Abstract: Harmful algal blooms in drinking water supply and at recreational sites endanger human health. Excessive algal growth can result in low oxygen environments, making them uninhabitable for fish and other aquatic life. Harmful algae and algal blooms are predicted to increase in frequency and extent due to the warming climate, but microbial dynamics remain difficult to predict. Existing satellite remote sensing monitoring technologies are ill-equipped to discriminate harmful algae, while models do not adequately capture the complex controls on algal populations. This proposal explores the potential for Bayesian neural networks to detect phytoplankton pigments from hyperspectral remote sensing reflectance retrievals. Once developed, such a model could enable hyperspectral remote sensing retrievals to support decision making in water resource management as more advanced ocean color satellites are launched in the coming decade. While uncertainty quantification motivates the proposed use of Bayesian models, the interpretation of these uncertainties in an operational context must be carefully considered.

Authors: Grace E Kim (Booz Allen Hamilton); Evan Poworoznek (NASA GSFC); Susanne Craig (NASA GSFC)

NeurIPS 2020 Artificial Intelligence, Machine Learning and Modeling for Understanding the Oceans and Climate Change (Proposals Track)

Abstract: These changes will have a drastic impact on almost all forms of life in the ocean, with further consequences for food security and ecosystem services in coastal and inland communities. Despite these impacts, scientific data and infrastructures are still lacking to understand and quantify the consequences of these perturbations on the marine ecosystem. Understanding this phenomenon is not only an urgent but also a scientifically demanding task. Consequently, it is a problem that must be addressed with a scientific cohort approach, where multi-disciplinary teams collaborate to bring the best of different scientific areas. In this proposal paper, we describe our newly launched four-year project focused on developing new artificial intelligence, machine learning, and mathematical modeling tools to contribute to the understanding of the structure, functioning, and underlying mechanisms and dynamics of the global ocean symbiome and its relation with climate change. These actions should enable us to understand our oceans and to predict and mitigate the consequences of climate change.

Authors: Nayat Sánchez Pi (Inria); Luis Martí (Inria); André Abreu (Fondation Tara Océans); Olivier Bernard (Inria); Colomban de Vargas (CNRS); Damien Eveillard (Univ. Nantes); Alejandro Maass (CMM, U. Chile); Pablo Marquet (PUC); Jacques Sainte-Marie (Inria); Julien Salomin (Inria); Marc Schoenauer (INRIA); Michele Sebag (LRI, CNRS, France)

ICLR 2020 SolarNet: A Deep Learning Framework to Map Solar Plants In China From Satellite Imagery (Papers Track)

Abstract: Renewable energy such as solar power is critical to fighting the ever more serious climate crisis, and how to effectively detect renewable energy installations has become an important issue for governments. In this paper, we propose a deep learning framework named SolarNet, which is designed to perform semantic segmentation on large-scale satellite imagery data to detect solar farms. SolarNet has successfully mapped 439 solar farms in China, covering nearly 2000 square kilometers, equivalent to the size of the whole city of Shenzhen or two and a half times that of New York City. To the best of our knowledge, this is the first time deep learning has been used to reveal the locations and sizes of solar farms in China, which could provide insights for solar power companies, climate finance, and markets.

Authors: Xin Hou (WeBank); Biao Wang (WeBank); Wanqi Hu (WeBank); Lei Yin (WeBank); Anbu Huang (WeBank); Haishan Wu (WeBank)

ICLR 2020 A Continual Learning Approach for Local Level Environmental Monitoring in Low-Resource Settings (Papers Track)

Abstract: An increasingly important dimension in the quest for mitigation and monitoring of environmental change is the role of citizens. Crowd-based monitoring of local-level anthropogenic alterations is essential for achieving measurable changes in the factors contributing to climate change. With the proliferation of mobile technologies here on the African continent, it is useful to have machine learning-based models that are deployed on mobile devices and that can learn continually from streams of data over extended time, possibly pertaining to different tasks of interest. In this paper, we demonstrate the localisation of deforestation indicators using lightweight models and extend them to incorporate data about wildfires and smoke detection. The idea is to show the need for and potential of continual learning approaches in building robust models to track local environmental alterations.

Authors: Arijit Patra (University of Oxford)

ICLR 2020 SMArtCast: Predicting soil moisture interpolations into the future using Earth observation data in a deep learning framework (Papers Track)

Abstract: Soil moisture is a critical component of crop health, and monitoring it can enable further actions for increasing yield or preventing catastrophic die-off. As climate change increases the likelihood of extreme weather events and reduces the predictability of weather, non-optimal soil moistures for crops may become more likely. In this work, we use a series of LSTM architectures to analyze measurements of soil moisture and vegetation indices derived from satellite imagery. The system learns to predict the future values of these measurements. These spatially sparse values and indices are used as input features to an interpolation method that infers spatially dense moisture maps at multiple depths for a future time point. This has the potential to provide advance warning for soil moistures that may be inhospitable to crops across an area with limited monitoring capacity.

Authors: Conrad J Foley (Deep Planet); Sagar Vaze (deepplanet.ai); Mohamed El Amine Seddiq (Deep Planet); Aleksei Unagaev (Deep Planet); Natalia Efremova (University of Oxford)

ICLR 2020 TrueBranch: Metric Learning-based Verification of Forest Conservation Projects (Proposals Track) Best Proposal Award

Abstract: International stakeholders increasingly invest in offsetting carbon emissions, for example, via issuing Payments for Ecosystem Services (PES) to forest conservation projects. Issuing trusted payments requires a transparent monitoring, reporting, and verification (MRV) process of the ecosystem services (e.g., carbon stored in forests). The current MRV process, however, is either too expensive (on-ground inspection of forest) or inaccurate (satellite). Recent works propose low-cost and accurate MRV via automatically determining forest carbon from drone imagery, collected by the landowners. The automation of MRV, however, opens up the possibility that landowners report untruthful drone imagery. To be robust against untruthful reporting, we propose TrueBranch, a metric learning-based algorithm that verifies the truthfulness of drone imagery from forest conservation projects. TrueBranch aims to detect untruthfully reported drone imagery by matching it with public satellite imagery. Preliminary results suggest that nominal distance metrics are not sufficient to reliably detect untruthfully reported imagery. TrueBranch leverages a method from metric learning to create a feature embedding in which truthfully and untruthfully collected imagery is easily distinguishable by distance thresholding.

Authors: Simona Santamaria (ETH Zurich); David Dao (ETH Zurich); Björn Lütjens (MIT); Ce Zhang (ETH)

ICLR 2020 Using ML to close the vocabulary gap in the context of environment and climate change in Chichewa (Proposals Track)

Abstract: In the West, alienation from nature and deteriorating opportunities to experience it have led educators to incorporate educational programs in schools to bring pupils into contact with nature and to enhance their understanding of issues related to the environment and its protection. In Africa, and in Malawi, where most people engage in agriculture and spend most of their time in the 'outdoors', alienation from nature is happening too, although in different ways. A large portion of the indigenous vocabulary and knowledge remains unknown or is slowly disappearing, and there is a need to build a glossary of terms regarding environment and climate change in the vernacular to improve the dialog regarding climate change and environmental protection. We believe that ML has a role to play in closing the ‘vocabulary gap’ of terms and concepts regarding the environment and climate change that exists in Chichewa and other Malawian languages by helping to create a visual dictionary of key terms used to describe the environment and explain the issues involved in climate change and their meaning. Chichewa is a descriptive language; one English term may be translated using several words. Thus, the task is not to detect just literal translations, but also translations by means of ‘descriptions’ and illustrations, and thus to extract correspondences between terms and definitions and to measure how appropriate a term is to convey the meaning intended. As part of this project, ML can be used to identify ‘loanword patterns’, which may be useful in understanding the transmission of cultural items.

Authors: Amelia Taylor (University of Malawi, The Polytechnic)

NeurIPS 2019 Natural Language Generation for Operations and Maintenance in Wind Turbines (Papers Track)

Abstract: Wind energy is one of the fastest-growing sustainable energy sources in the world but relies crucially on efficient and effective operations and maintenance to generate sufficient amounts of energy and reduce downtime of wind turbines and associated costs. Machine learning has been applied to predict faults in wind turbines, but these predictions have not been supported by suggestions on how to avert and fix occurring errors. In this paper, we present a data-to-text generation system utilising transformers to produce event descriptions of turbine faults from SCADA data capturing the operational status of turbines, and proposing maintenance strategies. Experiments show that our model learns reasonable feature representations that correspond to expert judgements. We anticipate that in making a contribution to the reliability of wind energy, we can encourage more organisations to switch to sustainable energy sources and help combat climate change.

Authors: Joyjit Chatterjee (University of Hull); Nina Dethlefs (University of Hull)

NeurIPS 2019 Human-Machine Collaboration for Fast Land Cover Mapping (Papers Track)

Abstract: We propose incorporating human labelers in a model fine-tuning system that provides immediate user feedback. In our framework, human labelers can interactively query model predictions on unlabeled data, choose which data to label, and see the resulting effect on the model's predictions. This bi-directional feedback loop allows humans to learn how the model responds to new data. Our hypothesis is that this rich feedback allows human labelers to create mental models that enable them to better choose which biases to introduce to the model. We implement this framework for fine-tuning high-resolution land cover segmentation models and evaluate it against traditional active learning based approaches. More specifically, we fine-tune a deep neural network -- trained to segment high-resolution aerial imagery into different land cover classes in Maryland, USA -- to a new spatial area in New York, USA. We find that the tight loop turns the algorithm and the human operator into a hybrid system that can produce land cover maps of large areas more efficiently than the traditional workflows.

Authors: Caleb Robinson (Georgia Institute of Technology); Anthony Ortiz (University of Texas at El Paso); Nikolay Malkin (Yale University); Blake Elias (Microsoft); Andi Peng (Microsoft); Dan Morris (Microsoft); Bistra Dilkina (University of Southern California); Nebojsa Jojic (Microsoft Research)

NeurIPS 2019 VideoGasNet: Deep Learning for Natural Gas Methane Leak Classification Using An Infrared Camera (Papers Track)

Abstract: Mitigating methane leakage from the natural gas system has become an increasing concern for climate change. Efficacious methane leak detection and classification can make the mitigation process more efficient and cost-effective. Optical gas imaging is widely used for the purpose of leak detection, but it cannot directly provide detection results and leak sizes. Few studies have examined the possibility of leak classification using videos taken by an infrared (IR) camera, an optical gas imaging device. In this study, we treat the leak classification problem as a video classification problem and investigate the application of deep learning techniques to methane leak detection. First, we collected the first methane leak video dataset, GasVid, which has ~1 M frames of labeled videos of methane leaks from different leaking equipment, covering a wide range of leak sizes (5.3-2051.6 g CH4/h) and imaging distances (4.6-15.6 m). Second, we studied three deep learning algorithms: a 2D Convolutional Neural Network (CNN), a 3D CNN, and a Convolutional Long Short-Term Memory (ConvLSTM) network. We find that the 3D CNN is the most outstanding and robust architecture, which we named VideoGasNet. The leak/non-leak detection accuracy can reach 100%, and the highest small-medium-large classification accuracy is 78.2% with our 3D CNN. In summary, VideoGasNet greatly extends the capabilities of IR camera-based leak monitoring systems from leak detection only to automated leak classification with high accuracy, fast processing speed, and significant mitigation efficiency.

Authors: Jingfan Wang (Stanford University)

NeurIPS 2019 A Deep Learning-based Framework for the Detection of Schools of Herring in Echograms (Papers Track)

Abstract: Tracking the abundance of underwater species is crucial for understanding the effects of climate change on marine ecosystems. Biologists typically monitor underwater sites with echosounders and visualize data as 2D images (echograms); they interpret these data manually or semi-automatically, which is time-consuming and prone to inconsistencies. This paper proposes a deep learning framework for the automatic detection of schools of herring from echograms. Experiments demonstrated that our approach outperforms a traditional machine learning algorithm that uses hand-crafted features. Our framework could easily be expanded to detect more species of interest to sustainable fisheries.

Authors: Alireza Rezvanifar (University of Victoria); Tunai Porto Marques (University of Victoria); Melissa Cote (University of Victoria); Alexandra Branzan Albu (University of Victoria); Alex Slonimer (ASL Environmental Sciences); Thomas Tolhurst (ASL Environmental Sciences); Kaan Ersahin (ASL Environmental Sciences); Todd Mudge (ASL Environmental Sciences); Stephane Gauthier (Fisheries and Oceans Canada)

NeurIPS 2019 Emulating Numeric Hydroclimate Models with Physics-Informed cGANs (Papers Track) Honorable Mention

Abstract: Process-based numerical simulations, including those for climate modeling applications, are compute and resource intensive, requiring extensive customization and hand-engineering for encoding governing equations and other domain knowledge. On the other hand, modern deep learning employs a significantly simpler and more efficient computational workflow, and has shown impressive results across a myriad of applications in the computational sciences. In this work, we investigate the potential of deep generative learning models, specifically conditional Generative Adversarial Networks (cGANs), to simulate the output of a physics-based model of the spatial distribution of the water content of mountain snowpack, the snow water equivalent (SWE). We show preliminary results indicating that the cGAN model is able to learn diverse mappings between meteorological forcings and SWE output. Thus, physics-based cGANs provide a means for fast and accurate SWE modeling that can have significant impact in a variety of applications (e.g., hydropower forecasting, agriculture, and water supply management). In climate science, snowpack and SWE are seen as some of the best indicative variables for investigating climate change and its impact. The massive speedups, diverse sampling, and sensitivity/saliency modelling that cGANs can bring to SWE estimation will be extremely important to investigating variables linked to climate change as well as predicting and forecasting the potential effects of climate change to come.

Authors: Ashray Manepalli (terrafuse); Adrian Albert (terrafuse, inc.); Alan Rhoades (Lawrence Berkeley National Lab); Daniel Feldman (Lawrence Berkeley National Lab)

NeurIPS 2019 FutureArctic - beyond Computational Ecology (Proposals Track)

Abstract: This paper presents the Future Arctic initiative, a multi-disciplinary training network where machine learning researchers and ecologists cooperatively study both long- and short-term responses to future climate in Iceland.

Authors: Steven Latre (UAntwerpen); Dimitri Papadimitriou (UAntwerpen); Ivan Janssens (UAntwerpen); Eric Struyf (UAntwerpen); Erik Verbruggen (UAntwerpen); Ivika Ostonen (UT); Josep Penuelas (UAB); Boris Rewald (RootEcology); Andreas Richter (University of Vienna); Michael Bahn (University of Innsbruck)

NeurIPS 2019 Autonomous Sensing and Scientific Machine Learning for Monitoring Greenhouse Gas Emissions (Proposals Track)

Abstract: Greenhouse gas emissions are a key driver of climate change. In order to develop and tune climate models, measurements of natural and anthropogenic phenomena are necessary. Traditional methods (i.e., physical sample collection and ex situ analysis) tend to be sample sparse and low resolution, whereas global remote sensing methods tend to miss small- and mid-scale dynamic phenomena. In situ instrumentation carried by a robotic platform is suited to study greenhouse gas emissions at unprecedented spatial and temporal resolution. However, collecting scientifically rich datasets of dynamic or transient emission events requires accurate and flexible models of gas emission dynamics. Motivated by applications in seasonal Arctic thawing and volcanic outgassing, we propose the use of scientific machine learning, in which traditional scientific models (in the form of ODEs/PDEs) are combined with machine learning techniques (generally neural networks) to better incorporate data into a structured, interpretable model. Our technical contributions will primarily involve developing these hybrid models and leveraging model uncertainty estimates during sensor planning to collect data that efficiently improves gas emission models in small-data domains.

Authors: Genevieve Flaspohler (MIT); Victoria Preston (MIT); Nicholas Roy (MIT); John Fisher (MIT); Adam Soule (Woods Hole Oceanographic Institution); Anna Michel (Woods Hole Oceanographic Institution)

ICML 2019 Policy Search with Non-uniform State Representations for Environmental Sampling (Research Track)

Abstract: Surveying fragile ecosystems like coral reefs is important to monitor the effects of climate change. We present an adaptive sampling technique that generates efficient trajectories covering hotspots in the region of interest at a high rate. A key feature of our sampling algorithm is the ability to generate action plans for any new hotspot distribution using the parameters learned on other similar looking distributions.

Authors: Sandeep Manjanna (McGill University); Herke van Hoof (University of Amsterdam); Gregory Dudek (McGill University)

ICML 2019 Mapping land use and land cover changes faster and at scale with deep learning on the cloud (Research Track)

Abstract: Policymakers rely on Land Use and Land Cover (LULC) maps for evaluation and planning. They use these maps to plan climate-smart agriculture policy, improve housing resilience (to earthquakes or other natural disasters), and understand how to grow commerce in small communities. A number of institutions have created global land use maps from historic satellite imagery. However, these maps can be outdated and are often inaccurate, particularly in their representation of developing countries. We worked with the European Space Agency (ESA) to develop a LULC deep learning workflow on the cloud that can ingest Sentinel-2 optical imagery for large-scale LULC change detection. It’s an end-to-end workflow that sits on top of two comprehensive tools, SentinelHub and eo-learn, which seamlessly link earth observation data with machine learning libraries. It can take in the labeled LULC and associated AOI in shapefiles and set up a task to fetch cloud-free, time series imagery stacks within the time interval defined by the user. It will pair each satellite imagery tile with its labeled LULC mask for supervised deep learning model training on the cloud. Once a well-performing model is trained, it can be exported as a TensorFlow/PyTorch serving Docker image to work with our cloud-based model inference pipeline. The inference pipeline can automatically scale with the number of images to be processed. Changes in land use are heavily influenced by human activities (e.g. agriculture, deforestation, human settlement expansion) and have been a great source of greenhouse gas emissions. Sustainable forest and land management practices vary from region to region, which means having flexible, scalable tools will be critical. With these tools, we can empower analysts, engineers, and decision-makers to see where contributions to climate-smart agricultural, forestry and urban resilience programs can be made.

Authors: Zhuangfang Yi (Development Seed); Drew Bollinger (Development Seed); Devis Peressutti (Sinergise)

ICML 2019 Deep Learning for Wildlife Conservation and Restoration Efforts (Deployed Track)

Abstract: Climate change and environmental degradation are causing species extinction worldwide. Automatic wildlife sensing is an urgent requirement to track biodiversity losses on Earth. Recent improvements in machine learning can accelerate the development of large-scale monitoring systems that would help track conservation outcomes and target efforts. In this paper, we present one such system we developed. 'Tidzam' is a Deep Learning framework for wildlife detection, identification, and geolocalization, designed for the Tidmarsh Wildlife Sanctuary, the site of the largest freshwater wetland restoration in Massachusetts.

Authors: Clement Duhart (MIT Media Lab)

ICML 2019 Reinforcement Learning for Sustainable Agriculture (Ideas Track)

Abstract: The growing population and the changing climate will push modern agriculture to its limits in an increasing number of regions on earth. Establishing next-generation sustainable food supply systems will mean producing more food on less arable land, while keeping the environmental impact to a minimum. Modern machine learning methods have achieved super-human performance on a variety of tasks, simply learning from the outcomes of their actions. We propose a path towards more sustainable agriculture, considering plant development an optimization problem with respect to certain parameters, such as yield and environmental impact, which can be optimized in an automated way. Specifically, we propose to use reinforcement learning to autonomously explore and learn ways of influencing the development of certain types of plants, controlling environmental parameters, such as irrigation or nutrient supply, and receiving sensory feedback, such as camera images, humidity, and moisture measurements. The trained system will thus be able to provide instructions for optimal treatment of a local population of plants, based on non-invasive measurements, such as imaging.

Authors: Jonathan Binas (Mila, Montreal); Leonie Luginbuehl (Department of Plant Sciences, University of Cambridge); Yoshua Bengio (Mila)

ICML 2019 Harness the Power of Artificial intelligence and -Omics to Identify Soil Microbial Functions in Climate Change Projection (Ideas Track)

Abstract: Contemporary Earth system models (ESMs) omit one of the significant drivers of the terrestrial carbon cycle, soil microbial communities. Soil microbial communities not only directly emit greenhouse gases into the atmosphere through the respiration process, but also release diverse enzymes to catalyze the decomposition of soil organic matter and determine nutrient availability for aboveground vegetation. Therefore, soil microbial communities control terrestrial carbon dynamics and their feedbacks to climate. Currently, inadequate representation of soil microbial communities in ESMs has introduced significant uncertainty in current terrestrial carbon-climate feedbacks. Mitigating this uncertainty requires identifying the functions, diversity, and environmental adaptation of soil microbial communities under global climate change. The revolution of -omics technology allows high-throughput quantification of diverse soil enzymes, enabling large-scale studies of microbial functions in climate change. Such studies may lead to revolutionary solutions to predicting microbial-mediated climate-carbon feedbacks at the global scale based on gene-level environmental adaptation strategies of the microbial community. A key initial step in this direction is to identify the biogeography and environmental adaptation of soil enzyme functions based on the massive amount of data generated by -omics technologies. Here we propose to take this step. Artificial intelligence is a powerful, ideal tool for this leap forward. Our project is to integrate artificial intelligence technologies and global -omics data to represent climate controls on microbial enzyme functions and to map the biogeography of soil enzyme functional groups at global scale. The outcome of this study will allow us to improve the representation of microbial function in earth system modeling and mitigate uncertainty in current climate projections.

Authors: Yang Song (Oak Ridge National Lab); Dali Wang (Oak Ridge National Lab); Melanie Mayes (Oak Ridge National Lab)