-
EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues
Authors:
Sagar Soni,
Akshay Dudhane,
Hiyam Debary,
Mustansar Fiaz,
Muhammad Akhtar Munir,
Muhammad Sohail Danish,
Paolo Fraccaro,
Campbell D Watson,
Levente J Klein,
Fahad Shahbaz Khan,
Salman Khan
Abstract:
Automated analysis of vast Earth observation data via interactive Vision-Language Models (VLMs) can unlock new opportunities for environmental monitoring, disaster response, and resource management. Existing generic VLMs do not perform well on remote sensing data, while the recent geospatial VLMs remain restricted to a fixed resolution and few sensor modalities. In this paper, we introduce EarthDial, a conversational assistant specifically designed for Earth Observation (EO) data, transforming complex, multi-sensory Earth observations into interactive, natural language dialogues. EarthDial supports multi-spectral, multi-temporal, and multi-resolution imagery, enabling a wide range of remote sensing tasks, including classification, detection, captioning, question answering, visual reasoning, and visual grounding. To achieve this, we introduce an extensive instruction tuning dataset comprising over 11.11M instruction pairs covering RGB, Synthetic Aperture Radar (SAR), and multispectral modalities such as Near-Infrared (NIR) and infrared. Furthermore, EarthDial handles bi-temporal and multi-temporal sequence analysis for applications like change detection. Our extensive experimental results on 44 downstream datasets demonstrate that EarthDial outperforms existing generic and domain-specific models, achieving better generalization across various EO tasks. Our source code and pre-trained models are available at https://github.com/hiyamdebary/EarthDial.
Submitted 7 April, 2025; v1 submitted 19 December, 2024;
originally announced December 2024.
-
Aboveground carbon biomass estimate with Physics-informed deep network
Authors:
Juan Nathaniel,
Levente J. Klein,
Campbell D. Watson,
Gabrielle Nyirjesy,
Conrad M. Albrecht
Abstract:
The global carbon cycle is a key process for understanding how our climate is changing. However, monitoring the dynamics is difficult because a high-resolution, robust measurement of key state parameters, including the aboveground carbon biomass (AGB), is required. Here, we use a deep neural network to generate a wall-to-wall map of AGB within the Continental USA (CONUS) with 30-meter spatial resolution for the year 2021. We combine radar and optical hyperspectral imagery with a physical climate parameter of SIF-based GPP. Validation results show that a masked variation of UNet has the lowest validation RMSE of 37.93 $\pm$ 1.36 Mg C/ha, as compared to 52.30 $\pm$ 0.03 Mg C/ha for a random forest algorithm. Furthermore, models that learn from SIF-based GPP in addition to radar and optical imagery reduce validation RMSE by almost 10% and the standard deviation by 40%. Finally, we apply our model to measure losses in AGB from the 2021 Caldor wildfire in California, and validate our analysis with a Sentinel-based burn index.
Submitted 24 October, 2022;
originally announced October 2022.
-
AutoGeoLabel: Automated Label Generation for Geospatial Machine Learning
Authors:
Conrad M Albrecht,
Fernando Marianno,
Levente J Klein
Abstract:
A key challenge of supervised learning is the availability of human-labeled data. We evaluate a big data processing pipeline to auto-generate labels for remote sensing data. It is based on rasterized statistical features extracted from surveys such as LiDAR measurements. Using simple combinations of the rasterized statistical layers, we demonstrate that multiple classes can be generated at accuracies of ~0.9. As proof of concept, we utilize the big geo-data platform IBM PAIRS to dynamically generate such labels in dense urban areas with multiple land cover classes. The general method proposed here is platform independent, and it can be adapted to generate labels for other satellite modalities in order to enable machine learning on overhead imagery for land use classification and object detection.
Submitted 31 January, 2022;
originally announced February 2022.
-
S3RP: Self-Supervised Super-Resolution and Prediction for Advection-Diffusion Process
Authors:
Chulin Wang,
Kyongmin Yeo,
Xiao Jin,
Andres Codas,
Levente J. Klein,
Bruce Elmegreen
Abstract:
We present a super-resolution model for an advection-diffusion process with limited information. While most super-resolution models assume high-resolution (HR) ground-truth data during training, in many cases such an HR dataset is not readily accessible. Here, we show that a Recurrent Convolutional Network trained with physics-based regularizations is able to reconstruct the HR information without having the HR ground-truth data. Moreover, considering the ill-posed nature of a super-resolution problem, we employ the Recurrent Wasserstein Autoencoder to model the uncertainty.
Submitted 8 November, 2021;
originally announced November 2021.
-
Quantification of Carbon Sequestration in Urban Forests
Authors:
Levente J. Klein,
Wang Zhou,
Conrad M. Albrecht
Abstract:
Vegetation, trees in particular, sequester carbon by absorbing carbon dioxide from the atmosphere. However, the lack of efficient methods for quantifying the carbon stored in trees makes it difficult to track the process. We present an approach to estimate the carbon storage in trees based on fusing multi-spectral aerial imagery and LiDAR data to identify tree coverage, geometric shape, and tree species -- key attributes for carbon storage quantification. We demonstrate that tree species information and their three-dimensional geometric shapes can be estimated from aerial imagery in order to determine the tree's biomass. Specifically, we estimate a total of $52,000$ tons of carbon sequestered in trees for New York City's borough of Manhattan.
Submitted 20 July, 2021; v1 submitted 31 May, 2021;
originally announced June 2021.
-
PAIRS AutoGeo: an Automated Machine Learning Framework for Massive Geospatial Data
Authors:
Wang Zhou,
Levente J. Klein,
Siyuan Lu
Abstract:
An automated machine learning framework for geospatial data named PAIRS AutoGeo is introduced on the IBM PAIRS Geoscope big data and analytics platform. The framework simplifies the development of industrial machine learning solutions leveraging geospatial data to the extent that the user inputs are minimized to merely a text file containing labeled GPS coordinates. PAIRS AutoGeo automatically gathers required data at the location coordinates, assembles the training data, performs quality checks, and trains multiple machine learning models for subsequent deployment. The framework is validated using a realistic industrial use case of tree species classification. Open-source tree species data are used as the input to train a random forest classifier and a modified ResNet model for 10-way tree species classification based on aerial imagery, which leads to accuracies of $59.8\%$ and $81.4\%$, respectively. This use case exemplifies how PAIRS AutoGeo enables users to leverage machine learning without extensive geospatial expertise.
Submitted 12 December, 2020;
originally announced December 2020.