-
Accelerating Earth Science Discovery via Multi-Agent LLM Systems
Authors:
Dmitrii Pantiukhin,
Boris Shapkin,
Ivan Kuznetsov,
Antonia Anna Jost,
Nikolay Koldunov
Abstract:
This Perspective explores the transformative potential of Multi-Agent Systems (MAS) powered by Large Language Models (LLMs) in the geosciences. Users of geoscientific data repositories face challenges due to the complexity and diversity of data formats, inconsistent metadata practices, and a considerable number of unprocessed datasets. MAS possesses transformative potential for improving scientists' interaction with geoscientific data by enabling intelligent data processing, natural language interfaces, and collaborative problem-solving capabilities. We illustrate this approach with "PANGAEA GPT", a specialized MAS pipeline integrated with the diverse PANGAEA database for Earth and Environmental Science, demonstrating how MAS-driven workflows can effectively manage complex datasets and accelerate scientific discovery. We discuss how MAS can address current data challenges in geosciences, highlight advancements in other scientific fields, and propose future directions for integrating MAS into geoscientific data processing pipelines. In this Perspective, we show how MAS can fundamentally improve data accessibility, promote cross-disciplinary collaboration, and accelerate geoscientific discoveries.
Submitted 7 March, 2025;
originally announced March 2025.
-
Robustness of AI-based weather forecasts in a changing climate
Authors:
Thomas Rackow,
Nikolay Koldunov,
Christian Lessig,
Irina Sandu,
Mihai Alexe,
Matthew Chantry,
Mariana Clare,
Jesper Dramsch,
Florian Pappenberger,
Xabier Pedruzo-Bagazgoitia,
Steffen Tietsche,
Thomas Jung
Abstract:
Data-driven machine learning models for weather forecasting have made transformational progress in the last 1-2 years, with state-of-the-art ones now outperforming the best physics-based models for a wide range of skill scores. Given the strong links between weather and climate modelling, this raises the question of whether machine learning models could also revolutionize climate science, for example by informing mitigation and adaptation to climate change or by generating larger ensembles for more robust uncertainty estimates. Here, we show that current state-of-the-art machine learning models trained for weather forecasting in present-day climate produce skillful forecasts across different climate states corresponding to pre-industrial, present-day, and future 2.9K warmer climates. This indicates that the dynamics shaping the weather on short timescales may not differ fundamentally in a changing climate. It also demonstrates out-of-distribution generalization capabilities of the machine learning models that are a critical prerequisite for climate applications. Nonetheless, two of the models show a global-mean cold bias in the forecasts for the future warmer climate state, i.e. they drift towards the colder present-day climate they have been trained for. A similar result is obtained for the pre-industrial case where two out of three models show a warming. We discuss possible remedies for these biases and analyze their spatial distribution, revealing complex warming and cooling patterns that are partly related to missing ocean-sea ice and land surface information in the training data. Despite these current limitations, our results suggest that data-driven machine learning models will provide powerful tools for climate science and transform established approaches by complementing conventional physics-based models.
Submitted 27 September, 2024;
originally announced September 2024.
-
Emerging AI-based weather prediction models as downscaling tools
Authors:
Nikolay Koldunov,
Thomas Rackow,
Christian Lessig,
Sergey Danilov,
Suvarchal K. Cheedela,
Dmitry Sidorenko,
Irina Sandu,
Thomas Jung
Abstract:
The demand for high-resolution information on climate change is critical for accurate projections and decision-making. Presently, this need is addressed through high-resolution climate models or downscaling. High-resolution models are computationally demanding and creating ensemble simulations with them is typically prohibitively expensive. Downscaling methods are more affordable but are typically limited to small regions. This study proposes the use of existing AI-based numerical weather prediction systems (AI-NWP) to perform global downscaling of climate information from low-resolution climate models. Our results demonstrate that AI-NWP initialized from low-resolution initial conditions can develop detailed forecasts closely resembling the resolution of the training data using a one-day lead time. We constructed year-long atmospheric fields using AI-NWP forecasts initialized from smoothed ERA5 and low-resolution CMIP6 models. Our analysis for 2-metre temperature indicates that AI-NWP can generate high-quality, long-term datasets and potentially perform bias correction, bringing climate model outputs closer to observed data. The study highlights the potential for off-the-shelf AI-NWP to enhance climate data downscaling, offering a simple and computationally efficient alternative to traditional downscaling techniques. The downscaled data can be used either directly for localized climate information or as boundary conditions for further dynamical downscaling.
Submitted 25 June, 2024;
originally announced June 2024.
-
Enabling Dynamic and Intelligent Workflows for HPC, Data Analytics, and AI Convergence
Authors:
Jorge Ejarque,
Rosa M. Badia,
Loïc Albertin,
Giovanni Aloisio,
Enrico Baglione,
Yolanda Becerra,
Stefan Boschert,
Julian R. Berlin,
Alessandro D'Anca,
Donatello Elia,
François Exertier,
Sandro Fiore,
José Flich,
Arnau Folch,
Steven J Gibbons,
Nikolay Koldunov,
Francesc Lordan,
Stefano Lorito,
Finn Løvholt,
Jorge Macías,
Fabrizio Marozzo,
Alberto Michelini,
Marisol Monterrubio-Velasco,
Marta Pienkowska,
Josep de la Puente,
et al. (12 additional authors not shown)
Abstract:
The evolution of High-Performance Computing (HPC) platforms enables the design and execution of progressively larger and more complex workflow applications in these systems. The complexity comes not only from the number of elements that compose the workflows but also from the type of computations they perform. While traditional HPC workflows target simulations and modelling of physical phenomena, current needs require in addition data analytics (DA) and artificial intelligence (AI) tasks. However, the development of these workflows is hampered by the lack of proper programming models and environments that support the integration of HPC, DA, and AI, as well as the lack of tools to easily deploy and execute the workflows in HPC systems. To progress in this direction, this paper presents use cases where complex workflows are required and investigates the main issues to be addressed for the HPC/DA/AI convergence. Based on this study, the paper identifies the challenges of a new workflow platform to manage complex workflows. Finally, it proposes a development approach for such a workflow platform addressing these challenges in two directions: first, by defining a software stack that provides the functionalities to manage these complex workflows; and second, by proposing the HPC Workflow as a Service (HPCWaaS) paradigm, which leverages the software stack to facilitate the reusability of complex workflows in federated HPC infrastructures. Proposals presented in this work are subject to study and development as part of the EuroHPC eFlows4HPC project.
Submitted 13 May, 2022; v1 submitted 20 April, 2022;
originally announced April 2022.