-
ClimaEmpact: Domain-Aligned Small Language Models and Datasets for Extreme Weather Analytics
Authors:
Deeksha Varshney,
Keane Ong,
Rui Mao,
Erik Cambria,
Gianmarco Mengaldo
Abstract:
Accurate assessments of extreme weather events are vital for research and policy, yet localized and granular data remain scarce in many parts of the world. This data gap limits our ability to analyze potential outcomes and implications of extreme weather events, hindering effective decision-making. Large Language Models (LLMs) can process vast amounts of unstructured text data, extract meaningful insights, and generate detailed assessments by synthesizing information from multiple sources. Furthermore, LLMs can seamlessly transfer their general language understanding to smaller models, enabling these models to retain key knowledge while being fine-tuned for specific tasks. In this paper, we propose Extreme Weather Reasoning-Aware Alignment (EWRA), a method that enhances small language models (SLMs) by incorporating structured reasoning paths derived from LLMs, and ExtremeWeatherNews, a large dataset of extreme weather event-related news articles. EWRA and ExtremeWeatherNews together form the overall framework, ClimaEmpact, that focuses on addressing three critical extreme-weather tasks: categorization of tangible vulnerabilities/impacts, topic labeling, and emotion analysis. By aligning SLMs with advanced reasoning strategies on ExtremeWeatherNews (and its derived dataset ExtremeAlign used specifically for SLM alignment), EWRA improves the SLMs' ability to generate well-grounded and domain-specific responses for extreme weather analytics. Our results show that the approach proposed guides SLMs to output domain-aligned responses, surpassing the performance of task-specific models and offering enhanced real-world applicability for extreme weather analytics.
Submitted 26 April, 2025;
originally announced April 2025.
-
Dynamical errors in machine learning forecasts
Authors:
Zhou Fang,
Gianmarco Mengaldo
Abstract:
In machine learning forecasting, standard error metrics such as mean absolute error (MAE) and mean squared error (MSE) quantify discrepancies between predictions and target values. However, these metrics do not directly evaluate the physical and/or dynamical consistency of forecasts, an increasingly critical concern in scientific and engineering applications.
Indeed, a fundamental yet often overlooked question is whether machine learning forecasts preserve the dynamical behavior of the underlying system. Addressing this issue is essential for assessing the fidelity of machine learning models and identifying potential failure modes, particularly in applications where maintaining correct dynamical behavior is crucial.
In this work, we investigate the relationship between standard forecasting error metrics, such as MAE and MSE, and the dynamical properties of the underlying system. To achieve this goal, we use two recently developed dynamical indices: the instantaneous dimension ($d$), and the inverse persistence ($θ$). Our results indicate that larger forecast errors -- e.g., higher MSE -- tend to occur in states with higher $d$ (higher complexity) and higher $θ$ (lower persistence). To further assess dynamical consistency, we propose error metrics based on the dynamical indices that measure the discrepancy of the forecasted $d$ and $θ$ versus their correct values. Leveraging these dynamical indices-based metrics, we analyze direct and recursive forecasting strategies for three canonical datasets -- Lorenz, Kuramoto-Sivashinsky equation, and Kolmogorov flow -- as well as a real-world weather forecasting task. Our findings reveal substantial distortions in dynamical properties in ML forecasts, especially for long forecast lead times or long recursive simulations, providing complementary information on ML forecast fidelity that can be used to improve ML models.
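As an illustration of the instantaneous dimension $d$ mentioned above, the following is a minimal sketch of the commonly used extreme-value estimator: exceedances of $-\log(\text{distance})$ above a high quantile are treated as exponentially distributed with mean $1/d$. The function names `local_dimension` and `dimension_error`, the quantile value, and the use of a mean absolute difference as the discrepancy measure are assumptions made for this example; the abstract does not specify the exact form of the proposed metrics, and the inverse persistence $θ$ is not covered here.

```python
import numpy as np


def local_dimension(trajectory, state, quantile=0.98):
    """Estimate the instantaneous (local) dimension d around `state`.

    trajectory: array of shape (n_samples, n_features).
    state:      array of shape (n_features,).
    The exceedances of g = -log(distance) over a high quantile are assumed
    to be exponentially distributed with mean 1/d.
    """
    dist = np.linalg.norm(trajectory - state, axis=1)
    dist = dist[dist > 0]  # drop the state itself if it is in the trajectory
    g = -np.log(dist)
    threshold = np.quantile(g, quantile)
    exceedances = g[g > threshold] - threshold
    return 1.0 / exceedances.mean()


def dimension_error(d_forecast, d_reference):
    """Hypothetical discrepancy measure: mean absolute difference in d."""
    d_forecast = np.asarray(d_forecast, dtype=float)
    d_reference = np.asarray(d_reference, dtype=float)
    return float(np.mean(np.abs(d_forecast - d_reference)))
```

For points sampled uniformly in a plane, the estimate should be close to 2, which gives a quick sanity check before applying it to forecast and reference trajectories.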
Submitted 16 April, 2025; v1 submitted 15 April, 2025;
originally announced April 2025.
-
CondensNet: Enabling stable long-term climate simulations via hybrid deep learning models with adaptive physical constraints
Authors:
Xin Wang,
Juntao Yang,
Jeff Adie,
Simon See,
Kalli Furtado,
Chen Chen,
Troy Arcomano,
Romit Maulik,
Gianmarco Mengaldo
Abstract:
Accurate and efficient climate simulations are crucial for understanding Earth's evolving climate. However, current general circulation models (GCMs) face challenges in capturing unresolved physical processes, such as clouds and convection. A common solution is to adopt cloud-resolving models, which provide more accurate results than the standard subgrid parametrisation schemes typically used in GCMs. However, cloud-resolving models, also referred to as super-parameterizations, remain computationally prohibitive. Hybrid modeling, which integrates deep learning with equation-based GCMs, offers a promising alternative but often struggles with long-term stability and accuracy issues. In this work, we find that water vapor oversaturation during condensation is a key factor compromising the stability of hybrid models. To address this, we introduce CondensNet, a novel neural network architecture that embeds a self-adaptive physical constraint to correct unphysical condensation processes. CondensNet effectively mitigates water vapor oversaturation, enhancing simulation stability while maintaining accuracy and improving computational efficiency compared to super-parameterization schemes.
We integrate CondensNet into a GCM to form PCNN-GCM (Physics-Constrained Neural Network GCM), a hybrid deep learning framework designed for long-term stable climate simulations in real-world conditions, including ocean and land. PCNN-GCM represents a significant milestone in hybrid climate modeling, as it shows a novel way to incorporate physical constraints adaptively, paving the way for accurate, lightweight, and stable long-term climate simulations.
Submitted 18 February, 2025;
originally announced February 2025.
-
Multiscale Dynamical Indices Reveal Scale-Dependent Atmospheric Dynamics
Authors:
Chenyu Dong,
Gabriele Messori,
Davide Faranda,
Adriano Gualandi,
Valerio Lucarini,
Gianmarco Mengaldo
Abstract:
Geophysical systems are inherently complex and span multiple spatial and temporal scales, making their dynamics challenging to understand and predict. This challenge is especially pronounced for extreme events, which are primarily governed by their instantaneous properties rather than their average characteristics. Advances in dynamical systems theory, including the development of local dynamical indices such as local dimension and inverse persistence, have provided powerful tools for studying these short-lasting phenomena. However, existing applications of such indices often rely on predefined fixed spatial domains and scales, with limited discussion on the influence of spatial scales on the results. In this work, we present a novel spatially multiscale methodology that applies a sliding window method to compute dynamical indices, enabling the exploration of scale-dependent properties. Applying this framework to high-impact European summertime heatwaves, we reconcile previously different perspectives, thereby underscoring the importance of spatial scales in such analyses. Furthermore, we emphasize that our novel methodology has broad applicability to other atmospheric phenomena, as well as to other geophysical and spatio-temporal systems.
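The sliding-window idea described above can be sketched as follows: a gridded field is cut into overlapping square spatial windows, and each window's time series of flattened states can then be passed to whichever dynamical-index estimator is in use. This is only an illustrative sketch; the function name `windowed_states`, the square window shape, and the `stride` parameter are assumptions for this example and are not taken from the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def windowed_states(field, window, stride=1):
    """Cut a (time, ny, nx) field into overlapping square spatial windows.

    Returns an array of shape (time, n_win_y, n_win_x, window * window),
    where the last axis holds the flattened state inside each window.
    """
    views = sliding_window_view(field, (window, window), axis=(1, 2))
    views = views[:, ::stride, ::stride]  # subsample window positions
    n_t, n_y, n_x = views.shape[:3]
    return views.reshape(n_t, n_y, n_x, window * window)
```

Repeating the call with several values of `window` gives one set of states per spatial scale, so the same index can be compared across scales.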
Submitted 13 December, 2024;
originally announced December 2024.
-
Revisiting the predictability of dynamical systems: a new local data-driven approach
Authors:
Chenyu Dong,
Davide Faranda,
Adriano Gualandi,
Valerio Lucarini,
Gianmarco Mengaldo
Abstract:
Nonlinear dynamical systems are ubiquitous in nature and they are hard to forecast. Not only may they be sensitive to small perturbations in their initial conditions, but they are often composed of processes acting at multiple scales. Classical approaches based on the Lyapunov spectrum rely on the knowledge of the dynamic forward operator, or of a data-derived approximation of it. This operator is typically unknown, or the data are too noisy to derive a faithful representation. Here, we propose a new data-driven approach to analyze the local predictability of dynamical systems. This method, based on the concept of recurrence, is closely linked to the well-established framework of local dynamical indices. Applied to both idealized systems and real-world datasets, this new index shows results consistent with existing knowledge, proving its effectiveness in estimating local predictability. Additionally, we discuss its relationship with local dynamical indices, illustrating how it complements the previous framework as a more direct measure of predictability. Furthermore, we explore its reflection of the scale-dependent nature of predictability, its extension that includes a weighting strategy, and its real-time application. We believe these aspects collectively demonstrate its potential as a powerful diagnostic tool for complex systems.
Submitted 23 September, 2024;
originally announced September 2024.
-
Unlocking massively parallel spectral proper orthogonal decompositions in the PySPOD package
Authors:
Marcin Rogowski,
Brandon C. Y. Yeung,
Oliver T. Schmidt,
Romit Maulik,
Lisandro Dalcin,
Matteo Parsani,
Gianmarco Mengaldo
Abstract:
We propose a parallel (distributed) version of the spectral proper orthogonal decomposition (SPOD) technique. The parallel SPOD algorithm distributes the spatial dimension of the dataset while preserving time. This approach is adopted to preserve the non-distributed fast Fourier transform of the data in time, thereby avoiding the associated bottlenecks. The parallel SPOD algorithm is implemented in the PySPOD (https://github.com/MathEXLab/PySPOD) library and makes use of the standard message passing interface (MPI) library, implemented in Python via mpi4py (https://mpi4py.readthedocs.io/en/stable/). An extensive performance evaluation of the parallel package is provided, including strong and weak scalability analyses. The open-source library allows the analysis of large datasets of interest across the scientific community. Here, we present applications in fluid dynamics and geophysics that are extremely difficult (if not impossible) to achieve without a parallel algorithm. This work opens the path toward modal analyses of big quasi-stationary data, helping to uncover new unexplored spatio-temporal patterns.
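The reason for distributing space rather than time can be illustrated with a small serial sketch: the Fourier transform along the time axis acts on each spatial point independently, so every spatial chunk can be transformed locally without communication. The example below emulates the chunks with `np.array_split` instead of real MPI ranks; the function name `time_fft_by_spatial_chunks` is hypothetical and this is not the PySPOD API.

```python
import numpy as np


def time_fft_by_spatial_chunks(data, n_chunks):
    """Apply the time FFT separately to each spatial chunk.

    data: array of shape (n_time, n_space).
    Each chunk plays the role of the data held by one (emulated) rank.
    """
    chunks = np.array_split(data, n_chunks, axis=1)  # split space, keep time
    return [np.fft.fft(chunk, axis=0) for chunk in chunks]


rng = np.random.default_rng(0)
data = rng.standard_normal((64, 10))

# Reassembling the per-chunk transforms reproduces the global time FFT.
pieces = time_fft_by_spatial_chunks(data, 3)
reassembled = np.concatenate(pieces, axis=1)
```

In an actual distributed run, only the later reduction steps (such as inner products over space) would need communication between ranks; the sketch covers the time transform alone.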
Submitted 31 July, 2024; v1 submitted 21 September, 2023;
originally announced September 2023.
-
Online data-driven changepoint detection for high-dimensional dynamical systems
Authors:
Sen Lin,
Gianmarco Mengaldo,
Romit Maulik
Abstract:
The detection of anomalies or transitions in complex dynamical systems is of critical importance to various applications. In this study, we propose the use of machine learning to detect changepoints for high-dimensional dynamical systems. Here, changepoints indicate instances in time when the underlying dynamical system has a fundamentally different characteristic - which may be due to a change in the model parameters or due to intermittent phenomena arising from the same model. We propose two complementary approaches to achieve this, with the first devised using arguments from probabilistic unsupervised learning and the latter devised using supervised deep learning. Our emphasis is also on detection for high-dimensional dynamical systems, for which we introduce the use of dimensionality reduction techniques to accelerate the deployment of transition detection algorithms. Our experiments demonstrate that transitions can be detected efficiently, in real-time, for the two-dimensional forced Kolmogorov flow, which is characterized by anomalous regimes in phase space where dynamics are perturbed off the attractor at uneven intervals.
Submitted 17 May, 2023;
originally announced May 2023.
-
Fully-discrete spatial eigenanalysis of discontinuous spectral element methods: insights into well-resolved and under-resolved vortical flows
Authors:
Niccolò Tonicello,
Rodrigo C Moura,
Guido Lodato,
Gianmarco Mengaldo
Abstract:
This study presents a comprehensive spatial eigenanalysis of fully-discrete discontinuous spectral element methods, now generalizing previous spatial eigenanalysis that did not include time integration errors. The influence of discrete time integration is discussed in detail for different explicit Runge-Kutta (1st to 4th order accurate) schemes combined with either Discontinuous Galerkin (DG) or Spectral Difference (SD) methods, both here recovered from the Flux Reconstruction (FR) scheme. Selected numerical experiments using the improved SD method by Liang and Jameson [1] are performed to quantify the influence of time integration errors on actual simulations. These involve test cases of varied complexity, from one-dimensional linear advection equation studies to well-resolved and under-resolved inviscid vortical flows. It is shown that, while both well-resolved and under-resolved simulations of linear problems correlate well with the eigenanalysis prediction of time integration errors, the correlation can be much worse for under-resolved nonlinear problems. The effect of mesh regularity is also considered, where time integration errors are found to be, in the case of irregular grids, less pronounced than those of the spatial discretisation. In fact, for the under-resolved vortical flows considered, the predominance of spatial errors made it practically impossible for time integration errors to be distinctly identified. Nevertheless, for well-resolved nonlinear simulations, the effect of time integration errors could still be recognized. This highlights that the interaction between space and time discretisation errors is more complex than otherwise anticipated, contributing to the current understanding about when eigenanalysis can effectively predict the behaviour of numerical errors in practical under-resolved nonlinear problems, including under-resolved turbulence computations.
Submitted 27 November, 2021;
originally announced November 2021.
-
PyParSVD: A streaming, distributed and randomized singular-value-decomposition library
Authors:
Romit Maulik,
Gianmarco Mengaldo
Abstract:
We introduce PyParSVD (https://github.com/Romit-Maulik/PyParSVD), a Python library that implements a streaming, distributed and randomized algorithm for the singular value decomposition. To demonstrate its effectiveness, we extract coherent structures from scientific data. Furthermore, we show weak scaling assessments on up to 256 nodes of the Theta machine at Argonne Leadership Computing Facility, demonstrating potential for large-scale data analyses of practical data sets.
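To make the streaming part concrete, here is a minimal serial sketch of a generic batch-wise SVD update: the current modes scaled by their singular values are stacked with the new batch of snapshots, and a QR factorization followed by a small SVD yields the updated modes. This is not the PyParSVD API and it omits the distributed and randomized components; the function name `streaming_svd` and the `forget` factor are assumptions made for this example.

```python
import numpy as np


def streaming_svd(batches, n_modes, forget=1.0):
    """Update left singular vectors and singular values batch by batch.

    batches: iterable of arrays of shape (n_space, n_snapshots_in_batch).
    n_modes: number of modes kept after each update.
    forget:  weight applied to past information (1.0 keeps all of it).
    """
    modes, sing_vals = None, None
    for batch in batches:
        if modes is None:
            stacked = batch
        else:
            # Past data are summarized by modes scaled by singular values.
            stacked = np.hstack([forget * modes * sing_vals, batch])
        q, r = np.linalg.qr(stacked)
        u_small, s, _ = np.linalg.svd(r, full_matrices=False)
        modes = (q @ u_small)[:, :n_modes]
        sing_vals = s[:n_modes]
    return modes, sing_vals
```

With `forget=1.0` and no truncation, the singular values agree with those of a one-shot SVD of all snapshots; truncating to fewer modes trades accuracy for memory.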
Submitted 19 August, 2021;
originally announced August 2021.
-
A fast multi-resolution lattice Green's function method for elliptic difference equations
Authors:
Benedikt Dorschner,
Ke Yu,
Gianmarco Mengaldo,
Tim Colonius
Abstract:
We propose a mesh refinement technique for solving elliptic difference equations on unbounded domains based on the fast lattice Green's function (FLGF) method. The FLGF method exploits the regularity of the Cartesian mesh and uses the fast multipole method in conjunction with fast Fourier transforms to yield linear complexity and decrease time-to-solution. We extend this method to a multi-resolution scheme and allow for locally refined Cartesian blocks embedded in the computational domain. Appropriately chosen interpolation and regularization operators retain consistency between the discrete Laplace operator and its inverse on the unbounded domain. Second-order accuracy and linear complexity are maintained, while significantly reducing the number of degrees of freedom and hence the computational cost.
Submitted 22 November, 2019;
originally announced November 2019.
-
Nektar++: enhancing the capability and application of high-fidelity spectral/$hp$ element methods
Authors:
David Moxey,
Chris D. Cantwell,
Yan Bao,
Andrea Cassinelli,
Giacomo Castiglioni,
Sehun Chun,
Emilia Juda,
Ehsan Kazemi,
Kilian Lackhove,
Julian Marcon,
Gianmarco Mengaldo,
Douglas Serson,
Michael Turner,
Hui Xu,
Joaquim Peiró,
Robert M. Kirby,
Spencer J. Sherwin
Abstract:
Nektar++ is an open-source framework that provides a flexible, high-performance and scalable platform for the development of solvers for partial differential equations using the high-order spectral/$hp$ element method. In particular, Nektar++ aims to overcome the complex implementation challenges that are often associated with high-order methods, thereby allowing them to be more readily used in a wide range of application areas. In this paper, we present the algorithmic, implementation and application developments associated with our Nektar++ version 5.0 release. We describe some of the key software and performance developments, including our strategies on parallel I/O, on in situ processing, the use of collective operations for exploiting current and emerging hardware, and interfaces to enable multi-solver coupling. Furthermore, we provide details on a newly developed Python interface that enables a more rapid introduction for new users unfamiliar with spectral/$hp$ element methods, C++ and/or Nektar++. This release also incorporates a number of numerical method developments - in particular: the method of moving frames, which provides an additional approach for the simulation of equations on embedded curvilinear manifolds and domains; a means of handling spatially variable polynomial order; and a novel technique for quasi-3D simulations to permit spatially-varying perturbations to the geometry in the homogeneous direction. Finally, we demonstrate the new application-level features provided in this release, namely: a facility for generating high-order curvilinear meshes called NekMesh; a new AcousticSolver for aeroacoustic problems; and our development of a 'thick' strip model for the modelling of fluid-structure interaction problems in the context of vortex-induced vibrations. We conclude by commenting on some directions for future code development and expansion.
Submitted 26 November, 2019; v1 submitted 8 June, 2019;
originally announced June 2019.
-
Non-modal analysis of spectral element methods: Towards accurate and robust large-eddy simulations
Authors:
Pablo Fernandez,
Rodrigo Moura,
Gianmarco Mengaldo,
Jaime Peraire
Abstract:
We introduce a non-modal analysis technique that characterizes the diffusion properties of spectral element methods for linear convection-diffusion systems. While strictly speaking only valid for linear problems, the analysis is devised so that it can give critical insights on two questions: (i) Why do spectral element methods suffer from stability issues in under-resolved computations of nonlinear problems? And, (ii) why do they successfully predict under-resolved turbulent flows even without a subgrid-scale model? The answer to these two questions can in turn provide crucial guidelines to construct more robust and accurate schemes for complex under-resolved flows, commonly found in industrial applications. For illustration purposes, this analysis technique is applied to the hybridized discontinuous Galerkin methods as representatives of spectral element methods. The effect of the polynomial order, the upwinding parameter and the Péclet number on the so-called short-term diffusion of the scheme is investigated. From a purely non-modal analysis point of view, polynomial orders between $2$ and $4$ with standard upwinding are well suited for under-resolved turbulence simulations. For lower polynomial orders, diffusion is introduced in scales that are much larger than the grid resolution. For higher polynomial orders, as well as for strong under/over-upwinding, robustness issues can be expected. The non-modal analysis results are then tested against under-resolved turbulence simulations of the Burgers, Euler and Navier-Stokes equations. While devised in the linear setting, our non-modal analysis succeeds in predicting the behavior of the scheme in the nonlinear problems considered.
Submitted 24 April, 2018;
originally announced April 2018.