Search | arXiv e-print repository

Surya: Foundation Model for Heliophysics

Authors: Sujit Roy, Johannes Schmude, Rohit Lal, Vishal Gaur, Marcus Freitag, Julian Kuehnert, Theodore van Kessel, Dinesha V. Hegde, Andrés Muñoz-Jaramillo, Johannes Jakubik, Etienne Vos, Kshitiz Mandal, Ata Akbari Asanjan, Joao Lucas de Sousa Almeida, Amy Lin, Talwinder Singh, Kang Yang, Chetraj Pandey, Jinsu Hong, Berkay Aydin, Thorsten Kurth, Ryan McGranaghan, Spiridon Kasapis, Vishal Upendran, Shah Bahauddin, et al. (8 additional authors not shown)

Abstract: Heliophysics is central to understanding and forecasting space weather events and solar activity. Despite decades of high-resolution observations from the Solar Dynamics Observatory (SDO), most models remain task-specific and constrained by scarce labeled data, limiting their capacity to generalize across solar phenomena. We introduce Surya, a 366M parameter foundation model for heliophysics designed to learn general-purpose solar representations from multi-instrument SDO observations, including eight Atmospheric Imaging Assembly (AIA) channels and five Helioseismic and Magnetic Imager (HMI) products. Surya employs a spatiotemporal transformer architecture with spectral gating and long-short range attention, pretrained on high-resolution solar image forecasting tasks and further optimized through autoregressive rollout tuning. Zero-shot evaluations demonstrate its ability to forecast solar dynamics and flare events, while downstream fine-tuning with parameter-efficient Low-Rank Adaptation (LoRA) shows strong performance on solar wind forecasting, active region segmentation, solar flare forecasting, and EUV spectra. Surya is the first foundation model in heliophysics that uses time advancement as a pretext task on full-resolution SDO data. Its novel architecture and performance suggest that the model is able to learn the underlying physics behind solar evolution.

Submitted 21 August, 2025; v1 submitted 18 August, 2025; originally announced August 2025.

arXiv:2508.14107

SuryaBench: Benchmark Dataset for Advancing Machine Learning in Heliophysics and Space Weather Prediction

Authors: Sujit Roy, Dinesha V. Hegde, Johannes Schmude, Amy Lin, Vishal Gaur, Rohit Lal, Kshitiz Mandal, Talwinder Singh, Andrés Muñoz-Jaramillo, Kang Yang, Chetraj Pandey, Jinsu Hong, Berkay Aydin, Ryan McGranaghan, Spiridon Kasapis, Vishal Upendran, Shah Bahauddin, Daniel da Silva, Marcus Freitag, Iksha Gurung, Nikolai Pogorelov, Campbell Watson, Manil Maskey, Juan Bernabe-Moreno, Rahul Ramachandran

Abstract: This paper introduces a high resolution, machine learning-ready heliophysics dataset derived from NASA's Solar Dynamics Observatory (SDO), specifically designed to advance machine learning (ML) applications in solar physics and space weather forecasting. The dataset includes processed imagery from the Atmospheric Imaging Assembly (AIA) and Helioseismic and Magnetic Imager (HMI), spanning a solar cycle from May 2010 to July 2024. To ensure suitability for ML tasks, the data has been preprocessed, including correction of spacecraft roll angles, orbital adjustments, exposure normalization, and degradation compensation. We also provide auxiliary application benchmark datasets complementing the core SDO dataset. These provide benchmark applications for central heliophysics and space weather tasks such as active region segmentation, active region emergence forecasting, coronal field extrapolation, solar flare prediction, solar EUV spectra prediction, and solar wind speed estimation. By establishing a unified, standardized data collection, this dataset aims to facilitate benchmarking, enhance reproducibility, and accelerate the development of AI-driven models for critical space weather prediction tasks, bridging gaps between solar physics, machine learning, and operational forecasting.

Submitted 17 August, 2025; originally announced August 2025.

arXiv:2504.11172

TerraMesh: A Planetary Mosaic of Multimodal Earth Observation Data

Authors: Benedikt Blumenstiel, Paolo Fraccaro, Valerio Marsocci, Johannes Jakubik, Stefano Maurogiovanni, Mikolaj Czerkawski, Rocco Sedona, Gabriele Cavallaro, Thomas Brunschwiler, Juan Bernabe-Moreno, Nicolas Longépé

Abstract: Large-scale foundation models in Earth Observation can learn versatile, label-efficient representations by leveraging massive amounts of unlabeled data. However, existing public datasets are often limited in scale, geographic coverage, or sensor variety. We introduce TerraMesh, a new globally diverse, multimodal dataset combining optical, synthetic aperture radar, elevation, and land-cover modalities in an Analysis-Ready Data format. TerraMesh includes over 9 million samples with eight spatiotemporally aligned modalities, enabling large-scale pre-training. We provide detailed data processing steps, comprehensive statistics, and empirical evidence demonstrating improved model performance when pre-trained on TerraMesh. The dataset is hosted at https://huggingface.co/datasets/ibm-esa-geospatial/TerraMesh.

Submitted 1 August, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

Comments: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR) Workshops

arXiv:2504.11171

TerraMind: Large-Scale Generative Multimodality for Earth Observation

Authors: Johannes Jakubik, Felix Yang, Benedikt Blumenstiel, Erik Scheurer, Rocco Sedona, Stefano Maurogiovanni, Jente Bosmans, Nikolaos Dionelis, Valerio Marsocci, Niklas Kopp, Rahul Ramachandran, Paolo Fraccaro, Thomas Brunschwiler, Gabriele Cavallaro, Juan Bernabe-Moreno, Nicolas Longépé

Abstract: We present TerraMind, the first any-to-any generative, multimodal foundation model for Earth observation (EO). Unlike other multimodal models, TerraMind is pretrained on dual-scale representations combining both token-level and pixel-level data across modalities. On a token level, TerraMind encodes high-level contextual information to learn cross-modal relationships, while on a pixel level, TerraMind leverages fine-grained representations to capture critical spatial nuances. We pretrained TerraMind on nine geospatial modalities of a global, large-scale dataset. In this paper, we demonstrate that (i) TerraMind's dual-scale early fusion approach unlocks a range of zero-shot and few-shot applications for Earth observation, (ii) TerraMind introduces "Thinking-in-Modalities" (TiM), the capability of generating additional artificial data during finetuning and inference to improve the model output, and (iii) TerraMind achieves beyond state-of-the-art performance in community-standard benchmarks for EO like PANGAEA. The pretraining dataset, the model weights, and our code are open-sourced under a permissive license.

Submitted 10 September, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

Comments: Accepted at ICCV'25

arXiv:2407.09434

Foundation Models for the Electric Power Grid

Authors: Hendrik F. Hamann, Thomas Brunschwiler, Blazhe Gjorgiev, Leonardo S. A. Martins, Alban Puech, Anna Varbella, Jonas Weiss, Juan Bernabe-Moreno, Alexandre Blondin Massé, Seong Choi, Ian Foster, Bri-Mathias Hodge, Rishabh Jain, Kibaek Kim, Vincent Mai, François Mirallès, Martin De Montigny, Octavio Ramos-Leaños, Hussein Suprême, Le Xie, El-Nasser S. Youssef, Arnaud Zinflou, Alexander J. Belyi, Ricardo J. Bessa, Bishnu Prasad Bhattarai, et al. (2 additional authors not shown)

Abstract: Foundation models (FMs) currently dominate news headlines. They employ advanced deep learning architectures to extract structural information autonomously from vast datasets through self-supervision. The resulting rich representations of complex systems and dynamics can be applied to many downstream applications. Therefore, FMs can find uses in electric power grids, challenged by the energy transition and climate change. In this paper, we call for the development of, and state why we believe in, the potential of FMs for electric grids. We highlight their strengths and weaknesses amidst the challenges of a changing grid. We argue that an FM learning from diverse grid data and topologies could unlock transformative capabilities, pioneering a new approach in leveraging AI to redefine how we manage complexity and uncertainty in the electric grid. Finally, we discuss a power grid FM concept, namely GridFM, based on graph neural networks and show how different downstream tasks benefit.

Submitted 12 November, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

Comments: Major equal contributors: H.F.H., T.B., B.G., L.S.A.M., A.P., A.V., J.W.; Significant equal contributors: J.B., A.B.M., S.C., I.F., B.H., R.J., K.K., V.M., F.M., M.D.M., O.R., H.S., L.X., E.S.Y., A.Z.; Other equal contributors: A.J.B., R.J.B., B.P.B., J.S., S.S; Lead contact: H.F.H

arXiv:2311.04007

The Energy Prediction Smart-Meter Dataset: Analysis of Previous Competitions and Beyond

Authors: Direnc Pekaslan, Jose Maria Alonso-Moral, Kasun Bandara, Christoph Bergmeir, Juan Bernabe-Moreno, Robert Eigenmann, Nils Einecke, Selvi Ergen, Rakshitha Godahewa, Hansika Hewamalage, Jesus Lago, Steffen Limmer, Sven Rebhan, Boris Rabinovich, Dilini Rajapasksha, Heda Song, Christian Wagner, Wenlong Wu, Luis Magdalena, Isaac Triguero

Abstract: This paper presents the real-world smart-meter dataset and offers an analysis of solutions derived from the Energy Prediction Technical Challenges, focusing primarily on two key competitions: the IEEE Computational Intelligence Society (IEEE-CIS) Technical Challenge on Energy Prediction from Smart Meter data in 2020 (named EP) and its follow-up challenge at the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) in 2021 (named XEP). These competitions focus on accurate energy consumption forecasting and the importance of interpretability in understanding the underlying factors. The challenge aims to predict monthly and yearly estimated consumption for households, addressing the accurate billing problem with limited historical smart meter data. The dataset comprises 3,248 smart meters, with varying data availability ranging from a minimum of one month to a year. This paper delves into the challenges and solutions, analysing issues related to the provided real-world smart meter data, the development of accurate predictions at the household level, and the introduction of evaluation criteria for assessing interpretability. Additionally, this paper discusses aspects beyond the competitions: opportunities for energy disaggregation and pattern detection applications at the household level, the significance of communicating energy-driven factors for optimised billing, and the importance of responsible AI and data privacy considerations. These aspects provide insights into the broader implications and potential advancements in energy consumption prediction. Overall, these competitions provide a dataset for residential energy research and serve as a catalyst for exploring accurate forecasting, enhancing interpretability, and driving progress towards the discussion of various aspects such as energy disaggregation, demand response programs or behavioural interventions.

Submitted 7 November, 2023; originally announced November 2023.

arXiv:2304.10385

Conditions for a quadratic quantum speedup in nonlinear transforms with applications to energy contract pricing

Authors: Gabriele Agliardi, Corey O'Meara, Kavitha Yogaraj, Kumar Ghosh, Piergiacomo Sabino, Marina Fernández-Campoamor, Giorgio Cortiana, Juan Bernabé-Moreno, Francesco Tacchino, Antonio Mezzacapo, Omar Shehab

Abstract: Computing nonlinear functions over multilinear forms is a general problem with applications in risk analysis. For instance, in the domain of energy economics, accurate and timely risk management demands efficient simulation of millions of scenarios, largely benefiting from computational speedups. We develop a novel hybrid quantum-classical algorithm based on polynomial approximation of nonlinear functions, computed through Quantum Hadamard Products, and we rigorously assess the conditions for its end-to-end speedup for different implementation variants against classical algorithms. In our setting, a quadratic quantum speedup, up to polylogarithmic factors, can be proven only when forms are bilinear and approximating polynomials have second degree, if efficient loading unitaries are available for the input data sets. We also enhance the bidirectional encoding, which allows tuning the balance between circuit depth and width, proposing an improved version that can be exploited for the calculation of inner products. Lastly, we exploit the dynamic circuit capabilities, recently introduced on IBM Quantum devices, to reduce the average depth of the Quantum Hadamard Product circuit. A proof of principle is implemented and validated on IBM Quantum systems.

Submitted 5 August, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

ACM Class: F.1.3

arXiv:2206.02270

Estimating building energy efficiency from street view imagery, aerial imagery, and land surface temperature data

Authors: Kevin Mayer, Lukas Haas, Tianyuan Huang, Juan Bernabé-Moreno, Ram Rajagopal, Martin Fischer

Abstract: Current methods to determine the energy efficiency of buildings require on-site visits by certified energy auditors, which makes the process slow, costly, and geographically incomplete. To accelerate the identification of promising retrofit targets on a large scale, we propose to estimate building energy efficiency from widely available and remotely sensed data sources only, namely street view, aerial view, footprint, and satellite-borne land surface temperature (LST) data. After collecting data for almost 40,000 buildings in the United Kingdom, we combine these data sources by training multiple end-to-end deep learning models with the objective to classify buildings as energy efficient (EU rating A-D) or inefficient (EU rating E-G). After evaluating the trained models quantitatively as well as qualitatively, we extend our analysis by studying the predictive power of each data source in an ablation study. We find that the end-to-end deep learning model trained on all four data sources achieves a macro-averaged F1 score of 64.64% and outperforms the k-NN and SVM-based baseline models by 14.13 and 12.02 percentage points, respectively. Thus, this work shows the potential and complementary nature of remotely sensed data in predicting energy efficiency and opens up new opportunities for future work to integrate additional data sources.

Submitted 24 August, 2022; v1 submitted 5 June, 2022; originally announced June 2022.

arXiv:2202.08024

doi 10.1109/ICSA-C54293.2022.00033

Towards AutoQML: A Cloud-Based Automated Circuit Architecture Search Framework

Authors: Raúl Berganza Gómez, Corey O'Meara, Giorgio Cortiana, Christian B. Mendl, Juan Bernabé-Moreno

Abstract: The learning process of classical machine learning algorithms is tuned by hyperparameters that need to be customized to best learn and generalize from an input dataset. In recent years, Quantum Machine Learning (QML) has been gaining traction as a possible application of quantum computing which may provide quantum advantage in the future. However, quantum versions of classical machine learning algorithms introduce a plethora of additional parameters and circuit variations that have their own intricacies in being tuned. In this work, we take the first steps towards Automated Quantum Machine Learning (AutoQML). We propose a concrete description of the problem, and then develop a classical-quantum hybrid cloud architecture that allows for parallelized hyperparameter exploration and model training. As an application use-case, we train a quantum Generative Adversarial neural Network (qGAN) to generate energy prices that follow a known historic data distribution. Such a QML model can be used for various applications in the energy economics sector.

Submitted 16 February, 2022; originally announced February 2022.

Comments: 8 pages, to appear in QSA 2022 (IEEE ICSA 2022)

Journal ref: 2022 IEEE 19th International Conference on Software Architecture Companion (ICSA-C), 129-136 (2022)

arXiv:2112.08300

Community Detection in Electrical Grids Using Quantum Annealing

Authors: Marina Fernández-Campoamor, Corey O'Meara, Giorgio Cortiana, Vedran Peric, Juan Bernabé-Moreno

Abstract: With the increase of intermittent renewable generation resources feeding into the electrical grid, Distribution System Operators (DSOs) must find ways to incorporate these new actors and adapt the grid to ensure stability and enable flexibility. Dividing the grid into logical clusters entails several organizational and technical benefits, helping overcome these challenges. However, finding the optimal grid partitioning remains a challenging task due to its complexity. At the same time, a new technology has gained traction in the last decades for its promising speed-up potential in solving non-trivial combinatorial optimization problems: quantum computing. This work explores its application in Graph Partitioning using electrical modularity. We benchmarked several quantum annealing and hybrid methods on well-known IEEE test cases. The results obtained for the IEEE 14-bus test case show that quantum annealing DWaveSampler brings equal solutions or, for the optimal number of partitions, a 1% improvement. For the more significant test cases, hybrid quantum annealing shows a relative error of less than 0.02% compared to the classical benchmark and, for the IEEE 118-bus test case, shows a time performance speed-up. The increment in performance would enable real-time planning and operations of electrical grids. This work intends to be the first step to showcase the potential of quantum computing towards the modernization and adaptation of electrical grids to the decentralized future of energy systems.

Submitted 16 December, 2021; v1 submitted 15 December, 2021; originally announced December 2021.

Comments: 7 pages, 5 figures, conference

Showing 1–10 of 10 results for author: Bernabé-Moreno, J