Fourier analysis of the physics of transfer learning for data-driven subgrid-scale models of ocean turbulence

Authors: Moein Darman, Pedram Hassanzadeh, Laure Zanna, Ashesh Chattopadhyay

Abstract: Transfer learning (TL) is a powerful tool for enhancing the performance of neural networks (NNs) in applications such as weather and climate prediction and turbulence modeling. TL enables models to generalize to out-of-distribution data with minimal training data from the new system. In this study, we employ a 9-layer convolutional NN to predict the subgrid forcing in a two-layer ocean quasi-geostrophic system and examine which metrics best describe its performance and generalizability to unseen dynamical regimes. Fourier analysis of the NN kernels reveals that they learn low-pass, Gabor, and high-pass filters, regardless of whether the training data are isotropic or anisotropic. By analyzing the activation spectra, we identify why NNs fail to generalize without TL and how TL can overcome these limitations: the learned weights and biases from one dataset underestimate the out-of-distribution sample spectra as they pass through the network, leading to an underestimation of output spectra. By re-training only one layer with data from the target system, this underestimation is corrected, enabling the NN to produce predictions that match the target spectra. These findings are broadly applicable to data-driven parameterization of dynamical systems.

Submitted 21 April, 2025; originally announced April 2025.

arXiv:2501.05058

Simultaneous emulation and downscaling with physically-consistent deep learning-based regional ocean emulators

Authors: Leonard Lupin-Jimenez, Moein Darman, Subhashis Hazarika, Tianning Wu, Michael Gray, Ruoying He, Anthony Wong, Ashesh Chattopadhyay

Abstract: Building on the success of AI-based atmospheric emulation, we propose an AI-based ocean emulation and downscaling framework focusing on the high-resolution regional ocean over the Gulf of Mexico. Regional ocean emulation presents unique challenges owing to the complex bathymetry and lateral boundary conditions as well as fundamental biases in deep learning-based frameworks, such as instability and hallucinations. In this paper, we develop a deep learning-based framework to autoregressively integrate ocean-surface variables over the Gulf of Mexico at 8 km spatial resolution without unphysical drifts over decadal time scales and simultaneously downscale and bias-correct them to 4 km resolution using a physics-constrained generative model. The framework shows both short-term skill and accurate long-term statistics in terms of mean and variability.

Submitted 9 January, 2025; originally announced January 2025.

arXiv:2310.00813

OceanNet: A principled neural operator-based digital twin for regional oceans

Authors: Ashesh Chattopadhyay, Michael Gray, Tianning Wu, Anna B. Lowe, Ruoying He

Abstract: While data-driven approaches demonstrate great potential in atmospheric modeling and weather forecasting, ocean modeling poses distinct challenges due to complex bathymetry, land, vertical structure, and flow non-linearity. This study introduces OceanNet, a principled neural operator-based digital twin for ocean circulation. OceanNet uses a Fourier neural operator and predictor-evaluate-corrector integration scheme to mitigate autoregressive error growth and enhance stability over extended time scales. A spectral regularizer counteracts spectral bias at smaller scales. OceanNet is applied to the northwest Atlantic Ocean western boundary current (the Gulf Stream), focusing on the task of seasonal prediction for Loop Current eddies and the Gulf Stream meander. Trained using historical sea surface height (SSH) data, OceanNet demonstrates competitive forecast skill by outperforming SSH predictions by an uncoupled, state-of-the-art dynamical ocean model forecast, reducing computation by 500,000 times. These accomplishments demonstrate the potential of physics-inspired deep neural operators as cost-effective alternatives to high-resolution numerical ocean models.

Submitted 4 September, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

Comments: Supplementary information can be found in: https://drive.google.com/file/d/1NoxJLa967naJT787a5-IfZ7f_MmRuZMP/view?usp=sharing

arXiv:2205.04601

Long-term stability and generalization of observationally-constrained stochastic data-driven models for geophysical turbulence

Authors: Ashesh Chattopadhyay, Jaideep Pathak, Ebrahim Nabizadeh, Wahid Bhimji, Pedram Hassanzadeh

Abstract: Recent years have seen a surge in interest in building deep learning-based fully data-driven models for weather prediction. Such deep learning models, if trained on observations, can mitigate certain biases in current state-of-the-art weather models, some of which stem from inaccurate representation of subgrid-scale processes. However, these data-driven models, being over-parameterized, require a lot of training data, which may not be available from reanalysis (observational data) products. Moreover, an accurate, noise-free initial condition to start forecasting with a data-driven weather model is not available in realistic scenarios. Finally, deterministic data-driven forecasting models suffer from issues with long-term stability and unphysical climate drift, which make these data-driven models unsuitable for computing climate statistics. Given these challenges, previous studies have tried to pre-train deep learning-based weather forecasting models on a large amount of imperfect long-term climate model simulations and then re-train them on available observational data. In this paper, we propose a convolutional variational autoencoder-based stochastic data-driven model that is pre-trained on an imperfect climate model simulation from a 2-layer quasi-geostrophic flow and re-trained, using transfer learning, on a small number of noisy observations from a perfect simulation. This re-trained model then performs stochastic forecasting with a noisy initial condition sampled from the perfect simulation. We show that our ensemble-based stochastic data-driven model outperforms a baseline deterministic encoder-decoder-based convolutional model in terms of short-term skill while remaining stable for long-term climate simulations, yielding accurate climatology.

Submitted 9 May, 2022; originally announced May 2022.

arXiv:2002.11167

doi 10.1029/2020MS002084

Data-driven super-parameterization using deep learning: Experimentation with multi-scale Lorenz 96 systems and transfer-learning

Authors: Ashesh Chattopadhyay, Adam Subel, Pedram Hassanzadeh

Abstract: To make weather/climate modeling computationally affordable, small-scale processes are usually represented in terms of the large-scale, explicitly-resolved processes using physics-based or semi-empirical parameterization schemes. Another approach, computationally more demanding but often more accurate, is super-parameterization (SP), which involves integrating the equations of small-scale processes on high-resolution grids embedded within the low-resolution grids of large-scale processes. Recently, studies have used machine learning (ML) to develop data-driven parameterization (DD-P) schemes. Here, we propose a new approach, data-driven SP (DD-SP), in which the equations of the small-scale processes are integrated data-drivenly using ML methods such as recurrent neural networks. Employing multi-scale Lorenz 96 systems as a testbed, we compare the cost and accuracy (in terms of both short-term prediction and long-term statistics) of parameterized low-resolution (LR), SP, DD-P, and DD-SP models. We show that with the same computational cost, DD-SP substantially outperforms LR, and is better than DD-P, particularly when scale separation is lacking. DD-SP is much cheaper than SP, yet its accuracy is the same in reproducing long-term statistics and often comparable in short-term forecasting. We also investigate generalization, finding that when models trained on data from one system are applied to a system with different forcing (e.g., more chaotic), the models often do not generalize, particularly when the short-term prediction accuracy is examined. But we show that transfer-learning, which involves re-training the data-driven model with a small amount of data from the new system, significantly improves generalization. Potential applications of DD-SP and transfer-learning in climate/weather modeling and the expected challenges are discussed.

Submitted 25 February, 2020; originally announced February 2020.

Journal ref: Journal of Advances in Modeling Earth Systems 2020

arXiv:1906.08829

doi 10.5194/npg-27-373-2020

Data-driven prediction of a multi-scale Lorenz 96 chaotic system using deep learning methods: Reservoir computing, ANN, and RNN-LSTM

Authors: Ashesh Chattopadhyay, Pedram Hassanzadeh, Devika Subramanian

Abstract: In this paper, the performance of three deep learning methods for predicting short-term evolution and for reproducing the long-term statistics of a multi-scale spatio-temporal Lorenz 96 system is examined. The methods are: echo state network (a type of reservoir computing, RC-ESN), deep feed-forward artificial neural network (ANN), and recurrent neural network with long short-term memory (RNN-LSTM). This Lorenz 96 system has three tiers of nonlinearly interacting variables representing slow/large-scale ($X$), intermediate ($Y$), and fast/small-scale ($Z$) processes. For training or testing, only $X$ is available; $Y$ and $Z$ are never known or used. We show that RC-ESN substantially outperforms ANN and RNN-LSTM for short-term prediction, e.g., accurately forecasting the chaotic trajectories for hundreds of numerical solver's time steps, equivalent to several Lyapunov timescales. The RNN-LSTM and ANN show some prediction skills as well; RNN-LSTM bests ANN. Furthermore, even after losing the trajectory, data predicted by RC-ESN and RNN-LSTM have probability density functions (PDFs) that closely match the true PDF, even at the tails. The PDF of the data predicted using ANN, however, deviates from the true PDF. Implications, caveats, and applications to data-driven and data-assisted surrogate modeling of complex nonlinear dynamical systems such as weather/climate are discussed.

Submitted 5 December, 2019; v1 submitted 20 June, 2019; originally announced June 2019.

Comments: Some changes have been made to the figures, and an appendix has been added, among other revisions

Journal ref: Nonlin. Processes Geophys. 2020

arXiv:1708.01144

doi 10.1016/j.cnsns.2018.09.005

Direct nonlinear Fourier transform algorithms for the computation of solitonic spectra in focusing nonlinear Schrödinger equation

Authors: A. Vasylchenkova, J. E. Prilepsky, D. Shepelsky, A. Chattopadhyay

Abstract: Starting from a comparison of some established numerical algorithms for the computation of the eigenvalues (discrete or solitonic spectrum) of the non-Hermitian version of the Zakharov-Shabat spectral problem, this article delivers new algorithms that combine the best features of the existing ones and thereby allay their relative weaknesses. Our algorithm is modelled within the remit of the so-called direct nonlinear Fourier transform (NFT) associated with the focusing nonlinear Schrödinger equation. First, we present the data for the calibration of methods comparing the relative errors associated with the computation of the continuous NF spectrum. Then each method is paired with different numerical algorithms for finding zeros of a complex-valued function to obtain the eigenvalues. Next we describe a new class of methods based on the evaluation of contour integrals for the efficient search of eigenvalues. After that, we introduce a new hybrid method, one of our main results: the method combines the advances of the contour integral approach and makes use of the iterative algorithms at its second stage for the refined eigenvalue search. The veracity of our new hybrid algorithm is established by estimating the convergence speed and accuracy across three independent test profiles. Along with the development of a new approach for the computation of the eigenvalues, our study also addresses the problem of computation of the so-called norming constants associated with the eigenvalues. We show that our formalism effectively amounts to accurate and fast enough computation of residues of the reflection coefficient in the upper complex half-plane of the spectral parameter.

Submitted 10 September, 2018; v1 submitted 31 July, 2017; originally announced August 2017.

Comments: accepted to Communications in Nonlinear Science and Numerical Simulation

MSC Class: 37K15; 65M12; 35C08

arXiv:1605.08646

doi 10.1103/PhysRevE.93.052221

Noise-induced standing waves in oscillatory systems with time-delayed feedback

Authors: Michael Stich, Amit K Chattopadhyay

Abstract: In oscillatory reaction-diffusion systems, time-delay feedback can lead to the instability of uniform oscillations with respect to formation of standing waves. Here, we investigate how the presence of additive, Gaussian white noise can induce the appearance of standing waves. Combining analytical solutions of the model with spatio-temporal simulations, we find that noise can promote standing waves in regimes where the deterministic uniform oscillatory modes are stabilized. As the deterministic phase boundary is approached, the spatio-temporal correlations become stronger, such that even small noise can induce standing waves in this parameter regime. With larger noise strengths, standing waves could be induced at finite distances from the (deterministic) phase boundary. The overall dynamics is defined through the interplay of noisy forcing with the inherent reaction-diffusion dynamics.

Submitted 27 May, 2016; originally announced May 2016.

Comments: 8 two-columned pages, 5 figures; Published in PRE; URL: http://link.aps.org/doi/10.1103/PhysRevE.93.05222

arXiv:1502.00661

Dynamics of Spatial Heterogeneity in Landfill - A Stochastic Analysis

Authors: Amit Kumar Chattopadhyay, Prasanta Kumar Dey, Sadhan Kumar Ghosh

Abstract: A landfill represents a complex and dynamically evolving structure that can be stochastically perturbed by exogenous factors. Both thermodynamic (equilibrium) and time-varying (non-steady state) properties of a landfill are affected by spatially heterogeneous and nonlinear subprocesses that combine with constraining initial and boundary conditions arising from the associated surroundings. While multiple approaches have been made to model landfill statistics by incorporating spatially dependent parameters on the one hand (data based approach) and continuum dynamical mass-balance equations on the other (equation based modelling), practically no attempt has been made to amalgamate these two approaches while also incorporating inherent stochastically induced fluctuations affecting the process overall. In this article, we will implement a minimalist scheme of modelling the time evolution of a realistic three-dimensional landfill through a reaction-diffusion based approach, focusing on the coupled interactions of four key variables - solid mass density, hydrolysed mass density, acetogenic mass density and methanogenic mass density. In a marked departure from previous predictions, our results indicate that close to the linearly stable limit, the large time steady state properties, arising out of a series of complex coupled interactions between the stochastically driven variables, are scarcely affected by the biochemical growth-decay statistics. Our results clearly show that an equilibrium landfill structure relates to plant production times of approximately 20-30 years instead of the previous (incorrect) deterministic model predictions of 50 years and above.

Submitted 3 June, 2016; v1 submitted 20 January, 2015; originally announced February 2015.

Comments: To be published in Applied Mathematical Modelling; 8 pages, 5 figures

Showing 1–9 of 9 results for author: Chattopadhyay, A