
Nesterov Acceleration for Ensemble Kalman Inversion and Variants

Authors: Sydney Vernon, Eviatar Bach, Oliver R. A. Dunbar

Abstract: Ensemble Kalman inversion (EKI) is a derivative-free, particle-based optimization method for solving inverse problems. It can be shown that EKI approximates a gradient flow, which allows the application of methods for accelerating gradient descent. Here, we show that Nesterov acceleration is effective in speeding up the reduction of the EKI cost function on a variety of inverse problems. We also implement Nesterov acceleration for two EKI variants, unscented Kalman inversion and ensemble transform Kalman inversion. Our specific implementation takes the form of a particle-level nudge that is demonstrably simple to couple in a black-box fashion with any existing EKI variant algorithms, comes with no additional computational expense, and with no additional tuning hyperparameters. This work shows a pathway for future research to translate advances in gradient-based optimization into advances in gradient-free Kalman optimization.

Submitted 19 May, 2025; v1 submitted 15 January, 2025; originally announced January 2025.
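
The particle-level nudge described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the momentum schedule `k / (k + 3)`, the deterministic (unperturbed) EKI update, the step size `dt`, and the function name `eki_nesterov` are all assumptions chosen for illustration on a small linear problem.

```python
import numpy as np

def eki_nesterov(G, y, gamma, u0, n_iter=30, dt=1.0):
    """Deterministic EKI with a per-particle Nesterov-style nudge (sketch).

    G:     forward map, parameter vector (d,) -> data vector (m,)
    y:     observed data, shape (m,)
    gamma: observational noise covariance, shape (m, m)
    u0:    initial ensemble, shape (J, d)
    """
    u, u_prev = u0.copy(), u0.copy()
    n_particles = u0.shape[0]
    costs = []
    for k in range(n_iter):
        beta = k / (k + 3.0)               # assumed momentum schedule
        v = u + beta * (u - u_prev)        # nudge each particle along its last step
        g = np.array([G(vj) for vj in v])  # one forward evaluation per particle
        dv = v - v.mean(axis=0)
        dg = g - g.mean(axis=0)
        c_ug = dv.T @ dg / (n_particles - 1)  # parameter-data cross-covariance (d, m)
        c_gg = dg.T @ dg / (n_particles - 1)  # data covariance (m, m)
        # Track the data misfit of the mean forward evaluation (reuses g).
        r = y - g.mean(axis=0)
        costs.append(0.5 * r @ np.linalg.solve(gamma, r))
        # Kalman-type update applied to the nudged particles.
        increment = np.linalg.solve(c_gg + gamma / dt, (y - g).T).T @ c_ug.T
        u_prev, u = u, v + increment
    return u, costs

# Hypothetical linear test problem: y = A u_true with three observations.
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.5], [0.0, 2.0], [1.0, -1.0]])
u_true = np.array([1.0, -2.0])
y = A @ u_true
gamma = 0.1 * np.eye(3)
u0 = rng.normal(size=(20, 2))
u_final, costs = eki_nesterov(lambda u: A @ u, y, gamma, u0)
```

The nudge only recombines particle positions that are already stored, so each iteration still costs one forward evaluation per particle, and no derivative of `G` is used.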

arXiv:2407.00584 [pdf, other]

doi 10.1007/s11222-025-10587-w

Hyperparameter Optimization for Randomized Algorithms: A Case Study on Random Features

Authors: Oliver R. A. Dunbar, Nicholas H. Nelsen, Maya Mutic

Abstract: Randomized algorithms exploit stochasticity to reduce computational complexity. One important example is random feature regression (RFR) that accelerates Gaussian process regression (GPR). RFR approximates an unknown function with a random neural network whose hidden weights and biases are sampled from a probability distribution. Only the final output layer is fit to data. In randomized algorithms like RFR, the hyperparameters that characterize the sampling distribution greatly impact performance, yet are not directly accessible from samples. This makes optimization of hyperparameters via standard (gradient-based) optimization tools inapplicable. Inspired by Bayesian ideas from GPR, this paper introduces a random objective function that is tailored for hyperparameter tuning of vector-valued random features. The objective is minimized with ensemble Kalman inversion (EKI). EKI is a gradient-free particle-based optimizer that is scalable to high dimensions and robust to randomness in objective functions. A numerical study showcases the new black-box methodology to learn hyperparameter distributions in several problems that are sensitive to the hyperparameter selection: two global sensitivity analyses, integrating a chaotic dynamical system, and solving a Bayesian inverse problem from atmospheric dynamics. The success of the proposed EKI-based algorithm for RFR suggests its potential for automated optimization of hyperparameters arising in other randomized algorithms.

Submitted 23 February, 2025; v1 submitted 30 June, 2024; originally announced July 2024.

Journal ref: Statistics and Computing Vol. 35 No. 56 (2025)
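
As a rough illustration of the random feature regression setup summarized in the abstract above, the sketch below samples hidden weights and biases once and fits only the output layer by ridge regression. It assumes scalar-valued random Fourier features with a Gaussian sampling distribution; the names `fit_rfr`, `lengthscale`, and `reg` are hypothetical, and the paper's EKI-based hyperparameter tuning is not implemented here.

```python
import numpy as np

def fit_rfr(x, y, n_features=200, lengthscale=1.0, reg=1e-3, seed=0):
    """Random Fourier feature regression with a ridge-fitted output layer (sketch)."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Hidden weights and biases are sampled once and never trained.
    # `lengthscale` is a hyperparameter of the sampling distribution.
    w = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

    def features(z):
        return np.sqrt(2.0 / n_features) * np.cos(z @ w + b)

    phi = features(x)
    # Only the final linear layer is fit to data (regularized least squares).
    coef = np.linalg.solve(phi.T @ phi + reg * np.eye(n_features), phi.T @ y)
    return lambda z: features(z) @ coef

# Hypothetical 1-D example: recover sin(x) from 100 samples.
x = np.linspace(-3.0, 3.0, 100)[:, None]
y = np.sin(x[:, 0])
model = fit_rfr(x, y)
rmse = np.sqrt(np.mean((model(x) - y) ** 2))
```

Changing `lengthscale` changes the distribution the features are drawn from rather than any trainable weight, which is why the fit quality depends on it in a way that a gradient on the fitted coefficients does not expose.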

arXiv:2404.14212 [pdf, other]

Toward Routing River Water in Land Surface Models with Recurrent Neural Networks

Authors: Mauricio Lima, Katherine Deck, Oliver R. A. Dunbar, Tapio Schneider

Abstract: Machine learning is playing an increasing role in hydrology, supplementing or replacing physics-based models. One notable example is the use of recurrent neural networks (RNNs) for forecasting streamflow given observed precipitation and geographic characteristics. Training of such a model over the continental United States (CONUS) has demonstrated that a single set of model parameters can be used across independent catchments, and that RNNs can outperform physics-based models. In this work, we take a next step and study the performance of RNNs for river routing in land surface models (LSMs). Instead of observed precipitation, the LSM-RNN uses instantaneous runoff calculated from physics-based models as an input. We train the model with data from river basins spanning the globe and test it using historical streamflow measurements. The model demonstrates skill at generalization across basins (predicting streamflow in catchments not used in training) and across time (predicting streamflow during years not used in training). We compare the predictions from the LSM-RNN to an existing physics-based model calibrated with a similar dataset and find that the LSM-RNN outperforms the physics-based model: a gain in median NSE from 0.56 to 0.64 (time-split experiment) and from 0.30 to 0.34 (basin-split experiment). Our results show that RNNs are effective for global streamflow prediction from runoff inputs and motivate the development of complete routing models that can capture nested sub-basin connections.

Submitted 5 December, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

Comments: 32 pages, 11 figures; submitted in HESS (EGU) with CCBY license
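
The abstract above reports skill as median NSE without expanding the acronym. Assuming it denotes the Nash-Sutcliffe efficiency in its standard hydrological form, NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2), a minimal sketch is given below. The function name `nse` and the sample values are hypothetical and are not taken from the paper.

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect match, 0 matches the observed mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual = np.sum((observed - simulated) ** 2)
    variance = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / variance

# Hypothetical streamflow values for a single basin.
obs = np.array([1.0, 2.0, 3.0, 4.0])
perfect = nse(obs, obs)                           # simulation equals observations
baseline = nse(obs, np.full(4, obs.mean()))       # simulation equals the observed mean
```

Under this definition, a score is computed per basin and the median is then taken across basins, so a change in the median reflects a shift in the typical basin rather than in the best or worst ones.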

arXiv:2201.07577 [pdf, other]

doi 10.1017/S0956792524000895

Models for information propagation on graphs

Authors: Oliver R. A. Dunbar, Charles M. Elliott, Lisa Maria Kreusser

Abstract: We propose and unify classes of different models for information propagation over graphs. In a first class, propagation is modelled as a wave which emanates from a set of \emph{known} nodes at an initial time, to all other \emph{unknown} nodes at later times with an ordering determined by the arrival time of the information wave front. A second class of models is based on the notion of a travel time along paths between nodes. The time of information propagation from an initial \emph{known} set of nodes to a node is defined as the minimum of a generalised travel time over subsets of all admissible paths. A final class is given by imposing a local equation of an eikonal form at each \emph{unknown} node, with boundary conditions at the \emph{known} nodes. The solution value of the local equation at a node is coupled to those of neighbouring nodes with lower values. We provide precise formulations of the model classes and prove equivalences between them. Finally, we apply the front propagation models on graphs to semi-supervised learning via label propagation and information propagation on trust networks.

Submitted 23 January, 2025; v1 submitted 19 January, 2022; originally announced January 2022.

arXiv:2201.06998 [pdf, other]

doi 10.1029/2022MS002997

Ensemble-Based Experimental Design for Targeting Data Acquisition to Inform Climate Models

Authors: Oliver R. A. Dunbar, Michael F. Howland, Tapio Schneider, Andrew M. Stuart

Abstract: Data required to calibrate uncertain GCM parameterizations are often only available in limited regions or time periods, for example, observational data from field campaigns, or data generated in local high-resolution simulations. This raises the question of where and when to acquire additional data to be maximally informative about parameterizations in a GCM. Here we construct a new ensemble-based parallel algorithm to automatically target data acquisition to regions and times that maximize the uncertainty reduction, or information gain, about GCM parameters. The algorithm uses a Bayesian framework that exploits a quantified distribution of GCM parameters as a measure of uncertainty. This distribution is informed by time-averaged climate statistics restricted to local regions and times. The algorithm is embedded in the recently developed calibrate-emulate-sample (CES) framework, which performs efficient model calibration and uncertainty quantification with only $\mathcal{O}(10^2)$ model evaluations, compared with $\mathcal{O}(10^5)$ evaluations typically needed for traditional approaches to Bayesian calibration. We demonstrate the algorithm with an idealized GCM, with which we generate surrogates of local data. In this perfect-model setting, we calibrate parameters and quantify uncertainties in a quasi-equilibrium convection scheme in the GCM. We consider targeted data that are (i) localized in space for statistically stationary simulations, and (ii) localized in space and time for seasonally varying simulations. In these proof-of-concept applications, the calculated information gain reflects the reduction in parametric uncertainty obtained from Bayesian inference when harnessing a targeted sample of data. The largest information gain typically, but not always, results from regions near the intertropical convergence zone (ITCZ).

Submitted 27 June, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

arXiv:2109.10970 [pdf, other]

doi 10.1371/journal.pcbi.1010171

Epidemic Management and Control Through Risk-Dependent Individual Contact Interventions

Authors: Tapio Schneider, Oliver R. A. Dunbar, Jinlong Wu, Lucas Böttcher, Dmitry Burov, Alfredo Garbuno-Iñigo, Gregory L. Wagner, Sen Pei, Chiara Daraio, Raffaele Ferrari, Jeffrey Shaman

Abstract: Testing, contact tracing, and isolation (TTI) is an epidemic management and control approach that is difficult to implement at scale because it relies on manual tracing of contacts. Exposure notification apps have been developed to digitally scale up TTI by harnessing contact data obtained from mobile devices; however, exposure notification apps provide users only with limited binary information when they have been directly exposed to a known infection source. Here we demonstrate a scalable improvement to TTI and exposure notification apps that uses data assimilation (DA) on a contact network. Network DA exploits diverse sources of health data together with the proximity data from mobile devices that exposure notification apps rely upon. It provides users with continuously assessed individual risks of exposure and infection, which can form the basis for targeting individual contact interventions. Simulations of the early COVID-19 epidemic in New York City prove the concepts. In the simulations, network DA identifies up to a factor of 2 more infections than contact tracing when both harness the same contact data and diagnostic test data. This remains true even when only a relatively small fraction of the population uses network DA. When a sufficiently large fraction of the population ($\gtrsim 75\%$) uses network DA and complies with individual contact interventions, targeting contact interventions with network DA reduces deaths by up to a factor of 4 relative to TTI. Network DA can be implemented by expanding the computational backend of existing exposure notification apps, thus greatly enhancing their capabilities. Implemented at scale, it has the potential to precisely and effectively control future epidemics while minimizing economic disruption.

Submitted 7 May, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

Journal ref: PLoS Comput Biol 18(6): e1010171. (2022)

arXiv:2108.00827 [pdf, other]

doi 10.1029/2021MS002735

Parameter uncertainty quantification in an idealized GCM with a seasonal cycle

Authors: Michael F. Howland, Oliver R. A. Dunbar, Tapio Schneider

Abstract: Climate models are generally calibrated manually by comparing selected climate statistics, such as the global top-of-atmosphere energy balance, to observations. The manual tuning only targets a limited subset of observational data and parameters. Bayesian calibration can estimate climate model parameters and their uncertainty using a larger fraction of the available data and automatically exploring the parameter space more broadly. In Bayesian learning, it is natural to exploit the seasonal cycle, which has large amplitude, compared with anthropogenic climate change, in many climate statistics. In this study, we develop methods for the calibration and uncertainty quantification (UQ) of model parameters exploiting the seasonal cycle, and we demonstrate a proof-of-concept with an idealized general circulation model (GCM). Uncertainty quantification is performed using the calibrate-emulate-sample approach, which combines stochastic optimization and machine learning emulation to speed up Bayesian learning. The methods are demonstrated in a perfect-model setting through the calibration and UQ of a convective parameterization in an idealized GCM with a seasonal cycle. Calibration and UQ based on seasonally averaged climate statistics, compared to annually averaged, reduces the calibration error by up to an order of magnitude and narrows the spread of posterior distributions by factors between two and five, depending on the variables used for UQ. The reduction in the size of the parameter posterior distributions leads to a reduction in the uncertainty of climate model predictions.

Submitted 22 July, 2021; originally announced August 2021.

Comments: 21 pages, 11 figures, 2 tables

arXiv:2104.03384 [pdf, other]

Ensemble Inference Methods for Models With Noisy and Expensive Likelihoods

Authors: Oliver R. A. Dunbar, Andrew B. Duncan, Andrew M. Stuart, Marie-Therese Wolfram

Abstract: The increasing availability of data presents an opportunity to calibrate unknown parameters which appear in complex models of phenomena in the biomedical, physical and social sciences. However, model complexity often leads to parameter-to-data maps which are expensive to evaluate and are only available through noisy approximations. This paper is concerned with the use of interacting particle systems for the solution of the resulting inverse problems for parameters. Of particular interest is the case where the available forward model evaluations are subject to rapid fluctuations, in parameter space, superimposed on the smoothly varying large scale parametric structure of interest. A motivating example from climate science is presented, and ensemble Kalman methods (which do not use the derivative of the parameter-to-data map) are shown, empirically, to perform well. Multiscale analysis is then used to analyze the behaviour of interacting particle system algorithms when rapid fluctuations, which we refer to as noise, pollute the large scale parametric dependence of the parameter-to-data map. Ensemble Kalman methods and Langevin-based methods (the latter use the derivative of the parameter-to-data map) are compared in this light. The ensemble Kalman methods are shown to behave favourably in the presence of noise in the parameter-to-data map, whereas Langevin methods are adversely affected. On the other hand, Langevin methods have the correct equilibrium distribution in the setting of noise-free forward models, whilst ensemble Kalman methods only provide an uncontrolled approximation, except in the linear case. Therefore a new class of algorithms, ensemble Gaussian process samplers, which combine the benefits of both ensemble Kalman and Langevin methods, are introduced and shown to perform favourably.

Submitted 22 January, 2022; v1 submitted 7 April, 2021; originally announced April 2021.

MSC Class: 65C05; 65C40; 60J22

arXiv:2012.13262 [pdf, other]

doi 10.1029/2020MS002454

Calibration and Uncertainty Quantification of Convective Parameters in an Idealized GCM

Authors: Oliver R. A. Dunbar, Alfredo Garbuno-Inigo, Tapio Schneider, Andrew M. Stuart

Abstract: Parameters in climate models are usually calibrated manually, exploiting only small subsets of the available data. This precludes both optimal calibration and quantification of uncertainties. Traditional Bayesian calibration methods that allow uncertainty quantification are too expensive for climate models; they are also not robust in the presence of internal climate variability. For example, Markov chain Monte Carlo (MCMC) methods typically require $O(10^5)$ model runs and are sensitive to internal variability noise, rendering them infeasible for climate models. Here we demonstrate an approach to model calibration and uncertainty quantification that requires only $O(10^2)$ model runs and can accommodate internal climate variability. The approach consists of three stages: (i) a calibration stage uses variants of ensemble Kalman inversion to calibrate a model by minimizing mismatches between model and data statistics; (ii) an emulation stage emulates the parameter-to-data map with Gaussian processes (GP), using the model runs in the calibration stage for training; (iii) a sampling stage approximates the Bayesian posterior distributions by sampling the GP emulator with MCMC. We demonstrate the feasibility and computational efficiency of this calibrate-emulate-sample (CES) approach in a perfect-model setting. Using an idealized general circulation model, we estimate parameters in a simple convection scheme from synthetic data generated with the model. The CES approach generates probability distributions of the parameters that are good approximations of the Bayesian posteriors, at a fraction of the computational cost usually required to obtain them. Sampling from this approximate posterior allows the generation of climate predictions with quantified parametric uncertainties.

Submitted 19 August, 2021; v1 submitted 24 December, 2020; originally announced December 2020.

arXiv:1811.02865 [pdf, other]

doi 10.1088/1361-6420/ab1c6c

Binary recovery via phase field regularization for first traveltime tomography

Authors: Oliver R. A. Dunbar, Charles M. Elliott

Abstract: We propose a double obstacle phase field methodology for binary recovery of the slowness function of an Eikonal equation found in first traveltime tomography. We treat the inverse problem as an optimization problem with quadratic misfit functional added to a phase field relaxation of the perimeter penalization functional. Our approach yields solutions as we account for well-posedness of the forward problem by choosing regular priors. We obtain a convergent finite difference and mixed finite element based discretization and a well-defined descent scheme by accounting for the non-differentiability of the forward problem. We validate the phase field technique with a $\Gamma$-convergence result and numerically by conducting parameter studies for the scheme, and by applying it to a variety of test problems with different geometries, boundary conditions, and source-receiver locations.

Submitted 9 November, 2018; v1 submitted 7 November, 2018; originally announced November 2018.

arXiv:1810.12274 [pdf, other]

Phase field modelling of surfactants in multi-phase flow

Authors: Oliver R. A. Dunbar, Kei Fong Lam, Bjorn Stinner

Abstract: A diffuse interface model for surfactants in multi-phase flow with three or more fluids is derived. A system of Cahn-Hilliard equations is coupled with a Navier-Stokes system and an advection-diffusion equation for the surfactant ensuring thermodynamic consistency. By an asymptotic analysis the model can be related to a moving boundary problem in the sharp interface limit, which is derived from first principles. Results from numerical simulations support the theoretical findings. The main novelties are centred around the conditions in the triple junctions where three fluids meet. Specifically, the case of local chemical equilibrium with respect to the surfactant is considered, which allows for interfacial surfactant flow through the triple junctions.

Submitted 29 October, 2018; originally announced October 2018.

arXiv:1706.01960 [pdf, other]

Reconciling Bayesian and perimeter regularization for binary inversion

Authors: Oliver R. A. Dunbar, Matthew M. Dunlop, Charles M. Elliott, Viet Ha Hoang, Andrew M. Stuart

Abstract: A central theme in classical algorithms for the reconstruction of discontinuous functions from observational data is perimeter regularization via the use of the total variation. On the other hand, sparse or noisy data often demands a probabilistic approach to the reconstruction of images, to enable uncertainty quantification; the Bayesian approach to inversion, which itself introduces a form of regularization, is a natural framework in which to carry this out. In this paper the link between Bayesian inversion methods and perimeter regularization is explored. Two links are studied: (i) the maximum a posteriori (MAP) objective function of a suitably chosen Bayesian phase-field approach is shown to be closely related to a least squares plus perimeter regularization objective; (ii) sample paths of a suitably chosen Bayesian level set formulation are shown to possess finite perimeter and to have the ability to learn about the true perimeter.

Submitted 10 April, 2020; v1 submitted 6 June, 2017; originally announced June 2017.

Comments: 30 pages, 15 figures

MSC Class: 35J35; 62G08; 62M40; 94A08

Showing 1–12 of 12 results for author: Dunbar, O R A