-
Dimension-reduced Reconstruction Map Learning for Parameter Estimation in Likelihood-Free Inference Problems
Authors:
Rui Zhang,
Oksana A. Chkrebtii,
Dongbin Xiu
Abstract:
Many application areas rely on models that can be readily simulated but lack a closed-form likelihood, or an accurate approximation under arbitrary parameter values. Existing parameter estimation approaches in this setting are generally approximate. Recent work on using neural network models to reconstruct the mapping from the data space to the parameters from a set of synthetic parameter-data pairs suffers from the curse of dimensionality, resulting in inaccurate estimation as the data size grows. We propose a dimension-reduced approach to likelihood-free estimation which combines the ideas of reconstruction map estimation with dimension-reduction approaches based on subject-specific knowledge. We examine the properties of reconstruction map estimation with and without dimension reduction and explore the trade-off between approximation error due to information loss from reducing the data dimension and approximation error. Numerical examples show that the proposed approach compares favorably with reconstruction map estimation, approximate Bayesian computation, and synthetic likelihood estimation.
Submitted 18 July, 2024;
originally announced July 2024.
-
Sequential Bayesian Registration for Functional Data
Authors:
Yoonji Kim,
Oksana A. Chkrebtii,
Sebastian A. Kurtek
Abstract:
In many modern applications, discretely-observed data may be naturally understood as a set of functions. Functional data often exhibit two confounded sources of variability: amplitude (y-axis) and phase (x-axis). The extraction of amplitude and phase, a process known as registration, is essential in exploring the underlying structure of functional data in a variety of areas, from environmental monitoring to medical imaging. Critically, such data are often gathered sequentially with new functional observations arriving over time. Despite this, existing registration procedures do not sequentially update inference based on the new data, requiring model refitting. To address these challenges, we introduce a Bayesian framework for sequential registration of functional data, which updates statistical inference as new sets of functions are assimilated. This Bayesian model-based sequential learning approach utilizes sequential Monte Carlo sampling to recursively update the alignment of observed functions while accounting for associated uncertainty. Distributed computing significantly reduces computational cost relative to refitting the model using an iterative method such as Markov chain Monte Carlo on the full data. Simulation studies and comparisons reveal that the proposed approach performs well even when the target posterior distribution has a challenging structure. We apply the proposed method to three real datasets: (1) functions of annual drought intensity near Kaweah River in California, (2) annual sea surface salinity functions near Null Island, and (3) a sequence of repeated patterns in electrocardiogram signals.
Submitted 21 May, 2025; v1 submitted 22 March, 2022;
originally announced March 2022.
-
Inference for stochastic kinetic models from multiple data sources for joint estimation of infection dynamics from aggregate reports and virological data
Authors:
Yury E. García,
Oksana A. Chkrebtii,
Marcos A. Capistrán,
Daniel E. Noyola
Abstract:
Influenza and respiratory syncytial virus (RSV) are the leading etiological agents of seasonal acute respiratory infections (ARI) around the world. Medical doctors typically base the diagnosis of ARI on patients' symptoms alone and do not always conduct virological tests necessary to identify individual viruses, which limits the ability to study the interaction between multiple pathogens and make public health recommendations. We consider a stochastic kinetic model (SKM) for two interacting ARI pathogens circulating in a large population and an empirically motivated background process for infections with other pathogens causing similar symptoms. An extended marginal sampling approach based on the Linear Noise Approximation to the SKM integrates multiple data sources and additional model components. We infer the parameters defining the pathogens' dynamics and interaction within a Bayesian hierarchical model and explore the posterior trajectories of infections for each illness based on aggregate infection reports from six epidemic seasons collected by the state health department, and a subset of virological tests from a sentinel program at a general hospital in San Luis Potosí, México. We interpret the results based on real and simulated data and make recommendations for future data collection strategies. Supplementary materials and software are provided online.
Submitted 28 March, 2019; v1 submitted 24 March, 2019;
originally announced March 2019.
-
Inference for stochastic kinetic models from multiple data sources for joint estimation of infection dynamics from aggregate reports and virological data
Authors:
Oksana A. Chkrebtii,
Yury E. García,
Marcos A. Capistrán,
Daniel E. Noyola
Abstract:
Before the current pandemic, influenza and respiratory syncytial virus (RSV) were the leading etiological agents of seasonal acute respiratory infections (ARI) around the world. In this setting, medical doctors typically based the diagnosis of ARI on patients' symptoms alone and did not routinely conduct virological tests necessary to identify individual viruses, limiting the ability to study the interaction between multiple pathogens and to make public health recommendations. We consider a stochastic kinetic model (SKM) for two interacting ARI pathogens circulating in a large population and an empirically-motivated background process for infections with other pathogens causing similar symptoms. An extended marginal sampling approach, based on the linear noise approximation to the SKM, integrates multiple data sources and additional model components. We infer the parameters defining the pathogens' dynamics and interaction within a Bayesian model and explore the posterior trajectories of infections for each illness based on aggregate infection reports from six epidemic seasons collected by the state health department and a subset of virological tests from a sentinel program at a general hospital in San Luis Potosí, México. We interpret the results and make recommendations for future data collection strategies.
Submitted 17 February, 2022; v1 submitted 27 October, 2017;
originally announced October 2017.
-
Transdimensional Approximate Bayesian Computation for Inference on Invasive Species Models with Latent Variables of Unknown Dimension
Authors:
Oksana A. Chkrebtii,
Erin K. Cameron,
David A. Campbell,
Erin M. Bayne
Abstract:
Accurate information on patterns of introduction and spread of non-native species is essential for making predictions and management decisions. In many cases, estimating unknown rates of introduction and spread from observed data requires evaluating intractable variable-dimensional integrals. In general, inference on the large class of models containing latent variables of large or variable dimension precludes exact sampling techniques. Approximate Bayesian computation (ABC) methods provide an alternative to exact sampling but rely on inefficient conditional simulation of the latent variables. To accomplish this task efficiently, a new transdimensional Monte Carlo sampler is developed for approximate Bayesian model inference and used to estimate rates of introduction and spread for the non-native earthworm species Dendrobaena octaedra (Savigny) along roads in the boreal forest of northern Alberta. Using low and high estimates of introduction and spread rates, the extent of earthworm invasions in northeastern Alberta was simulated to project the proportion of suitable habitat invaded in the year following data collection.
Submitted 30 December, 2014; v1 submitted 10 October, 2013;
originally announced October 2013.
-
Bayesian Solution Uncertainty Quantification for Differential Equations
Authors:
Oksana A. Chkrebtii,
David A. Campbell,
Ben Calderhead,
Mark A. Girolami
Abstract:
We explore probability modelling of discretization uncertainty for system states defined implicitly by ordinary or partial differential equations. Accounting for this uncertainty can avoid posterior under-coverage when likelihoods are constructed from a coarsely discretized approximation to system equations. A formalism is proposed for inferring a fixed but a priori unknown model trajectory through Bayesian updating of a prior process conditional on model information. A one-step-ahead sampling scheme for interrogating the model is described, its consistency and first order convergence properties are proved, and its computational complexity is shown to be proportional to that of numerical explicit one-step solvers. Examples illustrate the flexibility of this framework to deal with a wide variety of complex and large-scale systems. Within the calibration problem, discretization uncertainty defines a layer in the Bayesian hierarchy, and a Markov chain Monte Carlo algorithm that targets this posterior distribution is presented. This formalism is used for inference on the JAK-STAT delay differential equation model of protein dynamics from indirectly observed measurements. The discussion outlines implications for the new field of probabilistic numerics.
Submitted 23 October, 2016; v1 submitted 10 June, 2013;
originally announced June 2013.