-
Mechanistic models for panel data: Analysis of ecological experiments with four interacting species
Authors:
Bo Yang,
Jesse Wheeler,
Meghan A. Duffy,
Aaron A. King,
Edward L. Ionides
Abstract:
In an ecological context, panel data arise when time series measurements are made on a collection of ecological processes. Each process may correspond to a spatial location for field data, or to an experimental ecosystem in a designed experiment. Statistical models for ecological panel data should capture the high levels of nonlinearity, stochasticity, and measurement uncertainty inherent in ecological systems. Furthermore, the system dynamics may depend on unobservable variables. This study applies iterated particle filtering techniques to explore new possibilities for likelihood-based statistical analysis of these complex systems. We analyze data from a mesocosm experiment in which two species of the freshwater planktonic crustacean genus, Daphnia, coexist with an alga and a fungal parasite. Time series data were collected on replicated mesocosms under six treatment conditions. Iterated filtering enables maximization of the likelihood for scientifically motivated nonlinear partially observed Markov process models, providing access to standard likelihood-based methods for parameter estimation, confidence intervals, hypothesis testing, model selection and diagnostics. This toolbox allows scientists to propose and evaluate scientifically motivated stochastic dynamic models for panel data, constrained only by the requirement to write code to simulate from the model and to specify a measurement distribution describing how the system state is observed.
Submitted 8 June, 2025; v1 submitted 4 June, 2025;
originally announced June 2025.
-
panelPomp: Analysis of Panel Data via Partially Observed Markov Processes in R
Authors:
Carles Bretó,
Jesse Wheeler,
Aaron A. King,
Edward L. Ionides
Abstract:
Panel data arise when time series measurements are collected from multiple, dynamically independent but structurally related systems. Each system's time series can be modeled as a partially observed Markov process (POMP), and the ensemble of these models is called a PanelPOMP. If the time series are relatively short, statistical inference for each time series must draw information from across the entire panel. The component systems in the panel are called units; model parameters may be shared between units or may be unit-specific. Differences between units may be of direct inferential interest or may be a nuisance for studying the commonalities. The R package panelPomp supports analysis of panel data via a general class of PanelPOMP models. This includes a suite of tools for manipulation of models and data that take advantage of the panel structure. The panelPomp package currently highlights recent advances enabling likelihood based inference via simulation based algorithms. However, the general framework provided by panelPomp supports development of additional, new inference methodology for panel data.
Submitted 19 May, 2025; v1 submitted 10 October, 2024;
originally announced October 2024.
-
Poisson approximate likelihood compared to the particle filter
Authors:
Yize Hao,
Aaron A. Abkemeier,
Edward L. Ionides
Abstract:
Filtering algorithms are fundamental for inference on partially observed stochastic dynamic systems, since they provide access to the likelihood function and hence enable likelihood-based or Bayesian inference. A novel Poisson approximate likelihood (PAL) filter was introduced by Whitehouse et al. (2023). PAL employs a Poisson approximation to conditional densities, offering a fast approximation to the likelihood function for a certain subset of partially observed Markov process models. A central piece of evidence for PAL is the comparison in Table 1 of Whitehouse et al. (2023), which claims a large improvement for PAL over a standard particle filter algorithm. This evidence, based on a model and data from a previous scientific study by Stocks et al. (2020), might suggest that researchers confronted with similar models should use PAL rather than particle filter methods. Taken at face value, this evidence also reduces the credibility of Stocks et al. (2020) by indicating a shortcoming with the numerical methods that they used. However, we show that the comparison of log-likelihood values made by Whitehouse et al. (2023) is flawed because their PAL calculations were carried out using a dataset scaled differently from the previous study. If PAL and the particle filter are applied to the same data, the advantage claimed for PAL disappears. On simulations where the model is correctly specified, the particle filter outperforms PAL.
Submitted 18 September, 2024;
originally announced September 2024.
-
A tutorial on panel data analysis using partially observed Markov processes via the R package panelPomp
Authors:
Carles Bretó,
Jesse Wheeler,
Aaron A. King,
Edward L. Ionides
Abstract:
The R package panelPomp supports analysis of panel data via a general class of partially observed Markov process models (PanelPOMP). This package tutorial describes how the mathematical concept of a PanelPOMP is represented in the software and demonstrates typical use-cases of panelPomp. Monte Carlo methods used for POMP models require adaptation for PanelPOMP models due to the higher dimensionality of panel data. The package takes advantage of recent advances for PanelPOMP, including an iterated filtering algorithm, Monte Carlo adjusted profile methodology and block optimization methodology to assist with the large parameter spaces that can arise with panel models. In addition, tools for manipulation of models and data are provided that take advantage of the panel structure.
Submitted 5 September, 2024;
originally announced September 2024.
-
Accelerated Inference for Partially Observed Markov Processes using Automatic Differentiation
Authors:
Kevin Tan,
Giles Hooker,
Edward L. Ionides
Abstract:
Automatic differentiation (AD) has driven recent advances in machine learning, including deep neural networks and Hamiltonian Markov Chain Monte Carlo methods. Partially observed nonlinear stochastic dynamical systems have proved resistant to AD techniques because widely used particle filter algorithms yield an estimated likelihood function that is discontinuous as a function of the model parameters. We show how to embed two existing AD particle filter methods in a theoretical framework that provides an extension to a new class of algorithms. This new class permits a bias/variance tradeoff and hence a mean squared error substantially lower than the existing algorithms. We develop likelihood maximization algorithms suited to the Monte Carlo properties of the AD gradient estimate. Our algorithms require only a differentiable simulator for the latent dynamic system; by contrast, most previous approaches to AD likelihood maximization for particle filters require access to the system's transition probabilities. Numerical results indicate that a hybrid algorithm that uses AD to refine a coarse solution from an iterated filtering algorithm shows substantial improvement on current state-of-the-art methods for a challenging scientific benchmark problem.
Submitted 3 July, 2024;
originally announced July 2024.
-
Exact phylodynamic likelihood via structured Markov genealogy processes
Authors:
Aaron A. King,
Qianying Lin,
Edward L. Ionides
Abstract:
We consider genealogies arising from a Markov population process in which individuals are categorized into a discrete collection of compartments, with the requirement that individuals within the same compartment are statistically exchangeable. When equipped with a sampling process, each such population process induces a time-evolving tree-valued process defined as the genealogy of all sampled individuals. We provide a construction of this genealogy process and derive exact expressions for the likelihood of an observed genealogy in terms of filter equations. These filter equations can be numerically solved using standard Monte Carlo integration methods. Thus, we obtain statistically efficient likelihood-based inference for essentially arbitrary compartment models based on an observed genealogy of individuals sampled from the population.
Submitted 16 January, 2025; v1 submitted 27 May, 2024;
originally announced May 2024.
-
Inference on spatiotemporal dynamics for networks of biological populations
Authors:
Jifan Li,
Edward L. Ionides,
Aaron A. King,
Mercedes Pascual,
Ning Ning
Abstract:
Mathematical models in ecology and epidemiology must be consistent with observed data in order to generate reliable knowledge and evidence-based policy. Metapopulation systems, which consist of a network of connected sub-populations, pose technical challenges in statistical inference due to nonlinear, stochastic interactions. Numerical difficulties encountered in conducting inference can obstruct the core scientific questions concerning the link between the mathematical models and the data. Recently, an algorithm has been developed which enables effective likelihood-based inference for the high-dimensional partially observed stochastic dynamic models arising in metapopulation systems. The COVID-19 pandemic provides a situation where mathematical models and their policy implications were widely visible, and we use the new inferential technology to revisit an influential metapopulation model used to inform basic epidemiological understanding early in the pandemic. Our methods support self-critical data analysis, enabling us to identify and address model limitations, and leading to a new model with substantially improved statistical fit and parameter identifiability. Our results suggest that the lockdown initiated on January 23, 2020 in China was more effective than previously thought. We proceed to recommend statistical analysis standards for future metapopulation system modeling.
Submitted 6 February, 2024; v1 submitted 11 November, 2023;
originally announced November 2023.
-
Likelihood Based Inference for ARMA Models
Authors:
Jesse Wheeler,
Edward L. Ionides
Abstract:
Autoregressive moving average (ARMA) models are widely used for analyzing time series data. However, standard likelihood-based inference methodology for ARMA models has avoidable limitations. We show that common ARMA likelihood maximization strategies often lead to sub-optimal parameter estimates. While this possibility has been previously identified, no routinely applicable algorithm has been developed to resolve the issue. We introduce a novel random initialization algorithm, designed to take advantage of the structure of the ARMA likelihood function, which overcomes these optimization problems. Additionally, we show that profile confidence intervals provide superior confidence intervals to those based on the Fisher information matrix. The efficacy of the proposed methodology is demonstrated through a data analysis example and a series of simulation studies. This work makes a significant contribution to statistical practice by identifying and resolving under-recognized shortcomings of existing procedures that frequently arise in scientific and industrial applications.
Submitted 4 December, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Informing policy via dynamic models: Cholera in Haiti
Authors:
Jesse Wheeler,
AnnaElaine Rosengart,
Zhuoxun Jiang,
Kevin Tan,
Noah Treutle,
Edward Ionides
Abstract:
Public health decisions must be made about when and how to implement interventions to control an infectious disease epidemic. These decisions should be informed by data on the epidemic as well as current understanding about the transmission dynamics. Such decisions can be posed as statistical questions about scientifically motivated dynamic models. Thus, we encounter the methodological task of building credible, data-informed decisions based on stochastic, partially observed, nonlinear dynamic models. This necessitates addressing the tradeoff between biological fidelity and model simplicity, and the reality of misspecification for models at all levels of complexity. We assess current methodological approaches to these issues via a case study of the 2010-2019 cholera epidemic in Haiti. We consider three dynamic models developed by expert teams to advise on vaccination policies. We evaluate previous methods used for fitting these models, and we demonstrate modified data analysis strategies leading to improved statistical fit. Specifically, we present approaches for diagnosing model misspecification and the consequent development of improved models. Additionally, we demonstrate the utility of recent advances in likelihood maximization for high-dimensional nonlinear dynamic models, enabling likelihood-based inference for spatiotemporal incidence data using this class of models. Our workflow is reproducible and extendable, facilitating future investigations of this disease system.
Submitted 4 March, 2024; v1 submitted 21 January, 2023;
originally announced January 2023.
-
An iterated block particle filter for inference on coupled dynamic systems with shared and unit-specific parameters
Authors:
Edward L. Ionides,
Ning Ning,
Jesse Wheeler
Abstract:
We consider inference for a collection of partially observed, stochastic, interacting, nonlinear dynamic processes. Each process is identified with a label called its unit, and our primary motivation arises in biological metapopulation systems where a unit corresponds to a spatially distinct sub-population. Metapopulation systems are characterized by strong dependence through time within a single unit and relatively weak interactions between units, and these properties make block particle filters an effective tool for simulation-based likelihood evaluation. Iterated filtering algorithms can facilitate likelihood maximization for simulation-based filters. We introduce an iterated block particle filter applicable when parameters are unit-specific or shared between units. We demonstrate this algorithm by performing inference on a coupled epidemiological model describing spatiotemporal measles case report data for twenty towns.
Submitted 19 December, 2022; v1 submitted 8 June, 2022;
originally announced June 2022.
-
Iterated Block Particle Filter for High-dimensional Parameter Learning: Beating the Curse of Dimensionality
Authors:
Ning Ning,
Edward L. Ionides
Abstract:
Parameter learning for high-dimensional, partially observed, and nonlinear stochastic processes is a methodological challenge. Spatiotemporal disease transmission systems provide examples of such processes giving rise to open inference problems. We propose the iterated block particle filter (IBPF) algorithm for learning high-dimensional parameters over graphical state space models with general state spaces, measures, transition densities and graph structure. Theoretical performance guarantees are obtained on beating the curse of dimensionality (COD), algorithm convergence, and likelihood maximization. Experiments on a highly nonlinear and non-Gaussian spatiotemporal model for measles transmission reveal that the iterated ensemble Kalman filter algorithm (Li et al. (2020)) is ineffective and the iterated filtering algorithm (Ionides et al. (2015)) suffers from the COD, while our IBPF algorithm beats COD consistently across various experiments with different metrics.
Submitted 4 April, 2023; v1 submitted 20 October, 2021;
originally announced October 2021.
-
Systemic Infinitesimal Over-dispersion on Graphical Dynamic Models
Authors:
Ning Ning,
Edward L. Ionides
Abstract:
Stochastic models for collections of interacting populations have crucial roles in scientific fields such as epidemiology and ecology, yet the standard approach to extending an ordinary differential equation model to a Markov chain does not have sufficient flexibility in the mean-variance relationship to match data. To handle that, over-dispersed Markov chains have previously been constructed using gamma white noise on the rates. We develop new approaches using Dirichlet noise to construct collections of independent or dependent noise processes. This permits the modeling of high-frequency variation in transition rates both within and between the populations under study. Our theory is developed in a general framework of time-inhomogeneous Markov processes equipped with a graphical structure, for which ecological and epidemiological models provide motivating examples. We demonstrate our approach on a widely analyzed measles dataset, adding Dirichlet noise to a classical SEIR (Susceptible-Exposed-Infected-Recovered) model. Our methodology shows improved statistical fit measured by log-likelihood and provides new insights into the dynamics of this biological system.
Submitted 3 October, 2022; v1 submitted 18 June, 2021;
originally announced June 2021.
-
Markov Genealogy Processes
Authors:
Aaron A. King,
Qianying Lin,
Edward L. Ionides
Abstract:
We construct a family of genealogy-valued Markov processes that are induced by a continuous-time Markov population process. We derive exact expressions for the likelihood of a given genealogy conditional on the history of the underlying population process. These lead to a nonlinear filtering equation which can be used to design efficient Monte Carlo inference algorithms. We demonstrate these calculations with several examples. Existing full-information approaches for phylodynamic inference are special cases of the theory.
Submitted 24 January, 2022; v1 submitted 26 May, 2021;
originally announced May 2021.
-
A tutorial on spatiotemporal partially observed Markov process models via the R package spatPomp
Authors:
Kidus Asfaw,
Joonha Park,
Aaron A. King,
Edward L. Ionides
Abstract:
We describe a computational framework for modeling and statistical inference on high-dimensional stochastic dynamic systems. Our primary motivation is the investigation of metapopulation dynamics arising from a collection of spatially distributed, interacting biological populations. To make progress on this goal, we embed it in a more general problem: inference for a collection of interacting partially observed nonlinear non-Gaussian stochastic processes. Each process in the collection is called a unit; in the case of spatiotemporal models, the units correspond to distinct spatial locations. The dynamic state for each unit may be discrete or continuous, scalar or vector valued. In metapopulation applications, the state can represent a structured population or the abundances of a collection of species at a single location. We consider models where the collection of states has a Markov property. A sequence of noisy measurements is made on each unit, resulting in a collection of time series. A model of this form is called a spatiotemporal partially observed Markov process (SpatPOMP). The R package spatPomp provides an environment for implementing SpatPOMP models, analyzing data using existing methods, and developing new inference approaches. Our presentation of spatPomp reviews various methodologies in a unifying notational framework. We demonstrate the package on a simple Gaussian system and on a nontrivial epidemiological model for measles transmission within and between cities. We show how to construct user-specified SpatPOMP models within spatPomp.
Submitted 18 April, 2024; v1 submitted 4 January, 2021;
originally announced January 2021.
-
Scalable Monte Carlo Inference and Rescaled Local Asymptotic Normality
Authors:
Ning Ning,
Edward Ionides,
Ya'acov Ritov
Abstract:
In this paper, we generalize the property of local asymptotic normality (LAN) to an enlarged neighborhood, under the name of rescaled local asymptotic normality (RLAN). We obtain sufficient conditions for a regular parametric model to satisfy RLAN. We show that RLAN supports the construction of a statistically efficient estimator which maximizes a cubic approximation to the log-likelihood on this enlarged neighborhood. In the context of Monte Carlo inference, we find that this maximum cubic likelihood estimator can maintain its statistical efficiency in the presence of asymptotically increasing Monte Carlo error in likelihood evaluation.
Submitted 30 November, 2020; v1 submitted 1 July, 2020;
originally announced July 2020.
-
The Sampled Moran Genealogy Process
Authors:
Aaron A. King,
Qianying Lin,
Edward L. Ionides
Abstract:
We define the Sampled Moran Genealogy Process, a continuous-time Markov process on the space of genealogies with the demography of the classical Moran process, sampled through time. To do so, we begin by defining the Moran Genealogy Process using a novel representation. We then extend this process to include sampling through time. We derive exact conditional and marginal probability distributions for the sampled process under a stationarity assumption, and an exact expression for the likelihood of any sequence of genealogies it generates. This leads to some interesting observations pertinent to existing phylodynamic methods in the literature. Throughout, our proofs are original and make use of strictly forward-in-time calculations and are exact for all population sizes and sampling processes.
Submitted 19 October, 2020; v1 submitted 25 February, 2020;
originally announced February 2020.
-
Bagged filters for partially observed interacting systems
Authors:
Edward L. Ionides,
Kidus Asfaw,
Joonha Park,
Aaron A. King
Abstract:
Bagging (i.e., bootstrap aggregating) involves combining an ensemble of bootstrap estimators. We consider bagging for inference from noisy or incomplete measurements on a collection of interacting stochastic dynamic systems. Each system is called a unit, and each unit is associated with a spatial location. A motivating example arises in epidemiology, where each unit is a city: the majority of transmission occurs within a city, with smaller yet epidemiologically important interactions arising from disease transmission between cities. Monte Carlo filtering methods used for inference on nonlinear non-Gaussian systems can suffer from a curse of dimensionality as the number of units increases. We introduce bagged filter (BF) methodology which combines an ensemble of Monte Carlo filters, using spatiotemporally localized weights to select successful filters at each unit and time. We obtain conditions under which likelihood evaluation using a BF algorithm can beat a curse of dimensionality, and we demonstrate applicability even when these conditions do not hold. BF can out-perform an ensemble Kalman filter on a coupled population dynamics model describing infectious disease transmission. A block particle filter also performs well on this task, though the bagged filter respects smoothness and conservation laws that a block particle filter can violate.
Submitted 28 June, 2021; v1 submitted 12 February, 2020;
originally announced February 2020.
-
Panel data analysis via mechanistic models
Authors:
Carles Bretó,
Edward L. Ionides,
Aaron A. King
Abstract:
Panel data, also known as longitudinal data, consist of a collection of time series. Each time series, which could itself be multivariate, comprises a sequence of measurements taken on a distinct unit. Mechanistic modeling involves writing down scientifically motivated equations describing the collection of dynamic systems giving rise to the observations on each unit. A defining characteristic of panel systems is that the dynamic interaction between units should be negligible. Panel models therefore consist of a collection of independent stochastic processes, generally linked through shared parameters while also having unit-specific parameters. To give the scientist flexibility in model specification, we are motivated to develop a framework for inference on panel data permitting the consideration of arbitrary nonlinear, partially observed panel models. We build on iterated filtering techniques that provide likelihood-based inference on nonlinear partially observed Markov process models for time series data. Our methodology depends on the latent Markov process only through simulation; this plug-and-play property ensures applicability to a large class of models. We demonstrate our methodology on a toy example and two epidemiological case studies. We address inferential and computational issues arising due to the combination of model complexity and dataset size.
Submitted 25 March, 2019; v1 submitted 17 January, 2018;
originally announced January 2018.
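Illustrative sketch (not taken from the paper above): the plug-and-play property and the independence of panel units described in the preceding abstract can be shown with a basic bootstrap particle filter. The model here is an assumption chosen only for illustration: a latent AR(1) process with a shared coefficient `theta` and a unit-variance Gaussian measurement. All function names are hypothetical and do not correspond to the panelPomp or pomp APIs. The filter touches the latent process only through `simulate_step`, and the panel log-likelihood is the sum of the unit log-likelihoods.

```python
import math
import random


def simulate_step(x, theta, rng):
    # Hypothetical latent dynamics (assumption): AR(1) with shared coefficient theta.
    return theta * x + rng.gauss(0.0, 1.0)


def measurement_density(y, x):
    # Hypothetical measurement model (assumption): y ~ Normal(x, 1).
    return math.exp(-0.5 * (y - x) ** 2) / math.sqrt(2.0 * math.pi)


def particle_filter_loglik(ys, theta, n_particles, rng):
    """Bootstrap particle filter estimate of the log-likelihood for one unit."""
    # Initial particles drawn from an assumed Normal(0, 1) initial distribution.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    loglik = 0.0
    for y in ys:
        # Propagate with the simulator only; no transition density is evaluated.
        particles = [simulate_step(x, theta, rng) for x in particles]
        weights = [measurement_density(y, x) for x in particles]
        mean_weight = sum(weights) / n_particles
        if mean_weight == 0.0:
            return -math.inf  # all particles are incompatible with the observation
        # The average weight estimates the conditional likelihood of y.
        loglik += math.log(mean_weight)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return loglik


def panel_loglik(panel, theta, n_particles=500, seed=1):
    """Sum unit log-likelihoods; units are independent given the shared theta."""
    rng = random.Random(seed)
    return sum(particle_filter_loglik(ys, theta, n_particles, rng) for ys in panel)


if __name__ == "__main__":
    # Two short made-up time series, one per unit.
    panel = [[0.1, -0.3, 0.5], [1.0, 0.8]]
    print(panel_loglik(panel, theta=0.5))
```

The estimate is a Monte Carlo quantity, so it varies with the seed and the number of particles. Maximizing it over `theta` is the task that iterated filtering addresses, and that step is not shown here.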
-
Inference on high-dimensional implicit dynamic models using a guided intermediate resampling filter
Authors:
Joonha Park,
Edward L. Ionides
Abstract:
We propose a method for inference on moderately high-dimensional, nonlinear, non-Gaussian, partially observed Markov process models for which the transition density is not analytically tractable. Markov processes with intractable transition densities arise in models defined implicitly by simulation algorithms. Widely used particle filter methods are applicable to nonlinear, non-Gaussian models but suffer from the curse of dimensionality. Improved scalability is provided by ensemble Kalman filter methods, but these are inappropriate for highly nonlinear and non-Gaussian models. We propose a particle filter method having improved practical and theoretical scalability with respect to the model dimension. This method is applicable to implicitly defined models having analytically intractable transition densities. Our method is developed based on the assumption that the latent process is defined in continuous time and that a simulator of this latent process is available. In this method, particles are propagated at intermediate time intervals between observations and are resampled based on a forecast likelihood of future observations. We combine this particle filter with parameter estimation methodology to enable likelihood-based inference for highly nonlinear spatiotemporal systems. We demonstrate our methodology on a stochastic Lorenz 96 model and a model for the population dynamics of infectious diseases in a network of linked regions.
Submitted 31 March, 2020; v1 submitted 28 August, 2017;
originally announced August 2017.
-
Monte Carlo profile confidence intervals
Authors:
Edward L. Ionides,
Carles Breto,
Joonha Park,
Richard A. Smith,
Aaron A. King
Abstract:
Monte Carlo methods to evaluate and maximize the likelihood function enable the construction of confidence intervals and hypothesis tests, facilitating scientific investigation using models for which the likelihood function is intractable. When Monte Carlo error can be made small, by sufficiently exhaustive computation, then the standard theory and practice of likelihood-based inference applies. As data become larger, and models more complex, situations arise where no reasonable amount of computation can render Monte Carlo error negligible. We develop profile likelihood methodology to provide frequentist inferences that take into account Monte Carlo uncertainty. We investigate the role of this methodology in facilitating inference for computationally challenging dynamic latent variable models. We present three examples arising in the study of infectious disease transmission. These three examples demonstrate our methodology for inference on nonlinear dynamic models using genetic sequence data, panel time series data, and spatiotemporal data. We also discuss applicability to nonlinear time series analysis.
Submitted 9 February, 2017; v1 submitted 8 December, 2016;
originally announced December 2016.
-
Statistical Inference for Partially Observed Markov Processes via the R Package pomp
Authors:
Aaron A. King,
Dao Nguyen,
Edward L. Ionides
Abstract:
Partially observed Markov process (POMP) models, also known as hidden Markov models or state space models, are ubiquitous tools for time series analysis. The R package pomp provides a very flexible framework for Monte Carlo statistical investigations using nonlinear, non-Gaussian POMP models. A range of modern statistical methods for POMP models have been implemented in this framework including sequential Monte Carlo, iterated filtering, particle Markov chain Monte Carlo, approximate Bayesian computation, maximum synthetic likelihood estimation, nonlinear forecasting, and trajectory matching. In this paper, we demonstrate the application of these methodologies using some simple toy problems. We also illustrate the specification of more complex POMP models, using a nonlinear epidemiological model with a discrete population, seasonality, and extra-demographic stochasticity. We discuss the specification of user-defined models and the development of additional methods within the programming environment provided by pomp.
Submitted 21 October, 2015; v1 submitted 1 September, 2015;
originally announced September 2015.
-
Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H. Tong
Authors:
Edward L. Ionides
Abstract:
Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H. Tong [arXiv:1104.3073]
Submitted 6 January, 2012;
originally announced January 2012.
-
Macroeconomic effects on mortality revealed by panel analysis with nonlinear trends
Authors:
Edward L. Ionides,
Zhen Wang,
José A. Tapia Granados
Abstract:
Many investigations have used panel methods to study the relationships between fluctuations in economic activity and mortality. A broad consensus has emerged on the overall procyclical nature of mortality: perhaps counter-intuitively, mortality typically rises above its trend during expansions. This consensus has been tarnished by inconsistent reports on the specific age groups and mortality causes involved. We show that these inconsistencies result, in part, from the trend specifications used in previous panel models. Standard econometric panel analysis involves fitting regression models using ordinary least squares, employing standard errors which are robust to temporal autocorrelation. The model specifications include a fixed effect, and possibly a linear trend, for each time series in the panel. We propose alternative methodology based on nonlinear detrending. Applying our methodology to data for the 50 US states from 1980 to 2006, we obtain more precise and consistent results than previous studies. We find procyclical mortality in all age groups. We find clear procyclical mortality due to respiratory disease and traffic injuries. Predominantly procyclical cardiovascular disease mortality and countercyclical suicide are subject to substantial state-to-state variation. Neither cancer nor homicide has a significant macroeconomic association.
Submitted 28 November, 2013; v1 submitted 24 October, 2011;
originally announced October 2011.