-
Model-Based Clustering with Sequential Outlier Identification using the Distribution of Mahalanobis Distances
Authors:
Ultán P. Doherty,
Paul D. McNicholas,
Arthur White
Abstract:
The presence of outliers can prevent clustering algorithms from accurately determining an appropriate group structure within a data set. We present outlierMBC, a model-based approach for sequentially removing outliers and clustering the remaining observations. Our method identifies outliers one at a time while fitting a multivariate Gaussian mixture model to data. Since it can be difficult to classify observations as outliers without knowing what the correct cluster structure is a priori, and the presence of outliers interferes with the process of modelling clusters correctly, we use an iterative method to identify outliers one by one. At each iteration, outlierMBC removes the observation with the lowest density and fits a Gaussian mixture model to the remaining data. The method continues to remove potential outliers until a pre-set maximum number of outliers is reached, then retrospectively identifies the optimal number of outliers. To decide how many outliers to remove, it uses the fact that the squared sample Mahalanobis distances of Gaussian distributed observations are Beta distributed when scaled appropriately. outlierMBC chooses the number of outliers which minimises a dissimilarity between this theoretical Beta distribution and the observed distribution of the scaled squared sample Mahalanobis distances. This means that our method both clusters the data using a Gaussian mixture model and implements a model-based procedure to identify the optimal outliers to remove without requiring the number of outliers to be pre-specified. Unlike leading methods in the literature, outlierMBC does not assume that the outliers follow a known distribution or that the number of outliers can be pre-specified. Moreover, outlierMBC performs strongly compared to these algorithms when applied to a range of simulated and real data sets.
Submitted 16 May, 2025;
originally announced May 2025.
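The outlierMBC abstract above relies on the fact that scaled squared sample Mahalanobis distances of Gaussian data are Beta distributed. The sketch below illustrates that fact for a single two-dimensional Gaussian sample, using only the standard library. It is not the authors' implementation: the function names, the sample size and the single-component setting are assumptions made for illustration, and the scaling n/(n-1)^2 with Beta(p/2, (n-p-1)/2) is the classical single-sample result rather than anything taken from the paper's code.

```python
import random


def scaled_sq_mahalanobis_2d(points):
    """Return n/(n-1)^2 * d_i^2 for each 2-D point, where d_i is the
    Mahalanobis distance to the sample mean under the sample covariance."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Unbiased sample covariance entries (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    scale = n / (n - 1) ** 2
    scaled = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the explicit inverse of the 2x2 covariance.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        scaled.append(scale * d2)
    return scaled


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


rng = random.Random(0)  # fixed seed, chosen arbitrarily
n, p = 200, 2
sample = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 2.0)) for _ in range(n)]
scaled = scaled_sq_mahalanobis_2d(sample)

# Reference distribution for the scaled distances: Beta(p/2, (n - p - 1)/2).
reference_mean = beta_mean(p / 2, (n - p - 1) / 2)
observed_mean = sum(scaled) / n
```

Every scaled distance lies in [0, 1], and the observed mean matches the Beta mean p/(n-1) exactly (up to rounding), because the squared distances always sum to p(n-1). A dissimilarity between the empirical distribution of `scaled` and the reference Beta distribution is the kind of quantity the abstract describes minimising over the number of removed outliers.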
-
Model-based calibration of gear-specific fish abundance survey data as a change-of-support problem
Authors:
Grace S. Chiu,
Anton H. Westveld,
Mark A. Albins,
Kevin M. Boswell,
John M. Hoenig,
Sean P. Powers,
S. Lynne Stokes,
Allison L. White
Abstract:
In a continental-scale fish abundance study, a major challenge in deriving an absolute abundance estimate lies in the fact that regional surveys deploy different gear types, each with its unique field of view, producing gear-specific relative abundance data. Thus, data from regional surveys in the study must be converted from the gear-specific relative scale to an absolute scale before being combined to estimate a continental scale absolute abundance. In this paper, we develop a tool that takes gear-based data as input, and produces as output the required conversion, with associated uncertainty. Methodologically, this tool is operationalized from a Bayesian hierarchical model which we develop in an inferential context that is akin to the change-of-support problem often encountered in spatial studies; the actual context here is to reconcile abundance data at various gear-specific scales, some being relative, and others, absolute. We consider data from a small-scale calibration experiment in which 2 to 4 underwater video camera types, as well as an acoustic echosounder, were simultaneously deployed on each of 21 boat trips. While acoustic fish signals are recorded along transects on the absolute scale, they are subject to confounding from acoustically similar species, thus requiring an externally derived correction factor. Conversely, a camera allows visual distinction between species but records data on a gear-specific relative scale. Our statistical modeling framework reflects the relationship among all 5 gear types across the 21 trips, and the resulting model is used to derive calibration formulae to translate relative abundance data to the corrected absolute abundance scale whenever a camera is deployed alone. Cross-validation is conducted using mark-recapture abundance estimates. We also briefly discuss the case when one camera type is deployed alongside the echosounder.
Submitted 9 May, 2025;
originally announced May 2025.
-
Projected Neural Differential Equations for Learning Constrained Dynamics
Authors:
Alistair White,
Anna Büttner,
Maximilian Gelbrecht,
Valentin Duruisseaux,
Niki Kilbertus,
Frank Hellmann,
Niklas Boers
Abstract:
Neural differential equations offer a powerful approach for learning dynamics from data. However, they do not impose known constraints that should be obeyed by the learned model. It is well-known that enforcing constraints in surrogate models can enhance their generalizability and numerical stability. In this paper, we introduce projected neural differential equations (PNDEs), a new method for constraining neural differential equations based on projection of the learned vector field to the tangent space of the constraint manifold. In tests on several challenging examples, including chaotic dynamical systems and state-of-the-art power grid models, PNDEs outperform existing methods while requiring fewer hyperparameters. The proposed approach demonstrates significant potential for enhancing the modeling of constrained dynamical systems, particularly in complex domains where accuracy and reliability are essential.
Submitted 31 October, 2024;
originally announced October 2024.
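The PNDE abstract above describes projecting a learned vector field onto the tangent space of the constraint manifold. A minimal sketch of that projection for a single scalar constraint g(x) = 0 follows. The unit-circle constraint, the function names and the hand-written `raw_field` (a stand-in for a learned network) are all hypothetical choices for illustration, not the authors' code.

```python
import math


def project_to_tangent(x, v, grad_g):
    """Project v onto {w : grad_g(x) . w = 0}, the tangent space at x of
    the manifold defined by one scalar constraint g(x) = 0."""
    normal = grad_g(x)
    norm_sq = sum(ni * ni for ni in normal)
    coeff = sum(ni * vi for ni, vi in zip(normal, v)) / norm_sq
    # Subtract the component of v along the constraint gradient.
    return [vi - coeff * ni for vi, ni in zip(v, normal)]


def circle_grad(x):
    """Gradient of the assumed constraint g(x) = x0^2 + x1^2 - 1."""
    return [2.0 * x[0], 2.0 * x[1]]


def raw_field(x):
    """Hypothetical unconstrained field: a rotation plus an outward drift
    that would push trajectories off the unit circle."""
    return [-x[1] + 0.3 * x[0], x[0] + 0.3 * x[1]]


x = [math.cos(0.7), math.sin(0.7)]  # a point on the unit circle
v_proj = project_to_tangent(x, raw_field(x), circle_grad)
```

At a point on the circle the projection removes the radial drift and leaves only the rotational part `[-x[1], x[0]]`, so a trajectory integrated along the projected field has no first-order tendency to leave the manifold.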
-
Position: Benchmarking is Limited in Reinforcement Learning Research
Authors:
Scott M. Jordan,
Adam White,
Bruno Castro da Silva,
Martha White,
Philip S. Thomas
Abstract:
Novel reinforcement learning algorithms, or improvements on existing ones, are commonly justified by evaluating their performance on benchmark environments and are compared to an ever-changing set of standard algorithms. However, despite numerous calls for improvements, experimental practices continue to produce misleading or unsupported claims. One reason for the ongoing substandard practices is that conducting rigorous benchmarking experiments requires substantial computational time. This work investigates the sources of increased computation costs in rigorous experiment designs. We show that conducting rigorous performance benchmarks will likely have computational costs that are often prohibitive. As a result, we argue for using an additional experimentation paradigm to overcome the limitations of benchmarking.
Submitted 23 June, 2024;
originally announced June 2024.
-
Extrapolation of Relative Treatment Effects using Change-point Survival Models
Authors:
Philip Cooney,
Arthur White
Abstract:
Introduction: Modelling of relative treatment effects is an important aspect to consider when extrapolating the long-term survival outcomes of treatments. Flexible parametric models offer the ability to accurately model the observed data; however, the extrapolated relative treatment effects and subsequent survival function may lack face validity. Methods: We investigate the ability of change-point survival models to estimate changes in the relative treatment effects, specifically treatment delay, loss of treatment effects and converging hazards. These models are implemented using standard Bayesian statistical software and propagate the uncertainty associated with all model parameters, including the change-point location. A simulation study was conducted to assess the predictive performance of these models compared with other parametric survival models. Change-point survival models were applied to three datasets, two of which were used in previous health technology assessments. Results: Change-point survival models typically provided improved extrapolated survival predictions, particularly when the changes in relative treatment effects are large. When applied to the real-world examples they provided good fit to the observed data and in some situations produced more clinically plausible extrapolations than those generated by flexible spline models. Change-point models also provided support to a previously implemented modelling approach which was justified by visual inspection only and not goodness of fit to the observed data. Conclusions: We believe change-point survival models offer the ability to flexibly model observed data while also modelling and investigating clinically plausible scenarios with respect to the relative treatment effects.
Submitted 31 December, 2023;
originally announced January 2024.
-
Modeling Supply and Demand in Public Transportation Systems
Authors:
Miranda Bihler,
Hala Nelson,
Erin Okey,
Noe Reyes Rivas,
John Webb,
Anna White
Abstract:
We propose two neural-network-based, data-driven supply and demand models to analyze the efficiency, identify service gaps, and determine the significant predictors of demand in the bus system of the Department of Public Transportation (HDPT) in Harrisonburg City, Virginia, which is home to James Madison University (JMU). The supply and demand models, one temporal and one spatial, take many variables into account, including the demographic data surrounding the bus stops, the metrics that the HDPT reports to the federal government, and the drastic change in population between when JMU is in session and out of session. These direct and data-driven models to quantify supply and demand and identify service gaps can generalize to other cities' bus systems.
Submitted 20 October, 2023; v1 submitted 12 September, 2023;
originally announced September 2023.
-
Stabilized Neural Differential Equations for Learning Dynamics with Explicit Constraints
Authors:
Alistair White,
Niki Kilbertus,
Maximilian Gelbrecht,
Niklas Boers
Abstract:
Many successful methods to learn dynamical systems from data have recently been introduced. However, ensuring that the inferred dynamics preserve known constraints, such as conservation laws or restrictions on the allowed system states, remains challenging. We propose stabilized neural differential equations (SNDEs), a method to enforce arbitrary manifold constraints for neural differential equations. Our approach is based on a stabilization term that, when added to the original dynamics, renders the constraint manifold provably asymptotically stable. Due to its simplicity, our method is compatible with all common neural differential equation (NDE) models and broadly applicable. In extensive empirical evaluations, we demonstrate that SNDEs outperform existing methods while broadening the types of constraints that can be incorporated into NDE training.
Submitted 15 February, 2024; v1 submitted 16 June, 2023;
originally announced June 2023.
-
Active Learning in Symbolic Regression with Physical Constraints
Authors:
Jorge Medina,
Andrew D. White
Abstract:
Evolutionary symbolic regression (SR) fits a symbolic equation to data, which gives a concise interpretable model. We explore using SR as a method to propose which data to gather in an active learning setting with physical constraints. SR with active learning proposes which experiments to do next. Active learning is done with query by committee, where the Pareto frontier of equations is the committee. The physical constraints improve proposed equations in very low data settings. These approaches reduce the data required for SR and achieve state-of-the-art results in the data required to rediscover known equations.
Submitted 9 August, 2024; v1 submitted 17 May, 2023;
originally announced May 2023.
-
ChemCrow: Augmenting large-language models with chemistry tools
Authors:
Andres M Bran,
Sam Cox,
Oliver Schilter,
Carlo Baldassari,
Andrew D White,
Philippe Schwaller
Abstract:
Over the last decades, excellent computational chemistry tools have been developed. Integrating them into a single platform with enhanced accessibility could help them reach their full potential by overcoming steep learning curves. Recently, large-language models (LLMs) have shown strong performance in tasks across domains, but struggle with chemistry-related problems. Moreover, these models lack access to external knowledge sources, limiting their usefulness in scientific applications. In this study, we introduce ChemCrow, an LLM chemistry agent designed to accomplish tasks across organic synthesis, drug discovery, and materials design. By integrating 18 expert-designed tools, ChemCrow augments the LLM performance in chemistry, and new capabilities emerge. Our agent autonomously planned and executed the syntheses of an insect repellent and three organocatalysts, and guided the discovery of a novel chromophore. Our evaluation, including both LLM and expert assessments, demonstrates ChemCrow's effectiveness in automating a diverse set of chemical tasks. Surprisingly, we find that GPT-4 as an evaluator cannot distinguish between clearly wrong GPT-4 completions and ChemCrow's performance. Our work not only aids expert chemists and lowers barriers for non-experts, but also fosters scientific advancement by bridging the gap between experimental and computational chemistry.
Submitted 2 October, 2023; v1 submitted 11 April, 2023;
originally announced April 2023.
-
Pointwise density estimation on metric spaces and applications in seismology
Authors:
Galatia Cleanthous,
Athanasios G. Georgiadis,
Philip A. White
Abstract:
We are studying the problem of estimating density in a wide range of metric spaces, including the Euclidean space, the sphere, the ball, and various Riemannian manifolds. Our framework involves a metric space with a doubling measure and a self-adjoint operator, whose heat kernel exhibits Gaussian behaviour. We begin by reviewing the construction of kernel density estimators and the related background information. As a novel result, we present a pointwise kernel density estimation for probability density functions that belong to general Hölder spaces. The study is accompanied by an application in Seismology. Precisely, we analyze a globally-indexed dataset of earthquake occurrence and compare the out-of-sample performance of several approximated kernel density estimators indexed on the sphere.
Submitted 31 March, 2023;
originally announced April 2023.
-
Incorporating Expert Opinion on Observable Quantities into Statistical Models -- A General Framework
Authors:
Philip Cooney,
Arthur White
Abstract:
This article describes an approach to incorporate expert opinion on observable quantities through the use of a loss function which updates a prior belief as opposed to specifying parameters on the priors. Eliciting information on observable quantities allows experts to provide meaningful information on a quantity familiar to them, in contrast to elicitation on model parameters, which may be subject to interactions with other parameters or non-linear transformations before obtaining an observable quantity. The approach to incorporating expert opinion described in this paper is distinctive in that we do not specify a prior to match an expert's opinion on an observable quantity; rather, we obtain a posterior by updating the model parameters through a loss function. This loss function contains the observable quantity, expressed as a function of the parameters, and is related to the expert's opinion, which is typically operationalized as a statistical distribution. Parameters which generate observable quantities which are further from the expert's opinion incur a higher loss, allowing for the model parameters to be estimated based on their fidelity to both the data and expert opinion, with the relative strength determined by the number of observations and precision of the elicited belief. Including expert opinion in this fashion allows for a flexible specification of the opinion and in many situations is straightforward to implement with commonly used probabilistic programming software. We highlight this using three worked examples of varying model complexity including survival models, a multivariate normal distribution and a regression problem.
Submitted 10 February, 2023;
originally announced February 2023.
-
Joint Multivariate and Functional Modeling for Plant Traits and Reflectances
Authors:
Philip A. White,
Michael F. Christensen,
Henry Frye,
Alan E. Gelfand,
John A. Silander Jr
Abstract:
The investigation of leaf-level traits in response to varying environmental conditions has immense importance for understanding plant ecology. Remote sensing technology enables measurement of the reflectance of plants to make inferences about underlying traits along environmental gradients. While much focus has been placed on understanding how reflectance and traits are related at the leaf-level, the challenge of modelling the dependence of this relationship along environmental gradients has limited this line of inquiry. Here, we take up the problem of jointly modeling traits and reflectance given environment. Our objective is to assess not only response to environmental regressors but also dependence between trait levels and the reflectance spectrum in the context of this regression. This leads to joint modeling of a response vector of traits with reflectance arising as a functional response over the wavelength spectrum. To conduct this investigation, we employ a dataset from a global biodiversity hotspot, the Greater Cape Floristic Region in South Africa.
Submitted 1 October, 2022;
originally announced October 2022.
-
Nonseparable Space-Time Stationary Covariance Functions on Networks cross Time
Authors:
Emilio Porcu,
Philip A. White,
Marc G. Genton
Abstract:
The advent of data science has provided an increasing number of challenges with high data complexity. This paper addresses the challenge of space-time data where the spatial domain is not a planar surface, a sphere, or a linear network, but a generalized network (termed a graph with Euclidean edges). Additionally, data are repeatedly measured over different temporal instants. We provide new classes of nonseparable space-time stationary covariance functions where space can be a generalized network, a Euclidean tree, or a linear network, and where time can be linear or circular (seasonal). Because the construction principles are technical, we focus on illustrations that guide the reader through the construction of statistically interpretable examples. A simulation study demonstrates that we can recover the correct model when compared to misspecified models. In addition, our simulation studies show that we effectively recover simulation parameters. In our data analysis, we consider a traffic accident dataset that shows improved model performance based on covariance specifications and network-based metrics.
Submitted 5 August, 2022;
originally announced August 2022.
-
Change-point Detection for Piecewise Exponential Models
Authors:
Philip Cooney,
Arthur White
Abstract:
In decision modelling with time to event data, parametric models are often used to extrapolate the survivor function. One such model is the piecewise exponential model, whereby the hazard function is partitioned into segments, with the hazard constant within each segment and independent between segments; the boundaries of these segments are known as change-points. We present an approach for determining the location and number of change-points in piecewise exponential models. Inference is performed in a Bayesian framework using Markov Chain Monte Carlo (MCMC) where the model parameters can be integrated out of the model and the number of change-points can be sampled as part of the MCMC scheme. We can estimate both the uncertainty in the change-point locations and hazards for a given change-point model and obtain a probabilistic interpretation for the number of change-points. We evaluate model performance in determining change-point numbers and locations in a simulation study and show the utility of the method using two data sets for time to event data. In a data set of Glioblastoma patients we use the piecewise exponential model to describe the general trends in the hazard function. In a data set of heart transplant patients, we show the piecewise exponential model produces the best statistical fit and extrapolation amongst other standard parametric models. Piecewise exponential models may be useful for survival extrapolation if a long-term constant hazard trend is clinically plausible. A key advantage of this method is that the number and locations of the change-points are automatically estimated rather than specified by the analyst.
Submitted 7 December, 2021;
originally announced December 2021.
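The piecewise exponential model in the abstract above has a hazard that is constant between change-points, so its survivor function is the exponential of minus the accumulated hazard. The sketch below evaluates that survivor function for fixed change-points and hazards. The function name and the example values are hypothetical, and the Bayesian estimation of the change-points described in the abstract is not attempted here.

```python
import math


def piecewise_exp_survival(t, change_points, hazards):
    """Survivor function S(t) = exp(-H(t)) for a piecewise exponential model.

    change_points: increasing segment boundaries after time 0.
    hazards: one constant hazard per segment, so
             len(hazards) == len(change_points) + 1.
    """
    if len(hazards) != len(change_points) + 1:
        raise ValueError("need exactly one hazard per segment")
    cumulative_hazard = 0.0
    segment_start = 0.0
    # The last segment extends to infinity.
    boundaries = list(change_points) + [math.inf]
    for segment_end, hazard in zip(boundaries, hazards):
        if t <= segment_start:
            break
        # Time spent in this segment, capped at t.
        cumulative_hazard += hazard * (min(t, segment_end) - segment_start)
        segment_start = segment_end
    return math.exp(-cumulative_hazard)


# Illustrative values only: hazard 0.5 before time 1.0, then 0.1 afterwards.
s_at_3 = piecewise_exp_survival(3.0, [1.0], [0.5, 0.1])
```

With these illustrative values the accumulated hazard at time 3 is 0.5 * 1 + 0.1 * 2 = 0.7, so `s_at_3` equals exp(-0.7). Beyond the last change-point the hazard stays constant, which is the long-term extrapolation behaviour the abstract refers to.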
-
Utilizing Expert Opinion to inform Extrapolation of Survival Models
Authors:
Philip Cooney,
Arthur White
Abstract:
In decision modelling with time to event data, there are a variety of parametric models which could be used to extrapolate the survivor function. Each of these implies a different hazard function and in situations where there is moderate censoring, they can result in quite different extrapolations. Expert opinion on the long-term survival or other quantities could reduce model uncertainty. We present a general and easily implementable approach for including a variety of types of expert opinions. Expert opinion is incorporated by penalizing the likelihood function. Inference is performed in a Bayesian framework; however, this approach can also be implemented using frequentist methods. The issue of aggregating or pooling expert opinions is also considered and included in the analysis. We validate the method against a previously published approach and include a worked example of this method. This work highlights that expert opinions can be implemented in a straightforward manner using this approach; however, more work is required on the correct elicitation of these quantities.
Submitted 4 December, 2021;
originally announced December 2021.
-
Gaussian Processes to speed up MCMC with automatic exploratory-exploitation effect
Authors:
Alessio Benavoli,
Jason Wyse,
Arthur White
Abstract:
We present a two-stage Metropolis-Hastings algorithm for sampling probabilistic models whose log-likelihood is computationally expensive to evaluate, by using a surrogate Gaussian Process (GP) model. The key feature of the approach, and the difference w.r.t. previous works, is the ability to learn the target distribution from scratch (while sampling), and so without the need for pre-training the GP. This is fundamental for automatic inference in Probabilistic Programming Languages. In particular, we present an alternative first stage acceptance scheme by marginalising out the GP distributed function, which makes the acceptance ratio explicitly dependent on the variance of the GP. This approach is extended to the Metropolis-Adjusted Langevin algorithm (MALA).
Submitted 28 September, 2021;
originally announced September 2021.
-
City-wide modeling of Vehicle-to-Grid Economics to Understand Effects of Battery Performance
Authors:
Heta A. Gandhi,
Andrew D. White
Abstract:
Vehicle-to-grid (V2G) is a promising approach to solve the problem of grid-level intermittent supply and demand mismatch, caused by renewable energy resources, because it uses the existing resource of electric vehicle (EV) batteries as the energy storage medium. EV battery design together with an impetus on profitability for participating EV owners is pivotal for V2G success. To better understand what battery device parameters are most important for V2G adoption, we model the economics of the V2G process under realistic conditions. Most previous studies that perform V2G economic analysis assume ideal driving conditions, use linear battery degradation models, or only consider V2G for ancillary services. Our model accounts for realistic battery degradation, empirical charging efficiencies, randomness in commute behavior, and historic hourly electricity prices in six cities in the United States. We model user behavior with Bayesian optimization to provide a best-case scenario for V2G. Across all cities, we find that charging rate and efficiency are the most important factors that determine EV users' profits. Surprisingly, EV battery cost and thus degradation due to cycling has little effect. These findings should help focus research on figures of merit that better reflect real usage of batteries in a V2G economy.
Submitted 12 August, 2021;
originally announced August 2021.
-
Learning Expected Emphatic Traces for Deep RL
Authors:
Ray Jiang,
Shangtong Zhang,
Veronica Chelu,
Adam White,
Hado van Hasselt
Abstract:
Off-policy sampling and experience replay are key for improving sample efficiency and scaling model-free temporal difference learning methods. When combined with function approximation, such as neural networks, this combination is known as the deadly triad and is potentially unstable. Recently, it has been shown that stability and good performance at scale can be achieved by combining emphatic weightings and multi-step updates. This approach, however, is generally limited to sampling complete trajectories in order, to compute the required emphatic weighting. In this paper we investigate how to combine emphatic weightings with non-sequential, off-line data sampled from a replay buffer. We develop a multi-step emphatic weighting that can be combined with replay, and a time-reversed $n$-step TD learning algorithm to learn the required emphatic weighting. We show that these state weightings reduce variance compared with prior approaches, while providing convergence guarantees. We tested the approach at scale on Atari 2600 video games, and observed that the new X-ETD($n$) agent improved over baseline agents, highlighting both the scalability and broad applicability of our approach.
Submitted 12 July, 2021;
originally announced July 2021.
-
Emphatic Algorithms for Deep Reinforcement Learning
Authors:
Ray Jiang,
Tom Zahavy,
Zhongwen Xu,
Adam White,
Matteo Hessel,
Charles Blundell,
Hado van Hasselt
Abstract:
Off-policy learning allows us to learn about possible policies of behavior from experience generated by a different behavior policy. Temporal difference (TD) learning algorithms can become unstable when combined with function approximation and off-policy sampling - this is known as the "deadly triad". The emphatic temporal difference (ETD($λ$)) algorithm ensures convergence in the linear case by appropriately weighting the TD($λ$) updates. In this paper, we extend the use of emphatic methods to deep reinforcement learning agents. We show that naively adapting ETD($λ$) to popular deep reinforcement learning algorithms, which use forward view multi-step returns, results in poor performance. We then derive new emphatic algorithms for use in the context of such algorithms, and we demonstrate that they provide noticeable benefits in small problems designed to highlight the instability of TD methods. Finally, we observed improved performance when applying these algorithms at scale on classic Atari games from the Arcade Learning Environment.
Submitted 21 June, 2021;
originally announced June 2021.
-
Simulation-Based Inference with Approximately Correct Parameters via Maximum Entropy
Authors:
Rainier Barrett,
Mehrad Ansari,
Gourab Ghoshal,
Andrew D White
Abstract:
Inferring the input parameters of simulators from observations is a crucial challenge with applications from epidemiology to molecular dynamics. Here we show a simple approach in the regime of sparse data and approximately correct models, which is common when trying to use an existing model to infer latent variables with observed data. This approach is based on the principle of maximum entropy (MaxEnt) and provably makes the smallest change in the latent joint distribution to fit new data. This method requires no likelihood or model derivatives and its fit is insensitive to prior strength, removing the need to balance observed data fit with prior belief. The method requires the ansatz that data is fit in expectation, which is true in some settings and may be reasonable in all with few data points. The method is based on sample reweighting, so its asymptotic run time is independent of prior distribution dimension. We demonstrate this MaxEnt approach and compare with other likelihood-free inference methods across three systems: a point particle moving in a gravitational field, a compartmental model of epidemic spread and finally molecular dynamics simulation of a protein.
Submitted 23 August, 2021; v1 submitted 19 April, 2021;
originally announced April 2021.
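The sample-reweighting idea in the abstract above can be illustrated with a minimal sketch. It assumes a single scalar observable and one target expectation, and it is not the authors' implementation: the function name `maxent_weights`, the bisection bounds, and the synthetic "simulator outputs" are all hypothetical. Minimising the relative entropy to uniform weights under one expectation constraint gives weights of the form $w_i \propto \exp(\lambda g_i)$, and $\lambda$ is found here by bisection.

```python
import numpy as np

def maxent_weights(g, target, lo=-50.0, hi=50.0, iters=200):
    """Return weights w_i proportional to exp(lam * g_i) whose weighted mean of g hits target."""
    g = np.asarray(g, dtype=float)

    def tilt(lam):
        z = lam * g
        z -= z.max()                 # shift exponents for numerical stability
        w = np.exp(z)
        w /= w.sum()
        return w, float(w @ g)

    # The weighted mean is non-decreasing in lam (its derivative is a variance),
    # so a simple bisection on lam is enough for one constraint.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        _, m = tilt(mid)
        if m < target:
            lo = mid
        else:
            hi = mid
    w, _ = tilt(0.5 * (lo + hi))
    return w

rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=5000)  # hypothetical prior simulator outputs
weights = maxent_weights(samples, target=0.5)        # hypothetical observed average
reweighted_mean = float(weights @ samples)
```

Because only the weights change, the cost depends on the number of samples rather than on the dimension of the simulator's parameter space, which matches the run-time remark in the abstract.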
-
Spatial Functional Data Modeling of Plant Reflectances
Authors:
Philip A. White,
Henry Frye,
Michael F. Christensen,
Alan E. Gelfand,
John A. Silander Jr
Abstract:
Plant reflectance spectra - the profile of light reflected by leaves across different wavelengths - supply the spectral signature for a species at a spatial location to enable estimation of functional and taxonomic diversity for plants. We consider leaf spectra as "responses" to be explained spatially. These spectra/reflectances are functions over a wavelength band that respond to the environment.
Our motivating data are gathered for several families from the Cape Floristic Region (CFR) in South Africa and lead us to develop rich novel spatial models that can explain spectra for genera within families. Wavelength responses for an individual leaf are viewed as a function of wavelength, leading to functional data modeling. Local environmental features become covariates. We introduce wavelength-covariate interaction, since the response to environmental regressors may vary with wavelength, as may the variance. Formal spatial modeling enables prediction of reflectances for genera at unobserved locations with known environmental features. We incorporate spatial dependence, wavelength dependence, and space-wavelength interaction (in the spirit of space-time interaction). We implement out-of-sample validation to select a best model, discovering that the model features listed above are all informative for the functional data analysis. We then supply interpretation of the results under the selected model.
Submitted 25 March, 2021; v1 submitted 5 February, 2021;
originally announced February 2021.
-
Graph Neural Network Based Coarse-Grained Mapping Prediction
Authors:
Zhiheng Li,
Geemi P. Wellawatte,
Maghesree Chakraborty,
Heta A. Gandhi,
Chenliang Xu,
Andrew D. White
Abstract:
The selection of coarse-grained (CG) mapping operators is a critical step for CG molecular dynamics (MD) simulation. It is still an open question what is optimal for this choice, and there is a need for theory. The current state-of-the-art method is for experts to select mapping operators manually. In this work, we demonstrate an automated approach by viewing this problem as supervised learning where we seek to reproduce the mapping operators produced by experts. We present a graph neural network based CG mapping predictor called DEEP SUPERVISED GRAPH PARTITIONING MODEL (DSGPM) that treats mapping operators as a graph segmentation problem. DSGPM is trained on a novel dataset, Human-annotated Mappings (HAM), consisting of 1,206 molecules with expert-annotated mapping operators. HAM can be used to facilitate further research in this area. Our model uses a novel metric learning objective to produce high-quality atomic features that are used in spectral clustering. The results show that the DSGPM outperforms state-of-the-art methods in the field of graph segmentation. Finally, we find that predicted CG mapping operators indeed result in good CG MD models when used in simulation.
Submitted 19 August, 2021; v1 submitted 24 June, 2020;
originally announced July 2020.
-
Towards a practical measure of interference for reinforcement learning
Authors:
Vincent Liu,
Adam White,
Hengshuai Yao,
Martha White
Abstract:
Catastrophic interference is common in many network-based learning systems, and many proposals exist for mitigating it. But, before we overcome interference we must understand it better. In this work, we provide a definition of interference for control in reinforcement learning. We systematically evaluate our new measures, by assessing correlation with several measures of learning performance, including stability, sample efficiency, and online and offline control performance across a variety of learning architectures. Our new interference measure allows us to ask novel scientific questions about commonly used deep learning architectures. In particular we show that target network frequency is a dominating factor for interference, and that updates on the last layer result in significantly higher interference than updates internal to the network. This new measure can be expensive to compute; we conclude with motivation for an efficient proxy measure and empirically demonstrate it is correlated with our definition of interference.
Submitted 7 July, 2020;
originally announced July 2020.
-
Gradient Temporal-Difference Learning with Regularized Corrections
Authors:
Sina Ghiassian,
Andrew Patterson,
Shivam Garg,
Dhawal Gupta,
Adam White,
Martha White
Abstract:
It is still common to use Q-learning and temporal difference (TD) learning, even though they have divergence issues and sound Gradient TD alternatives exist, because divergence seems rare and they typically perform well. However, recent work with large neural network learning systems reveals that instability is more common than previously thought. Practitioners face a difficult dilemma: choose an easy to use and performant TD method, or a more complex algorithm that is more sound but harder to tune and all but unexplored with non-linear function approximation or control. In this paper, we introduce a new method called TD with Regularized Corrections (TDRC) that attempts to balance ease of use, soundness, and performance. It behaves as well as TD when TD performs well, but is sound in cases where TD diverges. We empirically investigate TDRC across a range of problems, for both prediction and control, and for both linear and non-linear function approximation, and show, potentially for the first time, that gradient TD methods could be a better alternative to TD and Q-learning.
Submitted 17 September, 2020; v1 submitted 1 July, 2020;
originally announced July 2020.
-
Hierarchical Integrated Spatial Process Modeling of Monotone West Antarctic Snow Density Curves
Authors:
Philip A. White,
Durban G. Keeler,
Summer Rupper
Abstract:
Snow density estimates below the surface, used with airplane-acquired ice-penetrating radar measurements, give a site-specific history of snow water accumulation. Because it is infeasible to drill snow cores across all of Antarctica to measure snow density and because it is critical to understand how climatic changes are affecting the world's largest freshwater reservoir, we develop methods that enable snow density estimation with uncertainty in regions where snow cores have not been drilled.
In inland West Antarctica, snow density increases monotonically as a function of depth, except for possible micro-scale variability or measurement error, and it cannot exceed the density of ice. We present a novel class of integrated spatial process models that allow interpolation of monotone snow density curves. For computational feasibility, we construct the space-depth process through kernel convolutions of log-Gaussian spatial processes. We discuss model comparison, model fitting, and prediction. Using this model, we extend estimates of snow density beyond the depth of the original core and estimate snow density curves where snow cores have not been drilled. Along flight lines with ice-penetrating radar, we use interpolated snow density curves to estimate recent water accumulation and find predominantly decreasing water accumulation over recent decades.
Submitted 19 July, 2021; v1 submitted 15 January, 2020;
originally announced January 2020.
-
Investigating Active Learning and Meta-Learning for Iterative Peptide Design
Authors:
Rainier Barrett,
Andrew D. White
Abstract:
Often the development of novel functional peptides is not amenable to high throughput or purely computational screening methods. Peptides must be synthesized one at a time in a process that does not generate large amounts of data. One way this method can be improved is by ensuring that each experiment provides the best improvement in both peptide properties and predictive modeling accuracy. Here, we study the effectiveness of active learning, optimizing experiment order, and meta-learning, transferring knowledge between contexts, to reduce the number of experiments necessary to build a predictive model. We present a multi-task benchmark database of peptides designed to advance these methods for experimental design. Each task is binary classification of peptides represented as a sequence string. We find neither active learning method tested to be better than random choice. The meta-learning method Reptile was found to improve average accuracy across datasets. Combining meta-learning with active learning offers inconsistent benefits.
Submitted 10 December, 2020; v1 submitted 20 November, 2019;
originally announced November 2019.
-
Generalized Evolutionary Point Processes: Model Specifications and Model Comparison
Authors:
Philip A. White,
Alan E. Gelfand
Abstract:
Generalized evolutionary point processes offer a class of point process models that allows for either excitation or inhibition based upon the history of the process. In this regard, we propose modeling which comprises generalization of the nonlinear Hawkes process. Working within a Bayesian framework, model fitting is implemented through Markov chain Monte Carlo. This entails discussion of computation of the likelihood for such point patterns. Furthermore, for this class of models, we discuss strategies for model comparison. Using simulation, we illustrate how well we can distinguish these models from point pattern specifications with conditionally independent event times, e.g., Poisson processes. Specifically, we demonstrate that these models can correctly identify true relationships (i.e., excitation or inhibition/control). Then, we consider a novel extension of the log Gaussian Cox process that incorporates evolutionary behavior and illustrate that our model comparison approach prefers the evolutionary log Gaussian Cox process compared to simpler models. We also examine a real dataset consisting of violent crime events from the 11th police district in Chicago from the year 2018. This data exhibits strong daily seasonality and changes across the year. After we account for these data attributes, we find significant but mild self-excitation, implying that event occurrence increases the intensity of future events.
Submitted 15 October, 2019;
originally announced October 2019.
-
Meta-descent for Online, Continual Prediction
Authors:
Andrew Jacobsen,
Matthew Schlegel,
Cameron Linke,
Thomas Degris,
Adam White,
Martha White
Abstract:
This paper investigates different vector step-size adaptation approaches for non-stationary online, continual prediction problems. Vanilla stochastic gradient descent can be considerably improved by scaling the update with a vector of appropriately chosen step-sizes. Many methods, including AdaGrad, RMSProp, and AMSGrad, keep statistics about the learning process to approximate a second order update: a vector approximation of the inverse Hessian. Another family of approaches uses meta-gradient descent to adapt the step-size parameters to minimize prediction error. These meta-descent strategies are promising for non-stationary problems, but have not been as extensively explored as quasi-second order methods. We first derive a general, incremental meta-descent algorithm, called AdaGain, designed to be applicable to a much broader range of algorithms, including those with semi-gradient updates or even those with accelerations, such as RMSProp. We provide an empirical comparison of methods from both families. We conclude that methods from both families can perform well, but in non-stationary prediction problems the meta-descent methods exhibit advantages. Our method is particularly robust across several prediction problems, and is competitive with the state-of-the-art method on a large-scale, time-series prediction problem on real data from a mobile robot.
Submitted 13 December, 2019; v1 submitted 17 July, 2019;
originally announced July 2019.
-
Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study
Authors:
Cam Linke,
Nadia M. Ady,
Martha White,
Thomas Degris,
Adam White
Abstract:
Learning about many things can provide numerous benefits to a reinforcement learning system. For example, learning many auxiliary value functions, in addition to optimizing the environmental reward, appears to improve both exploration and representation learning. The question we tackle in this paper is how to sculpt the stream of experience---how to adapt the learning system's behavior---to optimize the learning of a collection of value functions. A simple answer is to compute an intrinsic reward based on the statistics of each auxiliary learner, and use reinforcement learning to maximize that intrinsic reward. Unfortunately, implementing this simple idea has proven difficult, and thus has been the focus of decades of study. It remains unclear which of the many possible measures of learning would work well in a parallel learning setting where environmental reward is extremely sparse or absent. In this paper, we investigate and compare different intrinsic reward mechanisms in a new bandit-like parallel-learning testbed. We discuss the interaction between reward and prediction learners and highlight the importance of introspective prediction learners: those that increase their rate of learning when progress is possible, and decrease when it is not. We provide a comprehensive empirical comparison of 14 different rewards, including well-known ideas from reinforcement learning and active learning. Our results highlight a simple but seemingly powerful principle: intrinsic rewards based on the amount of learning can generate useful behavior, if each individual learner is introspective.
Submitted 21 August, 2020; v1 submitted 18 June, 2019;
originally announced June 2019.
-
Multivariate Functional Data Modeling with Time-varying Clustering
Authors:
Philip A. White,
Alan E. Gelfand
Abstract:
We consider the situation where multivariate functional data has been collected over time at each of a set of sites. Our illustrative setting is bivariate, monitoring ozone and PM$_{10}$ levels as a function of time over the course of a year at a set of monitoring sites. The data we work with is from 24 monitoring sites in Mexico City which record hourly ozone and PM$_{10}$ levels. We use the data for the year 2017. Hence, we have 48 functions to work with. Our objective is to implement model-based clustering of the functions across the sites. Using our example, such clustering can be considered for ozone and PM$_{10}$ individually or jointly. It may occur differentially for the two pollutants. More importantly for us, we allow that such clustering can vary with time.
We model the multivariate functions across sites using a multivariate Gaussian process. With many sites and several functions at each site, we use dimension reduction to provide a stochastic process specification for the distribution of the collection of multivariate functions over the, say, $n$ sites. Furthermore, to cluster the functions, either individually by component or jointly with all components, we use the Dirichlet process, which enables shared labeling of the functions across the sites. Specifically, we cluster functions based on their response to exogenous variables. Though the functions arise in continuous time, clustering in continuous time is extremely computationally demanding and not of practical interest. Therefore, we employ a partitioning of the time scale to capture time-varying clustering.
Submitted 1 May, 2019; v1 submitted 25 April, 2019;
originally announced April 2019.
-
Planning with Expectation Models
Authors:
Yi Wan,
Zaheer Abbas,
Adam White,
Martha White,
Richard S. Sutton
Abstract:
Distribution and sample models are two popular model choices in model-based reinforcement learning (MBRL). However, learning these models can be intractable, particularly when the state and action spaces are large. Expectation models, on the other hand, are relatively easier to learn due to their compactness and have also been widely used for deterministic environments. For stochastic environments, it is not obvious how expectation models can be used for planning as they only partially characterize a distribution. In this paper, we propose a sound way of using approximate expectation models for MBRL. In particular, we 1) show that planning with an expectation model is equivalent to planning with a distribution model if the state value function is linear in state features, 2) analyze two common parametrization choices for approximating the expectation: linear and non-linear expectation models, 3) propose a sound model-based policy evaluation algorithm and present its convergence results, and 4) empirically demonstrate the effectiveness of the proposed planning algorithm.
Submitted 29 July, 2020; v1 submitted 1 April, 2019;
originally announced April 2019.
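The first claim in the abstract above, that an expectation model suffices when the value function is linear in the state features, rests on linearity of expectation: $\mathbb{E}[w^\top x'] = w^\top \mathbb{E}[x']$. The following is only a numerical illustration of that identity, not the paper's algorithm; the weight vector, the sampled next-state features, and the quadratic counterexample are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=4)                        # hypothetical linear value weights
next_features = rng.normal(size=(1000, 4))    # hypothetical samples of next-state features

# Linear value function: averaging values equals valuing the average feature vector.
expected_value = float(np.mean(next_features @ w))             # E[v(x')]
value_of_expectation = float(next_features.mean(axis=0) @ w)   # v(E[x'])

# Non-linear value function v(x) = (w . x)^2: the two quantities differ by the
# sample variance of w . x', so the expected features alone lose information.
nonlinear_expected = float(np.mean((next_features @ w) ** 2))
nonlinear_of_expectation = float((next_features.mean(axis=0) @ w) ** 2)
gap = nonlinear_expected - nonlinear_of_expectation
```

In the linear case the two numbers agree up to floating-point error, while in the quadratic case `gap` is strictly positive, which is why the abstract treats non-linear value functions separately.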
-
A quantile-based g-computation approach to addressing the effects of exposure mixtures
Authors:
Alexander P. Keil,
Jessie P. Buckley,
Katie M. OBrien,
Kelly K. Ferguson,
Shanshan Zhao,
Alexandra J. White
Abstract:
Exposure mixtures frequently occur in data across many domains, particularly in the fields of environmental and nutritional epidemiology. Various strategies have arisen to answer questions about mixtures, including methods such as weighted quantile sum (WQS) regression that estimate a joint effect of the mixture components. We demonstrate a new approach to estimating the joint effects of a mixture: quantile g-computation. This approach combines the inferential simplicity of WQS regression with the flexibility of g-computation, a method of causal effect estimation. We use simulations to examine whether quantile g-computation and WQS regression can accurately and precisely estimate effects of mixtures in common scenarios. We examine the bias, confidence interval coverage, and bias-variance tradeoff of quantile g-computation and WQS regression, and how these quantities are impacted by the presence of non-causal exposures, exposure correlation, unmeasured confounding, and non-linear effects. Quantile g-computation, unlike WQS regression, allows inference on mixture effects that is unbiased with appropriate confidence interval coverage at sample sizes typically encountered in epidemiologic studies and when the assumptions of WQS regression are not met. Further, WQS regression can magnify bias from unmeasured confounding that might occur if important components of the mixture are omitted. Unlike inferential approaches that examine effects of individual exposures, methods like quantile g-computation that can estimate the effect of a mixture are essential for understanding effects of potential public health actions that act on exposure sources. Our approach may serve to help bridge gaps between epidemiologic analysis and interventions such as regulations on industrial emissions or mining processes, dietary changes, or consumer behavioral changes that act on multiple exposures simultaneously.
Submitted 11 March, 2020; v1 submitted 11 February, 2019;
originally announced February 2019.
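A rough sketch of the idea behind quantile g-computation, under strong simplifying assumptions (a linear, additive outcome model with no confounders and simulated data), is given below. It is not the authors' software: the helper `quantize`, the simulated exposures, and the coefficient values are hypothetical. Each exposure is replaced by an integer quantile score, an ordinary least-squares model is fitted, and the joint effect of raising every exposure by one quantile is taken as the sum of the fitted exposure coefficients.

```python
import numpy as np

def quantize(x, q=4):
    """Map a 1-D exposure to integer quantile scores 0..q-1."""
    cuts = np.quantile(x, np.linspace(0, 1, q + 1)[1:-1])   # interior cut points
    return np.searchsorted(cuts, x, side="right")

rng = np.random.default_rng(2)
n = 2000
exposures = rng.normal(size=(n, 3))                  # hypothetical raw exposures
scores = np.column_stack([quantize(exposures[:, j]) for j in range(3)])

true_beta = np.array([0.5, 0.3, 0.0])                # hypothetical per-quantile effects
y = scores @ true_beta + rng.normal(scale=0.1, size=n)

# Ordinary least squares with an intercept on the quantile scores.
design = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

# Joint effect of a one-quantile increase in all exposures at once.
psi = float(coef[1:].sum())
```

In this simulation the true joint effect is 0.5 + 0.3 + 0.0 = 0.8, so `psi` should land close to that value; the third exposure plays the role of a non-causal mixture component.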
-
Online Off-policy Prediction
Authors:
Sina Ghiassian,
Andrew Patterson,
Martha White,
Richard S. Sutton,
Adam White
Abstract:
This paper investigates the problem of online prediction learning, where learning proceeds continuously as the agent interacts with an environment. The predictions made by the agent are contingent on a particular way of behaving, represented as a value function. However, the behavior used to select actions and generate the behavior data might be different from the one used to define the predictions, and thus the samples are generated off-policy. The ability to learn behavior-contingent predictions online and off-policy has long been advocated as a key capability of predictive-knowledge learning systems but remained an open algorithmic challenge for decades. The issue lies with the temporal difference (TD) learning update at the heart of most prediction algorithms: combining bootstrapping, off-policy sampling and function approximation may cause the value estimate to diverge. A breakthrough came with the development of a new objective function that admitted stochastic gradient descent variants of TD. Since then, many sound online off-policy prediction algorithms have been developed, but there has been limited empirical work investigating the relative merits of all the variants. This paper aims to fill these empirical gaps and provide clarity on the key ideas behind each method. We summarize the large body of literature on off-policy learning, focusing on (1) methods that use computation linear in the number of features and are convergent under off-policy sampling, and (2) other methods which have proven useful with non-fixed, nonlinear function approximation. We provide an empirical study of off-policy prediction methods in two challenging microworlds. We report each method's parameter sensitivity, empirical convergence rate, and final performance, providing new insights that should enable practitioners to successfully extend these new methods to large-scale applications. [Abridged abstract]
Submitted 6 November, 2018;
originally announced November 2018.
-
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
Authors:
Samuel Neumann,
Sungsu Lim,
Ajin Joseph,
Yangchen Pan,
Adam White,
Martha White
Abstract:
Many policy gradient methods are variants of Actor-Critic (AC), where a value function (critic) is learned to facilitate updating the parameterized policy (actor). The update to the actor involves a log-likelihood update weighted by the action-values, with the addition of entropy regularization for soft variants. In this work, we explore an alternative update for the actor, based on an extension of the cross-entropy method (CEM) to condition on inputs (states). The idea is to start with a broader policy and slowly concentrate around maximal actions, using a maximum likelihood update towards actions in the top percentile per state. The speed of this concentration is controlled by a proposal policy, which concentrates at a slower rate than the actor. We first provide a policy improvement result in an idealized setting, and then prove that our conditional CEM (CCEM) strategy tracks a CEM update per state, even with changing action-values. We empirically show that our Greedy AC algorithm, which uses CCEM for the actor update, performs better than Soft Actor-Critic and is much less sensitive to entropy regularization.
Submitted 28 February, 2023; v1 submitted 22 October, 2018;
originally announced October 2018.
-
General Value Function Networks
Authors:
Matthew Schlegel,
Andrew Jacobsen,
Zaheer Abbas,
Andrew Patterson,
Adam White,
Martha White
Abstract:
State construction is important for learning in partially observable environments. A general purpose strategy for state construction is to learn the state update using a Recurrent Neural Network (RNN), which updates the internal state using the current internal state and the most recent observation. This internal state provides a summary of the observed sequence, to facilitate accurate predictions and decision-making. At the same time, specifying and training RNNs is notoriously tricky, particularly as the common strategy to approximate gradients back in time, called truncated Back-prop Through Time (BPTT), can be sensitive to the truncation window. Further, domain-expertise--which can usually help constrain the function class and so improve trainability--can be difficult to incorporate into complex recurrent units used within RNNs. In this work, we explore how to use multi-step predictions to constrain the RNN and incorporate prior knowledge. In particular, we revisit the idea of using predictions to construct state and ask: does constraining (parts of) the state to consist of predictions about the future improve RNN trainability? We formulate a novel RNN architecture, called a General Value Function Network (GVFN), where each internal state component corresponds to a prediction about the future represented as a value function. We first provide an objective for optimizing GVFNs, and derive several algorithms to optimize this objective. We then show that GVFNs are more robust to the truncation level, in many cases only requiring one-step gradient updates.
Submitted 2 February, 2021; v1 submitted 17 July, 2018;
originally announced July 2018.
-
Modeling Daily Seasonality of Mexico City Ozone using Nonseparable Covariance Models on Circles Cross Time
Authors:
Philip A. White,
Emilio Porcu
Abstract:
Mexico City tracks ground-level ozone levels to assess compliance with national ambient air quality standards and to prevent environmental health emergencies. Ozone levels show distinct daily patterns, within the city, and over the course of the year. To model these data, we use covariance models over space, circular time, and linear time. We review existing models and develop new classes of nonseparable covariance models of this type, models appropriate for quasi-periodic data collected at many locations. With these covariance models, we use nearest-neighbor Gaussian processes to predict hourly ozone levels at unobserved locations in April and May, the peak ozone season, to infer compliance to Mexican air quality standards and to estimate respiratory health risk associated with ozone. Predicted compliance with air quality standards and estimated respiratory health risk vary greatly over space and time. In some regions, we predict exceedance of national standards for more than a third of the hours in April and May. On many days, we predict that nearly all of Mexico City exceeds nationally legislated ozone thresholds at least once. In peak regions, we estimate respiratory risk for ozone to be 55% higher on average than the annual average risk and as much as 170% higher on some days.
Submitted 15 July, 2018;
originally announced July 2018.
-
Non-separable Nearest-Neighbor Gaussian Process Model for Antarctic Surface Mass Balance and Ice Core Site Selection
Authors:
Philip A. White,
C. Shane Reese,
William F. Christensen,
Summer Rupper
Abstract:
Surface mass balance (SMB) is an important factor in the estimation of sea level change, and data are collected to estimate models for prediction of SMB over the Antarctic ice sheets. Using a quality-controlled aggregate dataset of SMB field measurements with significantly more observations than previous analyses, a fully Bayesian nearest-neighbor Gaussian process model is posed to estimate Antarctic SMB and propose new field measurement locations. A corresponding Antarctic SMB map is rendered using this model and is compared with previous estimates. A prediction uncertainty map is created to identify regions of high SMB uncertainty. The model estimates net SMB to be 2345 Gton $\text{yr}^{-1}$, with 95% credible interval (2273,2413) Gton $\text{yr}^{-1}$. Overall, these results suggest lower Antarctic SMB than previously reported. Using the model's uncertainty quantification, we propose 25 new measurement sites for field study utilizing a design to minimize integrated mean squared error.
Submitted 14 July, 2018;
originally announced July 2018.
-
Pollution State Modeling for Mexico City
Authors:
Philip A. White,
Alan E. Gelfand,
Eliane R. Rodrigues,
Guadalupe Tzintzun
Abstract:
Ground-level ozone and particulate matter pollutants are associated with a variety of health issues and increased mortality. For this reason, Mexican environmental agencies regulate pollutant levels. In addition, Mexico City defines pollution emergencies using thresholds that rely on regional maxima for ozone and particulate matter with diameter less than 10 micrometers ($\text{PM}_{10}$). To predict local pollution emergencies and to assess compliance to Mexican ambient air quality standards, we analyze hourly ozone and $\text{PM}_{10}$ measurements from 24 stations across Mexico City from 2017 using a bivariate spatiotemporal model. Using this model, we predict future pollutant levels using current weather conditions and recent pollutant concentrations. Using hourly pollutant projections, we predict regional maxima needed to estimate the probability of future pollution emergencies. We discuss how predicted compliance to legislated pollution limits varies across regions within Mexico City in 2017. We find that predicted probability of pollution emergencies is limited to a few time periods. In contrast, we show that predicted exceedance of Mexican ambient air quality standards is a common, nearly daily occurrence.
Submitted 10 July, 2018;
originally announced July 2018.
-
Causal Queries from Observational Data in Biological Systems via Bayesian Networks: An Empirical Study in Small Networks
Authors:
Alex White,
Matthieu Vignes
Abstract:
Biological networks are a very convenient modelling and visualisation tool to discover knowledge from modern high-throughput genomics and postgenomics data sets. Indeed, biological entities are not isolated, but are components of complex multi-level systems. We go one step further and advocate for the consideration of causal representations of the interactions in living systems. We present the causal formalism and bring it out in the context of biological networks, when the data is observational. We also discuss its ability to decipher the causal information flow as observed in gene expression. We also illustrate our exploration by experiments on small simulated networks as well as on a real biological data set.
Submitted 4 May, 2018;
originally announced May 2018.
-
Classifying Antimicrobial and Multifunctional Peptides with Bayesian Network Models
Authors:
Rainier Barrett,
Shaoyi Jiang,
Andrew D White
Abstract:
Bayesian network models are finding success in characterizing enzyme-catalyzed reactions, slow conformational changes, predicting enzyme inhibition, and genomics. In this work, we apply them to statistical modeling of peptides by simultaneously identifying amino acid sequence motifs and using a motif-based model to clarify the role motifs may play in antimicrobial activity. We construct models of increasing sophistication, demonstrating how chemical knowledge of a peptide system may be embedded without requiring new derivation of model fitting equations after changing model structure. These models are used to construct classifiers with good performance (94% accuracy, Matthews correlation coefficient of 0.87) at predicting antimicrobial activity in peptides, while at the same time being built of interpretable parameters. We demonstrate use of these models to identify peptides that are potentially both antimicrobial and antifouling, and show that the background distribution of amino acids could play a greater role in activity than sequence motifs do. This provides an advancement in the type of peptide activity modeling that can be done and the ease with which models can be constructed.
Submitted 17 April, 2018;
originally announced April 2018.
-
High-dimensional Stochastic Inversion via Adjoint Models and Machine Learning
Authors:
Charanraj A. Thimmisetty,
Wenju Zhao,
Xiao Chen,
Charles H. Tong,
Joshua A. White
Abstract:
Performing stochastic inversion on a computationally expensive forward simulation model with a high-dimensional uncertain parameter space (e.g. a spatial random field) is computationally prohibitive even with gradient information provided. Moreover, the `nonlinear' mapping from parameters to observables generally gives rise to non-Gaussian posteriors even with Gaussian priors, thus hampering the use of efficient inversion algorithms designed for models with Gaussian assumptions. In this paper, we propose a novel Bayesian stochastic inversion methodology, characterized by a tight coupling between a gradient-based Langevin Markov Chain Monte Carlo (LMCMC) method and a kernel principal component analysis (KPCA). This approach addresses the `curse-of-dimensionality' via KPCA to identify a low-dimensional feature space within the high-dimensional and nonlinearly correlated spatial random field. Moreover, non-Gaussian full posterior probability distribution functions are estimated via an efficient LMCMC method on both the projected low-dimensional feature space and the recovered high-dimensional parameter space. We demonstrate this computational framework by integrating and adapting recent developments such as data-driven statistics-on-manifolds constructions and reduction-through-projection techniques to solve inverse problems in linear elasticity.
Submitted 16 March, 2018;
originally announced March 2018.
-
A New Family of Near-metrics for Universal Similarity
Authors:
Chu Wang,
Iraj Saniee,
William S. Kennedy,
Chris A. White
Abstract:
We propose a family of near-metrics based on local graph diffusion to capture similarity for a wide class of data sets. These quasi-metametrics, as their names suggest, dispense with one or two standard axioms of metric spaces, specifically distinguishability and symmetry, so that similarity between data points of arbitrary type and form could be measured broadly and effectively. The proposed near-metric family includes the forward k-step diffusion and its reverse, typically on the graph consisting of data objects and their features. By construction, this family of near-metrics is particularly appropriate for categorical data, continuous data, and vector representations of images and text extracted via deep learning approaches. We conduct extensive experiments to evaluate the performance of this family of similarity measures and compare and contrast with traditional measures of similarity used for each specific application and with the ground truth when available. We show that for structured data including categorical and continuous data, the near-metrics corresponding to normalized forward k-step diffusion (k small) work as one of the best performing similarity measures; for vector representations of text and images including those extracted from deep learning, the near-metrics derived from normalized and reverse k-step graph diffusion (k very small) exhibit outstanding ability to distinguish data points from different classes.
Submitted 17 October, 2017; v1 submitted 21 July, 2017;
originally announced July 2017.
-
Accelerated Gradient Temporal Difference Learning
Authors:
Yangchen Pan,
Adam White,
Martha White
Abstract:
The family of temporal difference (TD) methods spans a spectrum from computationally frugal linear methods like TD(λ) to data efficient least squares methods. Least squares methods make the best use of available data, directly computing the TD solution, and thus do not require tuning a typically highly sensitive learning rate parameter, but require quadratic computation and storage. Recent algorithmic developments have yielded several sub-quadratic methods that use an approximation to the least squares TD solution, but incur bias. In this paper, we propose a new family of accelerated gradient TD (ATD) methods that (1) provide similar data efficiency benefits to least-squares methods, at a fraction of the computation and storage, (2) significantly reduce parameter sensitivity compared to linear TD methods, and (3) are asymptotically unbiased. We illustrate these claims with a proof of convergence in expectation and experiments on several benchmark domains and a large-scale industrial energy allocation domain.
Submitted 9 March, 2017; v1 submitted 28 November, 2016;
originally announced November 2016.
-
Exponential Family Mixed Membership Models for Soft Clustering of Multivariate Data
Authors:
Arthur White,
Thomas Brendan Murphy
Abstract:
For several years, model-based clustering methods have successfully tackled many of the challenges presented by data-analysts. However, as the scope of data analysis has evolved, some problems may be beyond the standard mixture model framework. One such problem is when observations in a dataset come from overlapping clusters, whereby different clusters will possess similar parameters for multiple variables. In this setting, mixed membership models, a soft clustering approach whereby observations are not restricted to single cluster membership, have proved to be an effective tool. In this paper, a method for fitting mixed membership models to data generated by a member of an exponential family is outlined. The method is applied to count data obtained from an ultra running competition, and compared with a standard mixture model approach.
Submitted 10 August, 2016;
originally announced August 2016.
-
Modeling Efficiency of Foreign Aid Allocation in Malawi
Authors:
Philip A. White,
Candace Berrett,
E. Shannon Neeley-Tass,
Michael G. Findley
Abstract:
The Open Aid Malawi initiative has collected an unprecedented database that identifies as much location-specific information as possible for each of over 2500 individual foreign aid donations to Malawi since 2003. Ensuring efficient use and distribution of that aid is important to donors and to Malawi citizens. However, because of individual donor goals and difficulty in tracking donor coordination, determining presence or absence of efficient aid allocation is difficult. We compare several Bayesian spatial generalized linear mixed models to relate aid allocation to various economic indicators within seven donation sectors. We find that the spatial gamma regression model best predicts current aid allocation. Using this model, first we use inferences on coefficients to examine whether or not there is evidence of efficient aid allocation within each sector. Second, we use this model to determine a more efficient aid allocation scenario and compare this scenario to the current allocation to provide insight for future aid donations.
Submitted 1 November, 2017; v1 submitted 9 August, 2016;
originally announced August 2016.
-
A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning
Authors:
Martha White,
Adam White
Abstract:
One of the main obstacles to broad application of reinforcement learning methods is the parameter sensitivity of our core learning algorithms. In many large-scale applications, online computation and function approximation represent key strategies in scaling up reinforcement learning algorithms. In this setting, we have effective and reasonably well understood algorithms for adapting the learning-rate parameter, online during learning. Such meta-learning approaches can improve robustness of learning and enable specialization to the current task, improving learning speed. For temporal-difference learning algorithms which we study here, there is yet another parameter, $λ$, that similarly impacts learning speed and stability in practice. Unfortunately, unlike the learning-rate parameter, $λ$ parametrizes the objective function that temporal-difference methods optimize. Different choices of $λ$ produce different fixed-point solutions, and thus adapting $λ$ online and characterizing the optimization is substantially more complex than adapting the learning-rate parameter. There is no meta-learning method for $λ$ that can achieve (1) incremental updating, (2) compatibility with function approximation, and (3) stability of learning under both on- and off-policy sampling. In this paper we contribute a novel objective function for optimizing $λ$ as a function of state rather than time. We derive a new incremental, linear complexity $λ$-adaptation algorithm that does not require offline batch updating or access to a model of the world, and present a suite of experiments illustrating the practicality of our new algorithm in three different settings. Taken together, our contributions represent a concrete step towards black-box application of temporal-difference learning methods in real world problems.
Submitted 24 October, 2016; v1 submitted 1 July, 2016;
originally announced July 2016.
-
Investigating practical linear temporal difference learning
Authors:
Adam White,
Martha White
Abstract:
Off-policy reinforcement learning has many applications including: learning from demonstration, learning multiple goal-seeking policies in parallel, and representing predictive knowledge. Recently there has been a proliferation of new policy-evaluation algorithms that fill a longstanding algorithmic void in reinforcement learning: combining robustness to off-policy sampling, function approximation, linear complexity, and temporal difference (TD) updates. This paper contains two main contributions. First, we derive two new hybrid TD policy-evaluation algorithms, which fill a gap in this collection of algorithms. Second, we perform an empirical comparison to elicit which of these new linear TD methods should be preferred in different situations, and make concrete suggestions about practical use.
Submitted 30 March, 2016; v1 submitted 28 February, 2016;
originally announced February 2016.
-
Improved model-based clustering performance using Bayesian initialization averaging
Authors:
Adrian O'Hagan,
Arthur White
Abstract:
The Expectation-Maximization (EM) algorithm is a commonly used method for finding the maximum likelihood estimates of the parameters in a mixture model via coordinate ascent. A serious pitfall with the algorithm is that in the case of multimodal likelihood functions, it can get trapped at a local maximum. This problem often occurs when sub-optimal starting values are used to initialize the algorithm. Bayesian initialization averaging (BIA) is proposed as an ensemble method to generate high quality starting values for the EM algorithm. Competing sets of trial starting values are combined as a weighted average, which is then used as the starting position for a full EM run. The method can also be extended to variational Bayes (VB) methods, a class of algorithm similar to EM that is based on an approximation of the model posterior. The BIA method is demonstrated on real continuous, categorical and network data sets, and the convergent log-likelihoods and associated clustering solutions presented. These compare favorably with the output produced using competing initialization methods such as random starts, hierarchical clustering and deterministic annealing, with the highest available maximum likelihood estimates obtained in a higher percentage of cases, at reasonable computational cost. The implications of the different clustering solutions obtained by local maxima are also discussed.
Submitted 30 August, 2018; v1 submitted 26 April, 2015;
originally announced April 2015.
-
An Easy to Use Repository for Comparing and Improving Machine Learning Algorithm Usage
Authors:
Michael R. Smith,
Andrew White,
Christophe Giraud-Carrier,
Tony Martinez
Abstract:
The results from most machine learning experiments are used for a specific purpose and then discarded. This results in a significant loss of information and requires rerunning experiments to compare learning algorithms. This also requires implementation of another algorithm for comparison, which may not always be correctly implemented. By storing the results from previous experiments, machine learning algorithms can be compared easily and the knowledge gained from them can be used to improve their performance. The purpose of this work is to provide easy access to previous experimental results for learning and comparison. These stored results are comprehensive -- storing the prediction for each test instance as well as the learning algorithm, hyperparameters, and training set that were used. Previous results are particularly important for meta-learning, which, in a broad sense, is the process of learning from previous machine learning results such that the learning process is improved. While other experiment databases do exist, one of our focuses is on easy access to the data. We provide meta-learning data sets that are ready to be downloaded for meta-learning experiments. In addition, queries to the underlying database can be made if specific information is desired. We also differ from previous experiment databases in that our database is designed at the instance level, where an instance is an example in a data set. We store the predictions of a learning algorithm trained on a specific training set for each instance in the test set. Data set level information can then be obtained by aggregating the results from the instances. The instance level information can be used for many tasks such as determining the diversity of a classifier or algorithmically determining the optimal subset of training instances for a learning algorithm.
Submitted 5 June, 2014; v1 submitted 28 May, 2014;
originally announced May 2014.
-
Mixed-Membership of Experts Stochastic Blockmodel
Authors:
Arthur White,
Thomas Brendan Murphy
Abstract:
Social network analysis is the study of how links between a set of actors are formed. Typically, it is believed that links are formed in a structured manner, which may be due to, for example, political or material incentives, and which often may not be directly observable. The stochastic blockmodel represents this structure using latent groups which exhibit different connective properties, so that conditional on the group membership of two actors, the probability of a link being formed between them is represented by a connectivity matrix. The mixed membership stochastic blockmodel (MMSBM) extends this model to allow actors to belong to different groups, depending on the interaction in question, providing further flexibility.
Attribute information can also play an important role in explaining network formation. Network models which do not explicitly incorporate covariate information require the analyst to compare fitted network models to additional attributes in a post-hoc manner. We introduce the mixed membership of experts stochastic blockmodel, an extension to the MMSBM which incorporates covariate actor information into the existing model. The method is illustrated with application to the Lazega Lawyers dataset. Model and variable selection methods are also discussed.
Submitted 1 April, 2014;
originally announced April 2014.