Showing 1–50 of 52 results for author: White, A

  1. arXiv:2505.11668

    stat.ME stat.CO

    Model-Based Clustering with Sequential Outlier Identification using the Distribution of Mahalanobis Distances

    Authors: Ultán P. Doherty, Paul D. McNicholas, Arthur White

    Abstract: The presence of outliers can prevent clustering algorithms from accurately determining an appropriate group structure within a data set. We present outlierMBC, a model-based approach for sequentially removing outliers and clustering the remaining observations. Our method identifies outliers one at a time while fitting a multivariate Gaussian mixture model to data. Since it can be difficult to clas…

    Submitted 16 May, 2025; originally announced May 2025.

    Comments: 25 pages, 5 figures

    MSC Class: 62H30 ACM Class: G.3

  2. arXiv:2505.05767

    stat.ME q-bio.QM

    Model-based calibration of gear-specific fish abundance survey data as a change-of-support problem

    Authors: Grace S. Chiu, Anton H. Westveld, Mark A. Albins, Kevin M. Boswell, John M. Hoenig, Sean P. Powers, S. Lynne Stokes, Allison L. White

    Abstract: In a continental-scale fish abundance study, a major challenge in deriving an absolute abundance estimate lies in the fact that regional surveys deploy different gear types, each with its unique field of view, producing gear-specific relative abundance data. Thus, data from regional surveys in the study must be converted from the gear-specific relative scale to an absolute scale before being combi…

    Submitted 9 May, 2025; originally announced May 2025.

  3. arXiv:2410.23667

    cs.LG physics.comp-ph stat.ML

    Projected Neural Differential Equations for Learning Constrained Dynamics

    Authors: Alistair White, Anna Büttner, Maximilian Gelbrecht, Valentin Duruisseaux, Niki Kilbertus, Frank Hellmann, Niklas Boers

    Abstract: Neural differential equations offer a powerful approach for learning dynamics from data. However, they do not impose known constraints that should be obeyed by the learned model. It is well known that enforcing constraints in surrogate models can enhance their generalizability and numerical stability. In this paper, we introduce projected neural differential equations (PNDEs), a new method for con…

    Submitted 31 October, 2024; originally announced October 2024.

    Comments: 17 pages, 6 figures

  4. arXiv:2406.16241

    cs.LG stat.ME

    Position: Benchmarking is Limited in Reinforcement Learning Research

    Authors: Scott M. Jordan, Adam White, Bruno Castro da Silva, Martha White, Philip S. Thomas

    Abstract: Novel reinforcement learning algorithms, or improvements on existing ones, are commonly justified by evaluating their performance on benchmark environments and are compared to an ever-changing set of standard algorithms. However, despite numerous calls for improvements, experimental practices continue to produce misleading or unsupported claims. One reason for the ongoing substandard practices is…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 19 pages, 13 figures, The Forty-first International Conference on Machine Learning (ICML 2024)

  5. arXiv:2401.00568

    stat.ME stat.CO

    Extrapolation of Relative Treatment Effects using Change-point Survival Models

    Authors: Philip Cooney, Arthur White

    Abstract: Introduction: Modelling of relative treatment effects is an important aspect to consider when extrapolating the long-term survival outcomes of treatments. Flexible parametric models offer the ability to accurately model the observed data; however, the extrapolated relative treatment effects and subsequent survival function may lack face validity. Methods: We investigate the ability of change-point…

    Submitted 31 December, 2023; originally announced January 2024.

  6. arXiv:2309.06299

    cs.LG stat.AP stat.ML

    Modeling Supply and Demand in Public Transportation Systems

    Authors: Miranda Bihler, Hala Nelson, Erin Okey, Noe Reyes Rivas, John Webb, Anna White

    Abstract: We propose two neural network based and data-driven supply and demand models to analyze the efficiency, identify service gaps, and determine the significant predictors of demand, in the bus system for the Department of Public Transportation (HDPT) in Harrisonburg City, Virginia, which is the home to James Madison University (JMU). The supply and demand models, one temporal and one spatial, take ma…

    Submitted 20 October, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: 28 pages, 2022 REU project at James Madison University

    MSC Class: 00A69; 62-07; 62P30

  7. arXiv:2306.09739

    cs.LG physics.comp-ph stat.ML

    Stabilized Neural Differential Equations for Learning Dynamics with Explicit Constraints

    Authors: Alistair White, Niki Kilbertus, Maximilian Gelbrecht, Niklas Boers

    Abstract: Many successful methods to learn dynamical systems from data have recently been introduced. However, ensuring that the inferred dynamics preserve known constraints, such as conservation laws or restrictions on the allowed system states, remains challenging. We propose stabilized neural differential equations (SNDEs), a method to enforce arbitrary manifold constraints for neural differential equati…

    Submitted 15 February, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 22 pages, 8 figures. Accepted at NeurIPS 2023

  8. arXiv:2305.10379

    cs.LG cs.NE physics.chem-ph stat.ML

    Active Learning in Symbolic Regression with Physical Constraints

    Authors: Jorge Medina, Andrew D. White

    Abstract: Evolutionary symbolic regression (SR) fits a symbolic equation to data, which gives a concise interpretable model. We explore using SR as a method to propose which data to gather in an active learning setting with physical constraints. SR with active learning proposes which experiments to do next. Active learning is done with query by committee, where the Pareto frontier of equations is the commit…

    Submitted 9 August, 2024; v1 submitted 17 May, 2023; originally announced May 2023.

  9. arXiv:2304.05376

    physics.chem-ph stat.ML

    ChemCrow: Augmenting large-language models with chemistry tools

    Authors: Andres M Bran, Sam Cox, Oliver Schilter, Carlo Baldassari, Andrew D White, Philippe Schwaller

    Abstract: Over the last decades, excellent computational chemistry tools have been developed. Integrating them into a single platform with enhanced accessibility could help them reach their full potential by overcoming steep learning curves. Recently, large-language models (LLMs) have shown strong performance in tasks across domains, but struggle with chemistry-related problems. Moreover, these models lack ac…

    Submitted 2 October, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: Experimental results

  10. arXiv:2304.00085

    math.ST stat.AP

    Pointwise density estimation on metric spaces and applications in seismology

    Authors: Galatia Cleanthous, Athanasios G. Georgiadis, Philip A. White

    Abstract: We are studying the problem of estimating density in a wide range of metric spaces, including the Euclidean space, the sphere, the ball, and various Riemannian manifolds. Our framework involves a metric space with a doubling measure and a self-adjoint operator, whose heat kernel exhibits Gaussian behaviour. We begin by reviewing the construction of kernel density estimators and the related backgro…

    Submitted 31 March, 2023; originally announced April 2023.

  11. arXiv:2302.06391

    stat.ML cs.AI

    Incorporating Expert Opinion on Observable Quantities into Statistical Models -- A General Framework

    Authors: Philip Cooney, Arthur White

    Abstract: This article describes an approach to incorporate expert opinion on observable quantities through the use of a loss function which updates a prior belief as opposed to specifying parameters on the priors. Eliciting information on observable quantities allows experts to provide meaningful information on a quantity familiar to them, in contrast to elicitation on model parameters, which may be subjec…

    Submitted 10 February, 2023; originally announced February 2023.

  12. arXiv:2210.00409

    stat.AP stat.ME

    Joint Multivariate and Functional Modeling for Plant Traits and Reflectances

    Authors: Philip A. White, Michael F. Christensen, Henry Frye, Alan E. Gelfand, John A. Silander Jr

    Abstract: The investigation of leaf-level traits in response to varying environmental conditions has immense importance for understanding plant ecology. Remote sensing technology enables measurement of the reflectance of plants to make inferences about underlying traits along environmental gradients. While much focus has been placed on understanding how reflectance and traits are related at the leaf-level,…

    Submitted 1 October, 2022; originally announced October 2022.

  13. arXiv:2208.03359

    stat.ME math.ST

    Nonseparable Space-Time Stationary Covariance Functions on Networks cross Time

    Authors: Emilio Porcu, Philip A. White, Marc G. Genton

    Abstract: The advent of data science has provided an increasing number of challenges with high data complexity. This paper addresses the challenge of space-time data where the spatial domain is not a planar surface, a sphere, or a linear network, but a generalized network (termed a graph with Euclidean edges). Additionally, data are repeatedly measured over different temporal instants. We provide new classe…

    Submitted 5 August, 2022; originally announced August 2022.

  14. arXiv:2112.03962

    stat.ME

    Change-point Detection for Piecewise Exponential Models

    Authors: Philip Cooney, Arthur White

    Abstract: In decision modelling with time to event data, parametric models are often used to extrapolate the survivor function. One such model is the piecewise exponential model, whereby the hazard function is partitioned into segments, with the hazard constant within each segment and independent between segments; the boundaries of these segments are known as change-points. We present an approach for deter…

    Submitted 7 December, 2021; originally announced December 2021.

    Comments: 21 pages, 3 figures

  15. arXiv:2112.02288

    stat.AP stat.ME

    Utilizing Expert Opinion to inform Extrapolation of Survival Models

    Authors: Philip Cooney, Arthur White

    Abstract: In decision modelling with time to event data, there are a variety of parametric models which could be used to extrapolate the survivor function. Each of these implies a different hazard function and in situations where there is moderate censoring, they can result in quite different extrapolations. Expert opinion on the long-term survival or other quantities could reduce model uncertainty. We pres…

    Submitted 4 December, 2021; originally announced December 2021.

    Comments: 13 pages, 5 figures

  16. arXiv:2109.13891

    stat.ML cs.LG math.PR

    Gaussian Processes to speed up MCMC with automatic exploratory-exploitation effect

    Authors: Alessio Benavoli, Jason Wyse, Arthur White

    Abstract: We present a two-stage Metropolis-Hastings algorithm for sampling probabilistic models, whose log-likelihood is computationally expensive to evaluate, by using a surrogate Gaussian Process (GP) model. The key feature of the approach, and the difference w.r.t. previous works, is the ability to learn the target distribution from scratch (while sampling), and so without the need of pre-training the G…

    Submitted 28 September, 2021; originally announced September 2021.

  17. arXiv:2108.05837

    stat.AP eess.SY stat.CO

    City-wide modeling of Vehicle-to-Grid Economics to Understand Effects of Battery Performance

    Authors: Heta A. Gandhi, Andrew D. White

    Abstract: Vehicle-to-grid (V2G) is a promising approach to solve the problem of grid-level intermittent supply and demand mismatch, caused by renewable energy resources, because it uses the existing resource of electric vehicle (EV) batteries as the energy storage medium. EV battery design together with an impetus on profitability for participating EV owners is pivotal for V2G success. To better underst…

    Submitted 12 August, 2021; originally announced August 2021.

    Comments: 17 Pages, 10 Figures, 1 Table

  18. arXiv:2107.05405

    cs.LG stat.ML

    Learning Expected Emphatic Traces for Deep RL

    Authors: Ray Jiang, Shangtong Zhang, Veronica Chelu, Adam White, Hado van Hasselt

    Abstract: Off-policy sampling and experience replay are key for improving sample efficiency and scaling model-free temporal difference learning methods. When combined with function approximation, such as neural networks, this combination is known as the deadly triad and is potentially unstable. Recently, it has been shown that stability and good performance at scale can be achieved by combining emphatic wei…

    Submitted 12 July, 2021; originally announced July 2021.

  19. arXiv:2106.11779

    cs.LG stat.ML

    Emphatic Algorithms for Deep Reinforcement Learning

    Authors: Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt

    Abstract: Off-policy learning allows us to learn about possible policies of behavior from experience generated by a different behavior policy. Temporal difference (TD) learning algorithms can become unstable when combined with function approximation and off-policy sampling - this is known as the "deadly triad". The emphatic temporal difference (ETD($λ$)) algorithm ensures convergence in the linear case by app…

    Submitted 21 June, 2021; originally announced June 2021.

    Journal ref: Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021

  20. Simulation-Based Inference with Approximately Correct Parameters via Maximum Entropy

    Authors: Rainier Barrett, Mehrad Ansari, Gourab Ghoshal, Andrew D White

    Abstract: Inferring the input parameters of simulators from observations is a crucial challenge with applications from epidemiology to molecular dynamics. Here we show a simple approach in the regime of sparse data and approximately correct models, which is common when trying to use an existing model to infer latent variables with observed data. This approach is based on the principle of maximum entropy (Ma…

    Submitted 23 August, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

    Comments: 16 pages, 4 figures

  21. arXiv:2102.03249

    stat.AP stat.ME

    Spatial Functional Data Modeling of Plant Reflectances

    Authors: Philip A. White, Henry Frye, Michael F. Christensen, Alan E. Gelfand, John A. Silander Jr

    Abstract: Plant reflectance spectra - the profile of light reflected by leaves across different wavelengths - supply the spectral signature for a species at a spatial location to enable estimation of functional and taxonomic diversity for plants. We consider leaf spectra as "responses" to be explained spatially. These spectra/reflectances are functions over a wavelength band that respond to the environment.…

    Submitted 25 March, 2021; v1 submitted 5 February, 2021; originally announced February 2021.

    Comments: 20 pages main manuscript, 20 pages supplement

  22. arXiv:2007.04921

    q-bio.QM cs.LG stat.ML

    Graph Neural Network Based Coarse-Grained Mapping Prediction

    Authors: Zhiheng Li, Geemi P. Wellawatte, Maghesree Chakraborty, Heta A. Gandhi, Chenliang Xu, Andrew D. White

    Abstract: The selection of coarse-grained (CG) mapping operators is a critical step for CG molecular dynamics (MD) simulation. It is still an open question about what is optimal for this choice and there is a need for theory. The current state-of-the-art method is mapping operators manually selected by experts. In this work, we demonstrate an automated approach by viewing this problem as supervised learning…

    Submitted 19 August, 2021; v1 submitted 24 June, 2020; originally announced July 2020.

  23. arXiv:2007.03807

    cs.LG cs.AI stat.ML

    Towards a practical measure of interference for reinforcement learning

    Authors: Vincent Liu, Adam White, Hengshuai Yao, Martha White

    Abstract: Catastrophic interference is common in many network-based learning systems, and many proposals exist for mitigating it. But, before we overcome interference we must understand it better. In this work, we provide a definition of interference for control in reinforcement learning. We systematically evaluate our new measures, by assessing correlation with several measures of learning performance, inc…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: 18 pages

  24. arXiv:2007.00611

    cs.LG cs.AI stat.ML

    Gradient Temporal-Difference Learning with Regularized Corrections

    Authors: Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White

    Abstract: It is still common to use Q-learning and temporal difference (TD) learning, even though they have divergence issues and sound Gradient TD alternatives exist, because divergence seems rare and they typically perform well. However, recent work with large neural network learning systems reveals that instability is more common than previously thought. Practitioners face a difficult dilemma: choose an ea…

    Submitted 17 September, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: Appeared in Proceedings of the 37th International Conference on Machine Learning (ICML2020)

  25. arXiv:2001.05520

    stat.AP stat.ME

    Hierarchical Integrated Spatial Process Modeling of Monotone West Antarctic Snow Density Curves

    Authors: Philip A. White, Durban G. Keeler, Summer Rupper

    Abstract: Snow density estimates below the surface, used with airplane-acquired ice-penetrating radar measurements, give a site-specific history of snow water accumulation. Because it is infeasible to drill snow cores across all of Antarctica to measure snow density and because it is critical to understand how climatic changes are affecting the world's largest freshwater reservoir, we develop methods that e…

    Submitted 19 July, 2021; v1 submitted 15 January, 2020; originally announced January 2020.

    Journal ref: The Annals of Applied Statistics 2021, Vol. 15, No. 2, 556-571

  26. arXiv:1911.09103

    q-bio.BM cs.LG stat.ML

    Investigating Active Learning and Meta-Learning for Iterative Peptide Design

    Authors: Rainier Barrett, Andrew D. White

    Abstract: Often the development of novel functional peptides is not amenable to high throughput or purely computational screening methods. Peptides must be synthesized one at a time in a process that does not generate large amounts of data. One way this method can be improved is by ensuring that each experiment provides the best improvement in both peptide properties and predictive modeling accuracy. Here,…

    Submitted 10 December, 2020; v1 submitted 20 November, 2019; originally announced November 2019.

    Comments: 19 pages, 8 figures, 9 tables

  27. arXiv:1910.06897

    stat.ME stat.AP stat.CO

    Generalized Evolutionary Point Processes: Model Specifications and Model Comparison

    Authors: Philip A. White, Alan E. Gelfand

    Abstract: Generalized evolutionary point processes offer a class of point process models that allows for either excitation or inhibition based upon the history of the process. In this regard, we propose modeling which comprises generalization of the nonlinear Hawkes process. Working within a Bayesian framework, model fitting is implemented through Markov chain Monte Carlo. This entails discussion of computa…

    Submitted 15 October, 2019; originally announced October 2019.

    Journal ref: Methodology and Computing in Applied Probability (2020+)

  28. arXiv:1907.07751

    cs.LG stat.ML

    Meta-descent for Online, Continual Prediction

    Authors: Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris, Adam White, Martha White

    Abstract: This paper investigates different vector step-size adaptation approaches for non-stationary online, continual prediction problems. Vanilla stochastic gradient descent can be considerably improved by scaling the update with a vector of appropriately chosen step-sizes. Many methods, including AdaGrad, RMSProp, and AMSGrad, keep statistics about the learning process to approximate a second order upda…

    Submitted 13 December, 2019; v1 submitted 17 July, 2019; originally announced July 2019.

    Comments: AAAI Conference on Artificial Intelligence 2019. v2: Correction to Baird's counterexample. A bug in the code led to results being reported for AMSGrad in this experiment, when they were actually results for Adam

  29. arXiv:1906.07865

    cs.LG cs.AI stat.ML

    Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study

    Authors: Cam Linke, Nadia M. Ady, Martha White, Thomas Degris, Adam White

    Abstract: Learning about many things can provide numerous benefits to a reinforcement learning system. For example, learning many auxiliary value functions, in addition to optimizing the environmental reward, appears to improve both exploration and representation learning. The question we tackle in this paper is how to sculpt the stream of experience (how to adapt the learning system's behavior) to optimi…

    Submitted 21 August, 2020; v1 submitted 18 June, 2019; originally announced June 2019.

  30. Multivariate Functional Data Modeling with Time-varying Clustering

    Authors: Philip A. White, Alan E. Gelfand

    Abstract: We consider the situation where multivariate functional data has been collected over time at each of a set of sites. Our illustrative setting is bivariate, monitoring ozone and PM$_{10}$ levels as a function of time over the course of a year at a set of monitoring sites. The data we work with is from 24 monitoring sites in Mexico City which record hourly ozone and PM$_{10}$ levels. We use the data…

    Submitted 1 May, 2019; v1 submitted 25 April, 2019; originally announced April 2019.

    Journal ref: TEST (2020+)

  31. arXiv:1904.01191

    cs.LG cs.AI stat.ML

    Planning with Expectation Models

    Authors: Yi Wan, Zaheer Abbas, Adam White, Martha White, Richard S. Sutton

    Abstract: Distribution and sample models are two popular model choices in model-based reinforcement learning (MBRL). However, learning these models can be intractable, particularly when the state and action spaces are large. Expectation models, on the other hand, are relatively easier to learn due to their compactness and have also been widely used for deterministic environments. For stochastic environments…

    Submitted 29 July, 2020; v1 submitted 1 April, 2019; originally announced April 2019.

  32. A quantile-based g-computation approach to addressing the effects of exposure mixtures

    Authors: Alexander P. Keil, Jessie P. Buckley, Katie M. OBrien, Kelly K. Ferguson, Shanshan Zhao, Alexandra J. White

    Abstract: Exposure mixtures frequently occur in data across many domains, particularly in the fields of environmental and nutritional epidemiology. Various strategies have arisen to answer questions about mixtures, including methods such as weighted quantile sum (WQS) regression that estimate a joint effect of the mixture components. We demonstrate a new approach to estimating the joint effects of a mixture:…

    Submitted 11 March, 2020; v1 submitted 11 February, 2019; originally announced February 2019.

    Comments: Main manuscript (3 figures, 4 tables, 7000 words) + appendix

  33. arXiv:1811.02597

    cs.LG cs.AI stat.ML

    Online Off-policy Prediction

    Authors: Sina Ghiassian, Andrew Patterson, Martha White, Richard S. Sutton, Adam White

    Abstract: This paper investigates the problem of online prediction learning, where learning proceeds continuously as the agent interacts with an environment. The predictions made by the agent are contingent on a particular way of behaving, represented as a value function. However, the behavior used to select actions and generate the behavior data might be different from the one used to define the prediction…

    Submitted 6 November, 2018; originally announced November 2018.

    Comments: 68 pages

  34. arXiv:1810.09103

    cs.LG cs.AI stat.ML

    Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement

    Authors: Samuel Neumann, Sungsu Lim, Ajin Joseph, Yangchen Pan, Adam White, Martha White

    Abstract: Many policy gradient methods are variants of Actor-Critic (AC), where a value function (critic) is learned to facilitate updating the parameterized policy (actor). The update to the actor involves a log-likelihood update weighted by the action-values, with the addition of entropy regularization for soft variants. In this work, we explore an alternative update for the actor, based on an extension o…

    Submitted 28 February, 2023; v1 submitted 22 October, 2018; originally announced October 2018.

    Comments: 27 pages, 8 figures

  35. arXiv:1807.06763

    cs.LG cs.AI stat.ML

    General Value Function Networks

    Authors: Matthew Schlegel, Andrew Jacobsen, Zaheer Abbas, Andrew Patterson, Adam White, Martha White

    Abstract: State construction is important for learning in partially observable environments. A general purpose strategy for state construction is to learn the state update using a Recurrent Neural Network (RNN), which updates the internal state using the current internal state and the most recent observation. This internal state provides a summary of the observed sequence, to facilitate accurate predictions…

    Submitted 2 February, 2021; v1 submitted 17 July, 2018; originally announced July 2018.

    Comments: Published in the Journal of Artificial Intelligence Research

    Journal ref: Journal of Artificial Intelligence Research, 70, 497-543 (2021)

  36. arXiv:1807.05600

    stat.AP stat.ME

    Modeling Daily Seasonality of Mexico City Ozone using Nonseparable Covariance Models on Circles Cross Time

    Authors: Philip A. White, Emilio Porcu

    Abstract: Mexico City tracks ground-level ozone levels to assess compliance with national ambient air quality standards and to prevent environmental health emergencies. Ozone levels show distinct daily patterns, within the city, and over the course of the year. To model these data, we use covariance models over space, circular time, and linear time. We review existing models and develop new classes of nonse…

    Submitted 15 July, 2018; originally announced July 2018.

    Journal ref: Environmetrics. 2019;e2558

  37. Non-separable Nearest-Neighbor Gaussian Process Model for Antarctic Surface Mass Balance and Ice Core Site Selection

    Authors: Philip A. White, C. Shane Reese, William F. Christensen, Summer Rupper

    Abstract: Surface mass balance (SMB) is an important factor in the estimation of sea level change, and data are collected to estimate models for prediction of SMB over the Antarctic ice sheets. Using a quality-controlled aggregate dataset of SMB field measurements with significantly more observations than previous analyses, a fully Bayesian nearest-neighbor Gaussian process model is posed to estimate Antarc…

    Submitted 14 July, 2018; originally announced July 2018.

    Journal ref: Environmetrics. 2019;e2579

  38. Pollution State Modeling for Mexico City

    Authors: Philip A. White, Alan E. Gelfand, Eliane R. Rodrigues, Guadalupe Tzintzun

    Abstract: Ground-level ozone and particulate matter pollutants are associated with a variety of health issues and increased mortality. For this reason, Mexican environmental agencies regulate pollutant levels. In addition, Mexico City defines pollution emergencies using thresholds that rely on regional maxima for ozone and particulate matter with diameter less than 10 micrometers ($\text{PM}_{10}$). To pred…

    Submitted 10 July, 2018; originally announced July 2018.

    Journal ref: J. R. Stat. Soc. A., 182(3), 1039-1060 (2019)

  39. arXiv:1805.01608

    q-bio.QM q-bio.MN stat.AP stat.ML

    Causal Queries from Observational Data in Biological Systems via Bayesian Networks: An Empirical Study in Small Networks

    Authors: Alex White, Matthieu Vignes

    Abstract: Biological networks are a very convenient modelling and visualisation tool to discover knowledge from modern high-throughput genomics and postgenomics data sets. Indeed, biological entities are not isolated, but are components of complex multi-level systems. We go one step further and advocate for the consideration of causal representations of the interactions in living systems. We present the caus…

    Submitted 4 May, 2018; originally announced May 2018.

    Comments: This chapter will appear in the forthcoming book "Gene Regulatory Networks: Methods and Protocols", published by Springer Nature

  40. arXiv:1804.06327

    stat.AP q-bio.BM stat.ML

    Classifying Antimicrobial and Multifunctional Peptides with Bayesian Network Models

    Authors: Rainier Barrett, Shaoyi Jiang, Andrew D White

    Abstract: Bayesian network models are finding success in characterizing enzyme-catalyzed reactions, slow conformational changes, predicting enzyme inhibition, and genomics. In this work, we apply them to statistical modeling of peptides by simultaneously identifying amino acid sequence motifs and using a motif-based model to clarify the role motifs may play in antimicrobial activity. We construct models of…

    Submitted 17 April, 2018; originally announced April 2018.

    Comments: 19 pages, 7 figures, 1 table, supporting information included

    MSC Class: 62P10

    Journal ref: Peptide Science, Volume 110, Issue 4, 2018

  41. arXiv:1803.06295

    stat.CO

    High-dimensional Stochastic Inversion via Adjoint Models and Machine Learning

    Authors: Charanraj A. Thimmisetty, Wenju Zhao, Xiao Chen, Charles H. Tong, Joshua A. White

    Abstract: Performing stochastic inversion on a computationally expensive forward simulation model with a high-dimensional uncertain parameter space (e.g. a spatial random field) is computationally prohibitive even with gradient information provided. Moreover, the 'nonlinear' mapping from parameters to observables generally gives rise to non-Gaussian posteriors even with Gaussian priors, thus hampering the u…

    Submitted 16 March, 2018; originally announced March 2018.

  42. arXiv:1707.06903

    stat.ML cs.LG

    A New Family of Near-metrics for Universal Similarity

    Authors: Chu Wang, Iraj Saniee, William S. Kennedy, Chris A. White

    Abstract: We propose a family of near-metrics based on local graph diffusion to capture similarity for a wide class of data sets. These quasi-metametrics, as their names suggest, dispense with one or two standard axioms of metric spaces, specifically distinguishability and symmetry, so that similarity between data points of arbitrary type and form could be measured broadly and effectively. The proposed near…

    Submitted 17 October, 2017; v1 submitted 21 July, 2017; originally announced July 2017.

  43. arXiv:1611.09328  [pdf, other]

    cs.AI cs.LG stat.ML

    Accelerated Gradient Temporal Difference Learning

    Authors: Yangchen Pan, Adam White, Martha White

    Abstract: The family of temporal difference (TD) methods spans a spectrum from computationally frugal linear methods like TD(λ) to data-efficient least squares methods. Least squares methods make the best use of available data by directly computing the TD solution and thus do not require tuning a typically highly sensitive learning rate parameter, but require quadratic computation and storage. Recent algorithmic…

    Submitted 9 March, 2017; v1 submitted 28 November, 2016; originally announced November 2016.

    Comments: AAAI Conference on Artificial Intelligence (AAAI), 2017

  44. Exponential Family Mixed Membership Models for Soft Clustering of Multivariate Data

    Authors: Arthur White, Thomas Brendan Murphy

    Abstract: For several years, model-based clustering methods have successfully tackled many of the challenges presented by data analysts. However, as the scope of data analysis has evolved, some problems may be beyond the standard mixture model framework. One such problem is when observations in a dataset come from overlapping clusters, whereby different clusters will possess similar parameters for multiple…

    Submitted 10 August, 2016; originally announced August 2016.

    Journal ref: White, A. & Murphy, T.B. Adv Data Anal Classif (2016) 10: 521

  45. Modeling Efficiency of Foreign Aid Allocation in Malawi

    Authors: Philip A. White, Candace Berrett, E. Shannon Neeley-Tass, Michael G. Findley

    Abstract: The Open Aid Malawi initiative has collected an unprecedented database that identifies as much location-specific information as possible for each of over 2500 individual foreign aid donations to Malawi since 2003. Ensuring efficient use and distribution of that aid is important to donors and to Malawi citizens. However, because of individual donor goals and difficulty in tracking donor coordinatio…

    Submitted 1 November, 2017; v1 submitted 9 August, 2016; originally announced August 2016.

    Journal ref: The American Statistician, 73(4), 385-399, (2018)

  46. arXiv:1607.00446  [pdf, other]

    cs.AI cs.LG stat.ML

    A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning

    Authors: Martha White, Adam White

    Abstract: One of the main obstacles to broad application of reinforcement learning methods is the parameter sensitivity of our core learning algorithms. In many large-scale applications, online computation and function approximation represent key strategies in scaling up reinforcement learning algorithms. In this setting, we have effective and reasonably well understood algorithms for adapting the learning-…

    Submitted 24 October, 2016; v1 submitted 1 July, 2016; originally announced July 2016.

  47. arXiv:1602.08771  [pdf, other]

    cs.LG cs.AI stat.ML

    Investigating practical linear temporal difference learning

    Authors: Adam White, Martha White

    Abstract: Off-policy reinforcement learning has many applications, including learning from demonstration, learning multiple goal-seeking policies in parallel, and representing predictive knowledge. Recently there has been a proliferation of new policy-evaluation algorithms that fill a longstanding algorithmic void in reinforcement learning: combining robustness to off-policy sampling, function approximatio…

    Submitted 30 March, 2016; v1 submitted 28 February, 2016; originally announced February 2016.

    Comments: Autonomous Agents and Multi-agent Systems, 2016

  48. arXiv:1504.06870  [pdf, ps, other]

    stat.CO

    Improved model-based clustering performance using Bayesian initialization averaging

    Authors: Adrian O'Hagan, Arthur White

    Abstract: The Expectation-Maximization (EM) algorithm is a commonly used method for finding the maximum likelihood estimates of the parameters in a mixture model via coordinate ascent. A serious pitfall with the algorithm is that in the case of multimodal likelihood functions, it can get trapped at a local maximum. This problem often occurs when sub-optimal starting values are used to initialize the algorit…

    Submitted 30 August, 2018; v1 submitted 26 April, 2015; originally announced April 2015.

  49. arXiv:1405.7292  [pdf, ps, other]

    stat.ML cs.LG

    An Easy to Use Repository for Comparing and Improving Machine Learning Algorithm Usage

    Authors: Michael R. Smith, Andrew White, Christophe Giraud-Carrier, Tony Martinez

    Abstract: The results from most machine learning experiments are used for a specific purpose and then discarded. This results in a significant loss of information and requires rerunning experiments to compare learning algorithms. This also requires implementation of another algorithm for comparison, which may not always be correctly implemented. By storing the results from previous experiments, machine learn…

    Submitted 5 June, 2014; v1 submitted 28 May, 2014; originally announced May 2014.

    Comments: 7 pages, 1 figure, 6 tables

  50. arXiv:1404.0221  [pdf, other]

    stat.CO stat.ME

    Mixed-Membership of Experts Stochastic Blockmodel

    Authors: Arthur White, Thomas Brendan Murphy

    Abstract: Social network analysis is the study of how links between a set of actors are formed. Typically, it is believed that links are formed in a structured manner, which may be due to, for example, political or material incentives, and which often may not be directly observable. The stochastic blockmodel represents this structure using latent groups which exhibit different connective properties, so that…

    Submitted 1 April, 2014; originally announced April 2014.

    Comments: 32 pages, 8 figures

    Journal ref: Network Science, 4, pp 48-80 (2016)