Showing 1–50 of 50 results for author: White, M

Searching in archive stat.
  1. arXiv:2409.18464 [pdf, other]

    hep-ph stat.ML

    A comparison of Bayesian sampling algorithms for high-dimensional particle physics and cosmology applications

    Authors: Joshua Albert, Csaba Balazs, Andrew Fowlie, Will Handley, Nicholas Hunt-Smith, Roberto Ruiz de Austri, Martin White

    Abstract: For several decades now, Bayesian inference techniques have been applied to theories of particle physics, cosmology and astrophysics to obtain the probability density functions of their free parameters. In this study, we review and compare a wide range of Markov Chain Monte Carlo (MCMC) and nested sampling techniques to determine their relative efficacy on functions that resemble those encountered…

    Submitted 24 November, 2024; v1 submitted 27 September, 2024; originally announced September 2024.

    Comments: 45 pages, 8 figures, 20 tables

  2. arXiv:2406.16241 [pdf, other]

    cs.LG stat.ME

    Position: Benchmarking is Limited in Reinforcement Learning Research

    Authors: Scott M. Jordan, Adam White, Bruno Castro da Silva, Martha White, Philip S. Thomas

    Abstract: Novel reinforcement learning algorithms, or improvements on existing ones, are commonly justified by evaluating their performance on benchmark environments and are compared to an ever-changing set of standard algorithms. However, despite numerous calls for improvements, experimental practices continue to produce misleading or unsupported claims. One reason for the ongoing substandard practices is…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 19 pages, 13 figures, The Forty-first International Conference on Machine Learning (ICML 2024)

  3. arXiv:2402.13425 [pdf, other]

    cs.LG cs.AI stat.ML

    Investigating the Histogram Loss in Regression

    Authors: Ehsan Imani, Kai Luedemann, Sam Scholnick-Hughes, Esraa Elelimy, Martha White

    Abstract: It is becoming increasingly common in regression to train neural networks that model the entire distribution even if only the mean is required for prediction. This additional modeling often comes with performance gain and the reasons behind the improvement are not fully known. This paper investigates a recent approach to regression, the Histogram Loss, which involves learning the conditional distr…

    Submitted 19 October, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: 52 pages

  4. arXiv:2312.02953 [pdf]

    stat.AP q-bio.QM

    Longitudinal Assessment of Seasonal Impacts and Depression Associations on Circadian Rhythm Using Multimodal Wearable Sensing

    Authors: Yuezhou Zhang, Amos A Folarin, Shaoxiong Sun, Nicholas Cummins, Yatharth Ranjan, Zulqarnain Rashid, Callum Stewart, Pauline Conde, Heet Sankesara, Petroula Laiou, Faith Matcham, Katie M White, Carolin Oetzmann, Femke Lamers, Sara Siddi, Sara Simblett, Srinivasan Vairavan, Inez Myin-Germeys, David C. Mohr, Til Wykes, Josep Maria Haro, Peter Annas, Brenda WJH Penninx, Vaibhav A Narayan, Matthew Hotopf, et al. (2 additional authors not shown)

    Abstract: Objective: This study aimed to explore the associations between depression severity and wearable-measured circadian rhythms, accounting for seasonal impacts and quantifying seasonal changes in circadian rhythms. Materials and Methods: Data used in this study came from a large longitudinal mobile health study. Depression severity (measured biweekly using the 8-item Patient Health Questionnaire [PHQ-…

    Submitted 5 December, 2023; originally announced December 2023.

  5. arXiv:2309.01454 [pdf, other]

    hep-ph stat.ML

    Accelerating Markov Chain Monte Carlo sampling with diffusion models

    Authors: N. T. Hunt-Smith, W. Melnitchouk, F. Ringer, N. Sato, A. W Thomas, M. J. White

    Abstract: Global fits of physics models require efficient methods for exploring high-dimensional and/or multimodal posterior functions. We introduce a novel method for accelerating Markov Chain Monte Carlo (MCMC) sampling by pairing a Metropolis-Hastings algorithm with a diffusion model that can draw global samples with the aim of approximating the posterior. We briefly review diffusion models in the contex…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: 21 pages, 8 figures, 1 table

  6. Discovering a change point and piecewise linear structure in a time series of organoid networks via the iso-mirror

    Authors: Tianyi Chen, Youngser Park, Ali Saad-Eldin, Zachary Lubberts, Avanti Athreya, Benjamin D. Pedigo, Joshua T. Vogelstein, Francesca Puppo, Gabriel A. Silva, Alysson R. Muotri, Weiwei Yang, Christopher M. White, Carey E. Priebe

    Abstract: Recent advancements have been made in the development of cell-based in-vitro neuronal networks, or organoids. In order to better understand the network structure of these organoids, a super-selective algorithm has been proposed for inferring the effective connectivity networks from multi-electrode array data. In this paper, we apply a novel statistical method called spectral mirror estimation to t…

    Submitted 12 April, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Journal ref: Appl Netw Sci 8, 75 (2023)

  7. arXiv:2203.11992 [pdf, other]

    cs.LG stat.ML

    Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum

    Authors: Kirby Banman, Liam Peet-Pare, Nidhi Hegde, Alona Fyshe, Martha White

    Abstract: Most convergence guarantees for stochastic gradient descent with momentum (SGDm) rely on iid sampling. Yet, SGDm is often used outside this regime, in settings with temporally correlated input samples such as continual learning and reinforcement learning. Existing work has shown that SGDm with a decaying step-size can converge under Markovian temporal correlation. In this work, we show that SGDm u…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: In International Conference on Learning Representations. 2021

  8. arXiv:2108.13637 [pdf, other]

    cs.LG cs.AI q-bio.NC stat.ML

    When are Deep Networks really better than Decision Forests at small sample sizes, and how?

    Authors: Haoyin Xu, Kaleab A. Kinfu, Will LeVine, Sambit Panda, Jayanta Dey, Michael Ainsworth, Yu-Chung Peng, Madi Kusmanov, Florian Engert, Christopher M. White, Joshua T. Vogelstein, Carey E. Priebe

    Abstract: Deep networks and decision forests (such as random forests and gradient boosted trees) are the leading machine learning methods for structured and tabular data, respectively. Many papers have empirically compared large numbers of classifiers on one or two different domains (e.g., on 100 different tabular data settings). However, a careful conceptual and empirical comparison of these two strategies…

    Submitted 2 November, 2021; v1 submitted 31 August, 2021; originally announced August 2021.

  9. arXiv:2106.12621 [pdf, other]

    cs.LG cs.IR stat.ME

    Leveraging semantically similar queries for ranking via combining representations

    Authors: Hayden S. Helm, Marah Abdin, Benjamin D. Pedigo, Shweti Mahajan, Vince Lyzinski, Youngser Park, Amitabh Basu, Piali Choudhury, Christopher M. White, Weiwei Yang, Carey E. Priebe

    Abstract: In modern ranking problems, different and disparate representations of the items to be ranked are often available. It is sensible, then, to try to combine these representations to improve ranking. Indeed, learning to rank via combining representations is both principled and practical for learning a ranking function for a particular query. In extremely data-scarce settings, however, the amount of l…

    Submitted 23 June, 2021; originally announced June 2021.

  10. arXiv:2105.14027 [pdf, other]

    hep-ph hep-ex physics.data-an stat.ML

    The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider

    Authors: T. Aarrestad, M. van Beekveld, M. Bona, A. Boveia, S. Caron, J. Davies, A. De Simone, C. Doglioni, J. M. Duarte, A. Farbin, H. Gupta, L. Hendriks, L. Heinrich, J. Howarth, P. Jawahar, A. Jueid, J. Lastow, A. Leinweber, J. Mamuzic, E. Merényi, A. Morandini, P. Moskvitina, C. Nellist, J. Ngadiuba, B. Ostdiek, et al. (14 additional authors not shown)

    Abstract: We describe the outcome of a data challenge conducted as part of the Dark Machines Initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims at detecting signals of new physics at the LHC using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We defin…

    Submitted 9 December, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

    Comments: v1: 54 pages, 24 figures. v2: 56 pages, citations added, extend discussion of look-elsewhere-effect, results unchanged; v3: minor typos and updated references

    Journal ref: SciPost Phys. 12, 043 (2022)

  11. arXiv:2011.06557 [pdf, other]

    stat.ML cs.LG stat.ME

    A partition-based similarity for classification distributions

    Authors: Hayden S. Helm, Ronak D. Mehta, Brandon Duderstadt, Weiwei Yang, Christopher M. White, Ali Geisa, Joshua T. Vogelstein, Carey E. Priebe

    Abstract: Herein we define a measure of similarity between classification distributions that is both principled from the perspective of statistical pattern recognition and useful from the perspective of machine learning practitioners. In particular, we propose a novel similarity on classification distributions, dubbed task similarity, that quantifies how an optimally-transformed optimal representation for a…

    Submitted 12 November, 2020; originally announced November 2020.

  12. arXiv:2007.06414 [pdf, other]

    q-bio.PE stat.AP

    Epidemic modelling of bovine tuberculosis in cattle herds and badgers in Ireland

    Authors: L. M. White, G. E. Kelly

    Abstract: Bovine tuberculosis, a disease that affects cattle and badgers in Ireland, was studied via stochastic epidemic modeling using incidence data from the Four Area Project (Griffin et al., 2005). The Four Area Project was a large scale field trial conducted in four diverse farming regions of Ireland over a five-year period (1997-2002) to evaluate the impact of badger culling on bovine tuberculosis inc…

    Submitted 13 July, 2020; originally announced July 2020.

    Comments: 32 pages, 2 figures

  13. arXiv:2007.03807 [pdf, other]

    cs.LG cs.AI stat.ML

    Towards a practical measure of interference for reinforcement learning

    Authors: Vincent Liu, Adam White, Hengshuai Yao, Martha White

    Abstract: Catastrophic interference is common in many network-based learning systems, and many proposals exist for mitigating it. But, before we overcome interference we must understand it better. In this work, we provide a definition of interference for control in reinforcement learning. We systematically evaluate our new measures, by assessing correlation with several measures of learning performance, inc…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: 18 pages

  14. arXiv:2007.02418 [pdf, other]

    cs.LG cs.AI stat.ML

    Selective Dyna-style Planning Under Limited Model Capacity

    Authors: Zaheer Abbas, Samuel Sokota, Erin J. Talvitie, Martha White

    Abstract: In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but…

    Submitted 7 March, 2021; v1 submitted 5 July, 2020; originally announced July 2020.

    Comments: Accepted at ICML 2020

  15. arXiv:2007.00611 [pdf, other]

    cs.LG cs.AI stat.ML

    Gradient Temporal-Difference Learning with Regularized Corrections

    Authors: Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White

    Abstract: It is still common to use Q-learning and temporal difference (TD) learning -- even though they have divergence issues and sound Gradient TD alternatives exist -- because divergence seems rare and they typically perform well. However, recent work with large neural network learning systems reveals that instability is more common than previously thought. Practitioners face a difficult dilemma: choose an ea…

    Submitted 17 September, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: Appeared in Proceedings of the 37th International Conference on Machine Learning (ICML2020)

  16. arXiv:2006.07461 [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Causal Models Online

    Authors: Khurram Javed, Martha White, Yoshua Bengio

    Abstract: Predictive models -- learned from observational data not covering the complete data distribution -- can rely on spurious correlations in the data for making predictions. These correlations make the models brittle and hinder generalization. One solution for achieving strong generalization is to incorporate causal structures in the models; such structures constrain learning by ignoring correlations…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: Spurious features, causal models, online learning, random search, non-iid

  17. arXiv:2006.04363 [pdf, other]

    cs.LG cs.AI stat.ML

    Hallucinating Value: A Pitfall of Dyna-style Planning with Imperfect Environment Models

    Authors: Taher Jafferjee, Ehsan Imani, Erin Talvitie, Martha White, Michael Bowling

    Abstract: Dyna-style reinforcement learning (RL) agents improve sample efficiency over model-free RL agents by updating the value function with simulated experience generated by an environment model. However, it is often difficult to learn accurate models of environment dynamics, and even small errors may result in failure of Dyna agents. In this paper, we investigate one type of model error: hallucinated s…

    Submitted 8 June, 2020; originally announced June 2020.

    Comments: 9 pages, 7 figures

  18. arXiv:2005.08158 [pdf, other]

    cs.LG stat.ML

    Optimizing for the Future in Non-Stationary MDPs

    Authors: Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas

    Abstract: Most reinforcement learning methods are based upon the key assumption that the transition dynamics and reward functions are fixed, that is, the underlying Markov decision process is stationary. However, in many real-world applications, this assumption is violated, and using existing algorithms may result in a performance lag. To proactively search for a good future policy, we present a policy grad…

    Submitted 21 September, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

    Comments: Thirty-seventh International Conference on Machine Learning (ICML 2020)

  19. arXiv:2004.12908 [pdf, other]

    cs.AI cs.LG stat.ML

    Simple Lifelong Learning Machines

    Authors: Jayanta Dey, Joshua T. Vogelstein, Hayden S. Helm, Will LeVine, Ronak D. Mehta, Tyler M. Tomita, Haoyin Xu, Ali Geisa, Qingyang Wang, Gido M. van de Ven, Chenyu Gao, Bryan Tower, Jonathan Larson, Christopher M. White, Carey E. Priebe

    Abstract: In lifelong learning, data are used to improve performance not only on the present task, but also on past and future (unencountered) tasks. While typical transfer learning algorithms can improve performance on future tasks, their performance on prior tasks degrades upon learning new tasks (called forgetting). Many recent approaches for continual or lifelong learning have attempted to maintain perf…

    Submitted 20 April, 2025; v1 submitted 27 April, 2020; originally announced April 2020.

  20. arXiv:2002.06195 [pdf, other]

    stat.ML cs.LG

    An implicit function learning approach for parametric modal regression

    Authors: Yangchen Pan, Ehsan Imani, Martha White, Amir-massoud Farahmand

    Abstract: For multi-valued functions---such as when the conditional distribution on targets given the inputs is multi-modal---standard regression approaches are not always desirable because they provide the conditional mean. Modal regression algorithms address this issue by instead finding the conditional mode(s). Most, however, are nonparametric approaches and so can be difficult to scale. Further, paramet…

    Submitted 29 October, 2020; v1 submitted 13 February, 2020; originally announced February 2020.

    Comments: Accepted to NeurIPS 2020

  21. arXiv:1910.01705 [pdf, other]

    cs.LG stat.ML

    Is Fast Adaptation All You Need?

    Authors: Khurram Javed, Hengshuai Yao, Martha White

    Abstract: Gradient-based meta-learning has proven to be highly effective at learning model initializations, representations, and update rules that allow fast adaptation from a few samples. The core idea behind these approaches is to use fast adaptation and generalization -- two second-order metrics -- as training signals on a meta-training dataset. However, little attention has been given to other possible…

    Submitted 3 October, 2019; originally announced October 2019.

    Comments: Meta Learning Workshop, NeurIPS 2019, 2 figures, MRCL, MAML

  22. arXiv:1909.01936 [pdf, other]

    stat.AP cs.LG econ.EM stat.ML

    State Drug Policy Effectiveness: Comparative Policy Analysis of Drug Overdose Mortality

    Authors: Jarrod Olson, Po-Hsu Allen Chen, Marissa White, Nicole Brennan, Ning Gong

    Abstract: Opioid overdose rates have reached an epidemic level and state-level policy innovations have followed suit in an effort to prevent overdose deaths. State-level drug law is a set of policies that may reinforce or undermine each other, and analysts have a limited set of tools for handling the policy collinearity using statistical methods. This paper uses a machine learning method called hierarchical…

    Submitted 5 October, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    MSC Class: 62H12

  23. arXiv:1907.07751 [pdf, other]

    cs.LG stat.ML

    Meta-descent for Online, Continual Prediction

    Authors: Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris, Adam White, Martha White

    Abstract: This paper investigates different vector step-size adaptation approaches for non-stationary online, continual prediction problems. Vanilla stochastic gradient descent can be considerably improved by scaling the update with a vector of appropriately chosen step-sizes. Many methods, including AdaGrad, RMSProp, and AMSGrad, keep statistics about the learning process to approximate a second order upda…

    Submitted 13 December, 2019; v1 submitted 17 July, 2019; originally announced July 2019.

    Comments: AAAI Conference on Artificial Intelligence 2019. v2: Correction to Baird's counterexample. A bug in the code led to results being reported for AMSGrad in this experiment, when they were actually results for Adam

  24. arXiv:1906.07865 [pdf, other]

    cs.LG cs.AI stat.ML

    Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study

    Authors: Cam Linke, Nadia M. Ady, Martha White, Thomas Degris, Adam White

    Abstract: Learning about many things can provide numerous benefits to a reinforcement learning system. For example, learning many auxiliary value functions, in addition to optimizing the environmental reward, appears to improve both exploration and representation learning. The question we tackle in this paper is how to sculpt the stream of experience---how to adapt the learning system's behavior---to optimi…

    Submitted 21 August, 2020; v1 submitted 18 June, 2019; originally announced June 2019.

  25. arXiv:1906.07791 [pdf, other]

    cs.LG cs.AI stat.ML

    Hill Climbing on Value Estimates for Search-control in Dyna

    Authors: Yangchen Pan, Hengshuai Yao, Amir-massoud Farahmand, Martha White

    Abstract: Dyna is an architecture for model-based reinforcement learning (RL), where simulated experience from a model is used to update policies or value functions. A key component of Dyna is search-control, the mechanism to generate the state and action from which the agent queries the model, which remains largely unexplored. In this work, we propose to generate such states by using the trajectory obtaine…

    Submitted 4 July, 2019; v1 submitted 18 June, 2019; originally announced June 2019.

    Comments: IJCAI 2019

  26. arXiv:1906.04328 [pdf, other]

    cs.LG cs.AI stat.ML

    Importance Resampling for Off-policy Prediction

    Authors: Matthew Schlegel, Wesley Chung, Daniel Graves, Jian Qian, Martha White

    Abstract: Importance sampling (IS) is a common reweighting strategy for off-policy prediction in reinforcement learning. While it is consistent and unbiased, it can result in high variance updates to the weights for the value function. In this work, we explore a resampling strategy as an alternative to reweighting. We propose Importance Resampling (IR) for off-policy prediction, which resamples experience f…

    Submitted 13 November, 2019; v1 submitted 10 June, 2019; originally announced June 2019.

    Comments: Recently published in NeurIPS 2019

  27. arXiv:1905.12588 [pdf, other]

    cs.LG cs.AI stat.ML

    Meta-Learning Representations for Continual Learning

    Authors: Khurram Javed, Martha White

    Abstract: A continual learning agent should be able to build on top of existing knowledge to learn on new data quickly while minimizing forgetting. Current intelligent systems based on neural network function approximators arguably do the opposite---they are highly prone to forgetting and rarely trained to facilitate future learning. One reason for this poor behavior is that they learn from a representation…

    Submitted 30 October, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: Accepted at NeurIPS19, 15 pages, 10 figures, open-source, representation learning, continual learning, online learning

  28. arXiv:1904.01191 [pdf, other]

    cs.LG cs.AI stat.ML

    Planning with Expectation Models

    Authors: Yi Wan, Zaheer Abbas, Adam White, Martha White, Richard S. Sutton

    Abstract: Distribution and sample models are two popular model choices in model-based reinforcement learning (MBRL). However, learning these models can be intractable, particularly when the state and action spaces are large. Expectation models, on the other hand, are relatively easier to learn due to their compactness and have also been widely used for deterministic environments. For stochastic environments…

    Submitted 29 July, 2020; v1 submitted 1 April, 2019; originally announced April 2019.

  29. arXiv:1812.00914 [pdf, other]

    cs.LG cs.AI stat.ML

    Accelerating Large Scale Knowledge Distillation via Dynamic Importance Sampling

    Authors: Minghan Li, Tanli Zuo, Ruicheng Li, Martha White, Weishi Zheng

    Abstract: Knowledge distillation is an effective technique that transfers knowledge from a large teacher model to a shallow student. However, just like massive classification, large scale knowledge distillation also imposes heavy computational costs on training models of deep neural networks, as the softmax activations at the last layer involve computing probabilities over numerous classes. In this work, we…

    Submitted 3 December, 2018; originally announced December 2018.

  30. arXiv:1811.09013 [pdf, other]

    cs.LG stat.ML

    An Off-policy Policy Gradient Theorem Using Emphatic Weightings

    Authors: Ehsan Imani, Eric Graves, Martha White

    Abstract: Policy gradient methods are widely used for control in reinforcement learning, particularly for the continuous action setting. There have been a host of theoretically sound algorithms proposed for the on-policy setting, due to the existence of the policy gradient theorem which provides a simplified form for the gradient. In off-policy learning, however, where the behaviour policy is not necessaril…

    Submitted 20 June, 2019; v1 submitted 21 November, 2018; originally announced November 2018.

    Comments: Updated to final NeurIPS version

  31. arXiv:1811.06626 [pdf, other]

    cs.LG cs.AI stat.ML

    The Utility of Sparse Representations for Control in Reinforcement Learning

    Authors: Vincent Liu, Raksha Kumaraswamy, Lei Le, Martha White

    Abstract: We investigate sparse representations for control in reinforcement learning. While these representations are widely used in computer vision, their prevalence in reinforcement learning is limited to sparse coding where extracting representations for new data can be computationally intensive. Here, we begin by demonstrating that learning a control policy incrementally with a representation from a st…

    Submitted 15 November, 2018; originally announced November 2018.

    Comments: Association for the Advancement of Artificial Intelligence 2019

  32. arXiv:1811.02597 [pdf, other]

    cs.LG cs.AI stat.ML

    Online Off-policy Prediction

    Authors: Sina Ghiassian, Andrew Patterson, Martha White, Richard S. Sutton, Adam White

    Abstract: This paper investigates the problem of online prediction learning, where learning proceeds continuously as the agent interacts with an environment. The predictions made by the agent are contingent on a particular way of behaving, represented as a value function. However, the behavior used to select actions and generate the behavior data might be different from the one used to define the prediction…

    Submitted 6 November, 2018; originally announced November 2018.

    Comments: 68 pages

  33. arXiv:1810.09103 [pdf, other]

    cs.LG cs.AI stat.ML

    Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement

    Authors: Samuel Neumann, Sungsu Lim, Ajin Joseph, Yangchen Pan, Adam White, Martha White

    Abstract: Many policy gradient methods are variants of Actor-Critic (AC), where a value function (critic) is learned to facilitate updating the parameterized policy (actor). The update to the actor involves a log-likelihood update weighted by the action-values, with the addition of entropy regularization for soft variants. In this work, we explore an alternative update for the actor, based on an extension o…

    Submitted 28 February, 2023; v1 submitted 22 October, 2018; originally announced October 2018.

    Comments: 27 pages, 8 figures

  34. arXiv:1808.09127 [pdf, other]

    stat.ML cs.LG

    High-confidence error estimates for learned value functions

    Authors: Touqir Sajed, Wesley Chung, Martha White

    Abstract: Estimating the value function for a fixed policy is a fundamental problem in reinforcement learning. Policy evaluation algorithms---to estimate value functions---continue to be developed, to improve convergence rates, improve stability and handle variability, particularly for off-policy learning. To understand the properties of these algorithms, the experimenter needs high-confidence estimates of…

    Submitted 28 August, 2018; originally announced August 2018.

    Comments: Presented at (UAI) Uncertainty in Artificial Intelligence 2018

  35. arXiv:1807.06763 [pdf, other]

    cs.LG cs.AI stat.ML

    General Value Function Networks

    Authors: Matthew Schlegel, Andrew Jacobsen, Zaheer Abbas, Andrew Patterson, Adam White, Martha White

    Abstract: State construction is important for learning in partially observable environments. A general purpose strategy for state construction is to learn the state update using a Recurrent Neural Network (RNN), which updates the internal state using the current internal state and the most recent observation. This internal state provides a summary of the observed sequence, to facilitate accurate predictions…

    Submitted 2 February, 2021; v1 submitted 17 July, 2018; originally announced July 2018.

    Comments: Published in the Journal of Artificial Intelligence Research

    Journal ref: Journal of Artificial Intelligence Research, 70, 497-543 (2021)

  36. arXiv:1806.06931 [pdf, other]

    cs.LG cs.AI stat.ML

    Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control

    Authors: Yangchen Pan, Amir-massoud Farahmand, Martha White, Saleh Nabi, Piyush Grover, Daniel Nikovski

    Abstract: Recent work has shown that reinforcement learning (RL) is a promising approach to control dynamical systems described by partial differential equations (PDE). This paper shows how to use RL to tackle more general PDE control problems that have continuous high-dimensional action spaces with spatial relationship among action dimensions. In particular, we propose the concept of action descriptors, wh…

    Submitted 12 June, 2018; originally announced June 2018.

    Comments: ICML2018

  37. arXiv:1806.04613 [pdf, other]

    stat.ML cs.LG

    Improving Regression Performance with Distributional Losses

    Authors: Ehsan Imani, Martha White

    Abstract: There is growing evidence that converting targets to soft targets in supervised learning can provide considerable gains in performance. Much of this work has considered classification, converting hard zero-one values to soft labels---such as by adding label noise, incorporating label ambiguity or using distillation. In parallel, there is some evidence from a regression setting in reinforcement lea…

    Submitted 12 June, 2018; originally announced June 2018.

    Comments: 12 pages, 4 figures. To appear in Proceedings of the 35th International Conference on Machine Learning, 2018

  38. arXiv:1707.08316 [pdf, other]

    cs.AI cs.LG stat.ML

    Learning Sparse Representations in Reinforcement Learning with Sparse Coding

    Authors: Lei Le, Raksha Kumaraswamy, Martha White

    Abstract: A variety of representation learning approaches have been investigated for reinforcement learning; much less attention, however, has been given to investigating the utility of sparse coding. Outside of reinforcement learning, sparse coding representations have been widely used, with non-convex objectives that result in discriminative representations. In this work, we develop a supervised sparse co…

    Submitted 26 July, 2017; originally announced July 2017.

    Comments: 6(+1) pages, 2 figures, International Joint Conference on Artificial Intelligence 2017

  39. arXiv:1702.00518 [pdf, other]

    stat.ML cs.LG

    Recovering True Classifier Performance in Positive-Unlabeled Learning

    Authors: Shantanu Jain, Martha White, Predrag Radivojac

    Abstract: A common approach in positive-unlabeled learning is to train a classification model between labeled and unlabeled data. This strategy is in fact known to give an optimal classifier under mild conditions; however, it results in biased empirical estimates of the classifier performance. In this work, we show that the typically used performance measures such as the receiver operating characteristic cu…

    Submitted 1 February, 2017; originally announced February 2017.

    Comments: Full paper with supplement

  40. arXiv:1611.09328 [pdf, other]

    cs.AI cs.LG stat.ML

    Accelerated Gradient Temporal Difference Learning

    Authors: Yangchen Pan, Adam White, Martha White

    Abstract: The family of temporal difference (TD) methods span a spectrum from computationally frugal linear methods like TD(λ) to data efficient least squares methods. Least square methods make the best use of available data directly computing the TD solution and thus do not require tuning a typically highly sensitive learning rate parameter, but require quadratic computation and storage. Recent algorithmic…

    Submitted 9 March, 2017; v1 submitted 28 November, 2016; originally announced November 2016.

    Comments: AAAI Conference on Artificial Intelligence (AAAI), 2017

  41. arXiv:1607.00446  [pdf, other]

    cs.AI cs.LG stat.ML

    A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning

    Authors: Martha White, Adam White

    Abstract: One of the main obstacles to broad application of reinforcement learning methods is the parameter sensitivity of our core learning algorithms. In many large-scale applications, online computation and function approximation represent key strategies in scaling up reinforcement learning algorithms. In this setting, we have effective and reasonably well understood algorithms for adapting the learning-…

    Submitted 24 October, 2016; v1 submitted 1 July, 2016; originally announced July 2016.

  42. arXiv:1606.08561  [pdf, other]

    stat.ML cs.LG

    Estimating the class prior and posterior from noisy positives and unlabeled data

    Authors: Shantanu Jain, Martha White, Predrag Radivojac

    Abstract: We develop a classification algorithm for estimating posterior distributions from positive-unlabeled data, that is robust to noise in the positive labels and effective for high-dimensional data. In recent years, several algorithms have been proposed to learn from positive-unlabeled data; however, many of these contributions remain theoretical, performing poorly on real high-dimensional data that i…

    Submitted 31 January, 2017; v1 submitted 28 June, 2016; originally announced June 2016.

    Comments: Fixed a typo in the MSGMM update equations in the appendix. Other minor changes

  43. arXiv:1604.04942  [pdf, other]

    stat.ML cs.LG

    Identifying global optimality for dictionary learning

    Authors: Lei Le, Martha White

    Abstract: Learning new representations of input observations in machine learning is often tackled using a factorization of the data. For many such problems, including sparse coding and matrix completion, learning these factorizations can be difficult, in terms of efficiency and to guarantee that the solution is a global minimum. Recently, a general class of objectives has been introduced, which we term indu…

    Submitted 6 August, 2017; v1 submitted 17 April, 2016; originally announced April 2016.

    Comments: Updates to previous version include a small modification to Proposition 2, to only use normed regularizers, and a modification to the main theorem (previously Theorem 13) to focus on the overcomplete, full rank setting and to better characterize non-differentiable induced regularizers. The theory has been significantly modified since version 1

  44. arXiv:1602.08771  [pdf, other]

    cs.LG cs.AI stat.ML

    Investigating practical linear temporal difference learning

    Authors: Adam White, Martha White

    Abstract: Off-policy reinforcement learning has many applications, including learning from demonstration, learning multiple goal-seeking policies in parallel, and representing predictive knowledge. Recently there has been a proliferation of new policy-evaluation algorithms that fill a longstanding algorithmic void in reinforcement learning: combining robustness to off-policy sampling, function approximatio…

    Submitted 30 March, 2016; v1 submitted 28 February, 2016; originally announced February 2016.

    Comments: Autonomous Agents and Multi-agent Systems, 2016

  45. arXiv:1601.01944  [pdf, other]

    stat.ML cs.LG

    Nonparametric semi-supervised learning of class proportions

    Authors: Shantanu Jain, Martha White, Michael W. Trosset, Predrag Radivojac

    Abstract: The problem of developing binary classifiers from positive and unlabeled data is often encountered in machine learning. A common requirement in this setting is to approximate posterior probabilities of positive and negative classes for a previously unseen data point. This problem can be decomposed into two steps: (i) the development of accurate predictors that discriminate between positive and unl…

    Submitted 8 January, 2016; originally announced January 2016.

  46. arXiv:1512.01241  [pdf, other]

    astro-ph.IM astro-ph.CO stat.ME

    Estimating sparse precision matrices

    Authors: Nikhil Padmanabhan, Martin White, Harrison H. Zhou, Ross O'Connell

    Abstract: We apply a method recently introduced to the statistical literature to directly estimate the precision matrix from an ensemble of samples drawn from a corresponding Gaussian distribution. Motivated by the observation that cosmological precision matrices are often approximately sparse, the method allows one to exploit this sparsity of the precision matrix to more quickly converge to an asymptotic 1…

    Submitted 3 December, 2015; originally announced December 2015.

    Comments: 11 pages, 14 figures, submitted to MNRAS

  47. arXiv:1401.1640  [pdf, ps, other]

    stat.AP q-bio.QM

    Quantifying intrinsic and extrinsic noise in gene transcription using the linear noise approximation: An application to single cell data

    Authors: Bärbel Finkenstädt, Dan J. Woodcock, Michal Komorowski, Claire V. Harper, Julian R. E. Davis, Mike R. H. White, David A. Rand

    Abstract: A central challenge in computational modeling of dynamic biological systems is parameter inference from experimental time course measurements. However, one would not only like to infer kinetic parameters but also study their variability from cell to cell. Here we focus on the case where single-cell fluorescent protein imaging time series data are available for a population of cells. Based on van K…

    Submitted 8 January, 2014; originally announced January 2014.

    Comments: Published at http://dx.doi.org/10.1214/13-AOAS669 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS669

    Journal ref: Annals of Applied Statistics 2013, Vol. 7, No. 4, 1960-1982

  48. arXiv:1211.0587  [pdf, other]

    cs.IT cs.LG stat.ML

    Partition Tree Weighting

    Authors: Joel Veness, Martha White, Michael Bowling, András György

    Abstract: This paper introduces the Partition Tree Weighting technique, an efficient meta-algorithm for piecewise stationary sources. The technique works by performing Bayesian model averaging over a large class of possible partitions of the data into locally stationary segments. It uses a prior, closely related to the Context Tree Weighting technique of Willems, that is well suited to data compression appl…

    Submitted 21 November, 2012; v1 submitted 2 November, 2012; originally announced November 2012.

  49. arXiv:1208.4066  [pdf, other]

    q-bio.GN stat.AP

    Reverse Engineering Gene Interaction Networks Using the Phi-Mixing Coefficient

    Authors: Nitin Kumar Singh, M. Eren Ahsen, Shiva Mankala, Hyun-Seok Kim, Michael A. White, M. Vidyasagar

    Abstract: Constructing gene interaction networks (GINs) from high-throughput gene expression data is an important and challenging problem in systems biology. Existing algorithms produce networks that either have undirected and unweighted edges, or else are constrained to contain no cycles, both of which are biologically unrealistic. In the present paper we propose a new algorithm, based on a concept from pr…

    Submitted 12 March, 2016; v1 submitted 20 August, 2012; originally announced August 2012.

    Comments: 19 pages, 6 figures

    MSC Class: 62P10; 92B15

  50. arXiv:0911.3944  [pdf, other]

    stat.ML cs.CL cs.LG stat.AP

    Likelihood-based semi-supervised model selection with applications to speech processing

    Authors: Christopher M. White, Sanjeev P. Khudanpur, Patrick J. Wolfe

    Abstract: In conventional supervised pattern recognition tasks, model selection is typically accomplished by minimizing the classification error rate on a set of so-called development data, subject to ground-truth labeling by human experts or some other means. In the context of speech processing systems and other large-scale practical applications, however, such labeled development data are typically cost…

    Submitted 19 November, 2009; originally announced November 2009.

    Comments: 11 pages, 2 figures; submitted for publication

    Journal ref: IEEE Journal of Selected Topics in Signal Processing, vol. 4, pp. 1016-1026, 2010