
Showing 1–6 of 6 results for author: Laber, E

Searching in archive math.
  1. arXiv:2505.01361  [pdf, ps, other]

    cs.LG math.PR stat.ML

    Stabilizing Temporal Difference Learning via Implicit Stochastic Recursion

    Authors: Hwanwoo Kim, Panos Toulis, Eric Laber

    Abstract: Temporal difference (TD) learning is a foundational algorithm in reinforcement learning (RL). For nearly forty years, TD learning has served as a workhorse for applied RL as well as a building block for more complex and specialized algorithms. However, despite its widespread use, TD procedures are generally sensitive to step size specification. A poor choice of step size can dramatically increase…

    Submitted 22 June, 2025; v1 submitted 2 May, 2025; originally announced May 2025.

    Comments: A substantial amount of content has been added regarding the theory and numerical experiments of the implicit version of temporal difference learning with gradient correction (TDC), which is newly proposed in this manuscript

  2. arXiv:2501.01505  [pdf, other]

    stat.ME math.ST

    Reinforcement Learning for Respondent-Driven Sampling

    Authors: Justin Weltz, Angela Yoon, Yichi Zhang, Alexander Volfovsky, Eric Laber

    Abstract: Respondent-driven sampling (RDS) is widely used to study hidden or hard-to-reach populations by incentivizing study participants to recruit their social connections. The success and efficiency of RDS can depend critically on the nature of the incentives, including their number, value, call to action, etc. Standard RDS uses an incentive structure that is set a priori and held fixed throughout the s…

    Submitted 2 January, 2025; originally announced January 2025.

  3. arXiv:2310.04390  [pdf, other]

    math.ST

    Experimental Designs for Heteroskedastic Variance

    Authors: Justin Weltz, Tanner Fiez, Alexander Volfovsky, Eric Laber, Blake Mason, Houssam Nassif, Lalit Jain

    Abstract: Most linear experimental design problems assume homogeneous variance although heteroskedastic noise is present in many realistic settings. Let a learner have access to a finite set of measurement vectors $\mathcal{X}\subset \mathbb{R}^d$ that can be probed to receive noisy linear responses of the form $y=x^{\top}\theta^{\ast}+\eta$. Here $\theta^{\ast}\in \mathbb{R}^d$ is an unknown parameter vector, and $\eta$ i…

    Submitted 6 October, 2023; originally announced October 2023.

    Journal ref: Conference on Neural Information Processing Systems (NeurIPS'23), New Orleans, pp. 65967-66005, 2023

  4. arXiv:2206.06140  [pdf, other]

    math.ST

    Inference for change-plane regression

    Authors: Chaeryon Kang, Hunyong Cho, Rui Song, Moulinath Banerjee, Eric B. Laber, Michael R. Kosorok

    Abstract: A key challenge in analyzing the behavior of change-plane estimators is that the objective function has multiple minimizers. Two estimators are proposed to deal with this non-uniqueness. For each estimator, an n-rate of convergence is established, and the limiting distribution is derived. Based on these results, we provide a parametric bootstrap procedure for inference. The validity of our theoret…

    Submitted 13 January, 2024; v1 submitted 13 June, 2022; originally announced June 2022.

  5. arXiv:1907.09083  [pdf, other]

    math.ST cs.LG math.OC

    Convergence Rates of Posterior Distributions in Markov Decision Process

    Authors: Zhen Li, Eric Laber

    Abstract: In this paper, we show the convergence rates of posterior distributions of the model dynamics in an MDP for both episodic and continuous tasks. The theoretical results hold for general state and action space and the parameter space of the dynamics can be infinite dimensional. Moreover, we show the convergence rates of posterior distributions of the mean accumulative reward under a fixed or the opti…

    Submitted 21 July, 2019; originally announced July 2019.

  6. arXiv:1704.07531  [pdf, other]

    stat.ME math.ST stat.ML

    Sufficient Markov Decision Processes with Alternating Deep Neural Networks

    Authors: Longshaokan Wang, Eric B. Laber, Katie Witkiewitz

    Abstract: Advances in mobile computing technologies have made it possible to monitor and apply data-driven interventions across complex systems in real time. Markov decision processes (MDPs) are the primary model for sequential decision problems with a large or indefinite time horizon. Choosing a representation of the underlying decision process that is both Markov and low-dimensional is non-trivial. We pro…

    Submitted 17 March, 2018; v1 submitted 25 April, 2017; originally announced April 2017.

    Comments: 31 pages, 3 figures, extended abstract in the proceedings of RLDM2017. (v2 revisions: Fixed a minor bug in the code w.r.t. setting seed, as a result numbers in the simulation experiments had some slight changes, but conclusions stayed the same. Corrected typos. Improved notations.)