Showing 1–45 of 45 results for author: Foster, D

Searching in archive math.
  1. arXiv:2503.07453  [pdf, other]

    cs.LG cs.AI cs.CL math.ST

    Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration

    Authors: Dylan J. Foster, Zakaria Mhammedi, Dhruv Rohatgi

    Abstract: Language model alignment (or, reinforcement learning) techniques that leverage active exploration -- deliberately encouraging the model to produce diverse, informative responses -- offer the promise of super-human capabilities. However, current understanding of algorithm design primitives for computationally efficient exploration with language models is limited. To better understand how to leverag…

    Submitted 13 March, 2025; v1 submitted 10 March, 2025; originally announced March 2025.

    Comments: V2: Improved number of prompts used by Algorithm 1

  2. arXiv:2411.03955  [pdf, ps, other]

    math.PR cs.GT stat.OT

    Large Deviations Inequalities for Unequal Probability Sampling Without Replacement

    Authors: Dean P. Foster, Sergiu Hart

    Abstract: We provide bounds on the tail probabilities for simple procedures that generate random samples _without replacement_, when the probabilities of being selected need not be equal.

    Submitted 6 November, 2024; originally announced November 2024.

  3. arXiv:2410.21676  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    How Does Critical Batch Size Scale in Pre-training?

    Authors: Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Foster, Sham Kakade

    Abstract: Training large-scale models under given resources requires careful design of parallelism strategies. In particular, the efficiency notion of critical batch size (CBS), concerning the compromise between time and compute, marks the threshold beyond which greater data parallelism leads to diminishing returns. To operationalize it, we propose a measure of CBS and pre-train a series of auto-regressive…

    Submitted 21 April, 2025; v1 submitted 28 October, 2024; originally announced October 2024.

    Comments: ICLR 2025, Blog post: https://kempnerinstitute.harvard.edu/research/deeper-learning/how-does-critical-batch-size-scale-in-pre-training-decoupling-data-and-model-size

  4. arXiv:2410.17904  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity

    Authors: Philip Amortila, Dylan J. Foster, Nan Jiang, Akshay Krishnamurthy, Zakaria Mhammedi

    Abstract: Real-world applications of reinforcement learning often involve environments where agents operate on complex, high-dimensional observations, but the underlying (''latent'') dynamics are comparatively simple. However, outside of restrictive settings such as small latent spaces, the fundamental statistical requirements and algorithmic principles for reinforcement learning under latent dynamics are p…

    Submitted 23 October, 2024; originally announced October 2024.

  5. arXiv:2410.05117  [pdf, ps, other]

    cs.LG cs.IT math.ST stat.ML

    Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability

    Authors: Fan Chen, Dylan J. Foster, Yanjun Han, Jian Qian, Alexander Rakhlin, Yunbei Xu

    Abstract: We develop a unifying framework for information-theoretic lower bounds in statistical estimation and interactive decision making. Classical lower bound techniques -- such as Fano's method, Le Cam's method, and Assouad's lemma -- are central to the study of minimax risk in statistical estimation, yet are insufficient to provide tight lower bounds for \emph{interactive decision making} algorithms tha…

    Submitted 6 December, 2024; v1 submitted 7 October, 2024; originally announced October 2024.

  6. arXiv:2407.15007  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning

    Authors: Dylan J. Foster, Adam Block, Dipendra Misra

    Abstract: Imitation learning (IL) aims to mimic the behavior of an expert in a sequential decision making task by learning from demonstrations, and has been widely applied to robotics, autonomous driving, and autoregressive text generation. The simplest approach to IL, behavior cloning (BC), is thought to incur sample complexity with unfavorable quadratic dependence on the problem horizon, motivating a vari…

    Submitted 30 November, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

    Comments: NeurIPS 2024

  7. arXiv:2404.10122  [pdf, other]

    stat.ML cs.LG math.ST

    Online Estimation via Offline Estimation: An Information-Theoretic Framework

    Authors: Dylan J. Foster, Yanjun Han, Jian Qian, Alexander Rakhlin

    Abstract: $…

    Submitted 15 April, 2024; originally announced April 2024.

  8. arXiv:2403.06571  [pdf, other]

    cs.LG math.OC stat.ML

    Scalable Online Exploration via Coverability

    Authors: Philip Amortila, Dylan J. Foster, Akshay Krishnamurthy

    Abstract: Exploration is a major challenge in reinforcement learning, especially for high-dimensional domains that require function approximation. We propose exploration objectives -- policy optimization objectives that enable downstream maximization of any reward function -- as a conceptual framework to systematize the study of exploration. Within this framework, we introduce a new objective, $L_1$-Coverag…

    Submitted 4 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: ICML 2024

  9. arXiv:2312.16730  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    Foundations of Reinforcement Learning and Interactive Decision Making

    Authors: Dylan J. Foster, Alexander Rakhlin

    Abstract: These lecture notes give a statistical perspective on the foundations of reinforcement learning and interactive decision making. We present a unifying framework for addressing the exploration-exploitation dilemma using frequentist and Bayesian approaches, with connections and parallels between supervised learning/estimation and decision making as an overarching theme. Special attention is paid to…

    Submitted 27 December, 2023; originally announced December 2023.

  10. arXiv:2310.11428  [pdf, other]

    cs.LG math.OC stat.ML

    Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression

    Authors: Adam Block, Dylan J. Foster, Akshay Krishnamurthy, Max Simchowitz, Cyril Zhang

    Abstract: This work studies training instabilities of behavior cloning with deep neural networks. We observe that minibatch SGD updates to the policy network during training result in sharp oscillations in long-horizon rewards, despite negligibly affecting the behavior cloning loss. We empirically disentangle the statistical and computational causes of these oscillations, and find them to stem from the chao…

    Submitted 17 October, 2023; originally announced October 2023.

  11. arXiv:2307.03997  [pdf, other]

    cs.LG math.OC

    Efficient Model-Free Exploration in Low-Rank MDPs

    Authors: Zakaria Mhammedi, Adam Block, Dylan J. Foster, Alexander Rakhlin

    Abstract: A major challenge in reinforcement learning is to develop practical, sample-efficient algorithms for exploration in high-dimensional domains where generalization and function approximation are required. Low-Rank Markov Decision Processes -- where transition probabilities admit a low-rank factorization based on an unknown feature embedding -- offer a simple, yet expressive framework for RL with func…

    Submitted 29 February, 2024; v1 submitted 8 July, 2023; originally announced July 2023.

  12. arXiv:2301.08215  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient

    Authors: Dylan J. Foster, Noah Golowich, Yanjun Han

    Abstract: A foundational problem in reinforcement learning and interactive decision making is to understand what modeling assumptions lead to sample-efficient learning guarantees, and what algorithm design principles achieve optimal sample complexity. Recently, Foster et al. (2021) introduced the Decision-Estimation Coefficient (DEC), a measure of statistical complexity which leads to upper and lower bounds…

    Submitted 19 January, 2023; originally announced January 2023.

  13. arXiv:2211.14250  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    Model-Free Reinforcement Learning with the Decision-Estimation Coefficient

    Authors: Dylan J. Foster, Noah Golowich, Jian Qian, Alexander Rakhlin, Ayush Sekhari

    Abstract: We consider the problem of interactive decision making, encompassing structured bandits and reinforcement learning with general function approximation. Recently, Foster et al. (2021) introduced the Decision-Estimation Coefficient, a measure of statistical complexity that lower bounds the optimal regret for interactive decision making, as well as a meta-algorithm, Estimation-to-Decisions, which ach…

    Submitted 12 August, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: V2 changes: Improved writing and added more examples

  14. arXiv:2210.07169  [pdf, ps, other]

    econ.TH cs.GT cs.LG math.ST stat.ML

    Forecast Hedging and Calibration

    Authors: Dean P. Foster, Sergiu Hart

    Abstract: Calibration means that forecasts and average realized frequencies are close. We develop the concept of forecast hedging, which consists of choosing the forecasts so as to guarantee that the expected track record can only improve. This yields all the calibration results by the same simple basic argument while differentiating between them by the forecast-hedging tools used: deterministic and fixed p…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: http://www.ma.huji.ac.il/hart/publ.html#calib-int

    Report number: HUJI DP-731

    Journal ref: Journal of Political Economy 129, 12 (December 2021), 3447-3490

  15. arXiv:2210.07152  [pdf, ps, other]

    econ.TH cs.GT cs.LG math.ST stat.ML

    Smooth Calibration, Leaky Forecasts, Finite Recall, and Nash Dynamics

    Authors: Dean P. Foster, Sergiu Hart

    Abstract: We propose to smooth out the calibration score, which measures how good a forecaster is, by combining nearby forecasts. While regular calibration can be guaranteed only by randomized forecasting procedures, we show that smooth calibration can be guaranteed by deterministic procedures. As a consequence, it does not matter if the forecasts are leaked, i.e., made known in advance: smooth calibration…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: http://www.ma.huji.ac.il/hart/publ.html#calib-eq

    Report number: HUJI DP-692

    Journal ref: Games and Economic Behavior 109 (May 2018), 271-293

  16. arXiv:2210.04157  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    The Role of Coverage in Online Reinforcement Learning

    Authors: Tengyang Xie, Dylan J. Foster, Yu Bai, Nan Jiang, Sham M. Kakade

    Abstract: Coverage conditions -- which assert that the data logging distribution adequately covers the state space -- play a fundamental role in determining the sample complexity of offline reinforcement learning. While such conditions might seem irrelevant to online reinforcement learning at first glance, we establish a new connection by showing -- somewhat surprisingly -- that the mere existence of a data…

    Submitted 8 October, 2022; originally announced October 2022.

  17. arXiv:2210.03137  [pdf, other]

    cs.LG math.OC

    Deep Inventory Management

    Authors: Dhruv Madeka, Kari Torkkola, Carson Eisenach, Anna Luo, Dean P. Foster, Sham M. Kakade

    Abstract: This work provides a Deep Reinforcement Learning approach to solving a periodic review inventory control system with stochastic vendor lead times, lost sales, correlated demand, and price matching. While this dynamic program has historically been considered intractable, our results show that several policy learning approaches are competitive with or outperform classical methods. In order to train…

    Submitted 28 November, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

  18. arXiv:2206.13063  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    On the Complexity of Adversarial Decision Making

    Authors: Dylan J. Foster, Alexander Rakhlin, Ayush Sekhari, Karthik Sridharan

    Abstract: A central problem in online learning and decision making -- from bandits to reinforcement learning -- is to understand what modeling assumptions lead to sample-efficient learning guarantees. We consider a general adversarial decision making framework that encompasses (structured) bandit problems with adversarial rewards and reinforcement learning problems with adversarial dynamics. Our main result…

    Submitted 27 June, 2022; originally announced June 2022.

  19. arXiv:2112.13487  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    The Statistical Complexity of Interactive Decision Making

    Authors: Dylan J. Foster, Sham M. Kakade, Jian Qian, Alexander Rakhlin

    Abstract: A fundamental challenge in interactive learning and decision making, ranging from bandit problems to reinforcement learning, is to provide sample-efficient, adaptive learning algorithms that achieve near-optimal regret. This question is analogous to the classical problem of optimal (supervised) statistical learning, where there are well-known complexity measures (e.g., VC dimension and Rademacher…

    Submitted 11 July, 2023; v1 submitted 26 December, 2021; originally announced December 2021.

    Comments: Minor improvements to writing and organization

  20. Minimax Rates for Conditional Density Estimation via Empirical Entropy

    Authors: Blair Bilodeau, Dylan J. Foster, Daniel M. Roy

    Abstract: We consider the task of estimating a conditional density using i.i.d. samples from a joint distribution, which is a fundamental problem with applications in both classification and uncertainty quantification for regression. For joint density estimation, minimax rates have been characterized for general density classes in terms of uniform (metric) entropy, a well-studied notion of statistical capac…

    Submitted 14 June, 2023; v1 submitted 21 September, 2021; originally announced September 2021.

    Comments: 59 pages, 1 figure. Minor edits to match published version

    Journal ref: Annals of Statistics, 51(2):762-790, 2023

  21. arXiv:2108.04552  [pdf, other]

    cs.LG math.OC stat.ML

    The Benefits of Implicit Regularization from SGD in Least Squares Problems

    Authors: Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Dean P. Foster, Sham M. Kakade

    Abstract: Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice, which has been hypothesized to play an important role in the generalization of modern machine learning approaches. In this work, we seek to understand these issues in the simpler setting of linear regression (including both underparameterized and overparameterized regimes), where our goal is to make s…

    Submitted 10 July, 2022; v1 submitted 10 August, 2021; originally announced August 2021.

    Comments: 33 pages, 1 figure. In NeurIPS 2021

  22. arXiv:2107.02237  [pdf, other]

    cs.LG math.ST stat.ML

    Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

    Authors: Dylan J. Foster, Akshay Krishnamurthy

    Abstract: A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise, often quantified by the performance of the best hypothesis; such results are known as first-order or small-loss guarantees. While first-order guarantees are relatively well understood in statistical and online learning, adapting to low noise in contextua…

    Submitted 5 July, 2021; originally announced July 2021.

  23. arXiv:2010.11895  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    What are the Statistical Limits of Offline RL with Linear Function Approximation?

    Authors: Ruosong Wang, Dean P. Foster, Sham M. Kakade

    Abstract: Offline reinforcement learning seeks to utilize offline (observational) data to guide the learning of (causal) sequential decision making strategies. The hope is that offline reinforcement learning coupled with function approximation methods (to deal with the curse of dimensionality) can provide a means to help alleviate the excessive sample complexity burden in modern sequential decision making p…

    Submitted 22 October, 2020; originally announced October 2020.

  24. arXiv:2010.03799  [pdf, ps, other]

    cs.LG math.OC math.ST stat.ML

    Learning the Linear Quadratic Regulator from Nonlinear Observations

    Authors: Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

    Abstract: We introduce a new problem setting for continuous control called the LQR with Rich Observations, or RichLQR. In our setting, the environment is summarized by a low-dimensional continuous latent state with linear dynamics and quadratic costs, but the agent operates on high-dimensional, nonlinear observations such as images from a camera. To enable sample-efficient learning, we assume that the learn…

    Submitted 8 October, 2020; originally announced October 2020.

    Comments: To appear at NeurIPS 2020

  25. arXiv:2010.03104  [pdf, other]

    cs.LG math.ST stat.ML

    Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

    Authors: Dylan J. Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

    Abstract: In the classical multi-armed bandit problem, instance-dependent algorithms attain improved performance on "easy" problems with a gap between the best and second-best arm. Are similar guarantees possible for contextual bandits? While positive results are known for certain special cases, there is no general theory characterizing when and how instance-dependent regret bounds for contextual bandits ca…

    Submitted 6 October, 2020; originally announced October 2020.

  26. arXiv:2006.13476  [pdf, other]

    cs.LG math.OC stat.ML

    Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

    Authors: Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Ayush Sekhari, Karthik Sridharan

    Abstract: We design an algorithm which finds an $ε$-approximate stationary point (with $\|\nabla F(x)\|\le ε$) using $O(ε^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stronger assumption of access to multiple queries with the same random seed. We prove a lower bound which establishes that this rate is optimal and---surprisingly---tha…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: Accepted to CONFERENCE ON LEARNING THEORY (COLT) 2020

  27. arXiv:2004.14681  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    Learning nonlinear dynamical systems from a single trajectory

    Authors: Dylan J. Foster, Alexander Rakhlin, Tuhin Sarkar

    Abstract: We introduce algorithms for learning nonlinear dynamical systems of the form $x_{t+1}=σ(Θ^{\star}x_t)+\varepsilon_t$, where $Θ^{\star}$ is a weight matrix, $σ$ is a nonlinear link function, and $\varepsilon_t$ is a mean-zero noise process. We give an algorithm that recovers the weight matrix $Θ^{\star}$ from a single trajectory with optimal sample complexity and linear running time. The algorithm…

    Submitted 30 April, 2020; originally announced April 2020.

    Comments: To appear at L4DC 2020

  28. arXiv:2003.00189  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Logarithmic Regret for Adversarial Online Control

    Authors: Dylan J. Foster, Max Simchowitz

    Abstract: We introduce a new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances. Existing regret bounds for this setting scale as $\sqrt{T}$ unless strong stochastic assumptions are imposed on the disturbance process. We give the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences, provided the state and control costs are g…

    Submitted 23 June, 2020; v1 submitted 29 February, 2020; originally announced March 2020.

    Comments: ICML 2020

  29. arXiv:2002.04926  [pdf, other]

    cs.LG math.ST stat.ML

    Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles

    Authors: Dylan J. Foster, Alexander Rakhlin

    Abstract: A fundamental challenge in contextual bandits is to develop flexible, general-purpose algorithms with computational requirements no worse than classical supervised learning tasks such as classification and regression. Algorithms based on regression have shown promising empirical success, but theoretical guarantees have remained elusive except in special cases. We provide the first universal and op…

    Submitted 23 June, 2020; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: ICML 2020

  30. arXiv:2001.09576  [pdf, other]

    cs.LG math.OC stat.ML

    Naive Exploration is Optimal for Online LQR

    Authors: Max Simchowitz, Dylan J. Foster

    Abstract: We consider the problem of online adaptive control of the linear quadratic regulator, where the true system parameters are unknown. We prove new upper and lower bounds demonstrating that the optimal regret scales as $\widetilde{Θ}({\sqrt{d_{\mathbf{u}}^2 d_{\mathbf{x}} T}})$, where $T$ is the number of time steps, $d_{\mathbf{u}}$ is the dimension of the input space, and $d_{\mathbf{x}}$ is the dime…

    Submitted 3 October, 2023; v1 submitted 26 January, 2020; originally announced January 2020.

  31. arXiv:1912.02365  [pdf, other]

    math.OC cs.IT cs.LG stat.ML

    Lower Bounds for Non-Convex Stochastic Optimization

    Authors: Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Nathan Srebro, Blake Woodworth

    Abstract: We lower bound the complexity of finding $ε$-stationary points (with gradient norm at most $ε$) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that (in the worst case) any algorithm requires at least $ε^{-4}$ queries to find an…

    Submitted 27 February, 2022; v1 submitted 4 December, 2019; originally announced December 2019.

    Comments: Correction to hard instance dimensions in Theorem 3

  32. arXiv:1911.06468  [pdf, ps, other]

    cs.LG math.ST stat.ML

    $\ell_{\infty}$ Vector Contraction for Rademacher Complexity

    Authors: Dylan J. Foster, Alexander Rakhlin

    Abstract: We show that the Rademacher complexity of any $\mathbb{R}^{K}$-valued function class composed with an $\ell_{\infty}$-Lipschitz function is bounded by the maximum Rademacher complexity of the restriction of the function class along each coordinate, times a factor of $\tilde{O}(\sqrt{K})$.

    Submitted 14 November, 2019; originally announced November 2019.

    Comments: Technical note

  33. arXiv:1906.00531  [pdf, other]

    cs.LG math.ST stat.ML

    Model selection for contextual bandits

    Authors: Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo

    Abstract: We introduce the problem of model selection for contextual bandits, where a learner must adapt to the complexity of the optimal policy while balancing exploration and exploitation. Our main result is a new model selection guarantee for linear contextual bandits. We work in the stochastic realizable setting with a sequence of nested linear policy classes of dimension $d_1 < d_2 < \ldots$, where the…

    Submitted 14 November, 2019; v1 submitted 2 June, 2019; originally announced June 2019.

  34. arXiv:1905.13283  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Sum-of-squares meets square loss: Fast rates for agnostic tensor completion

    Authors: Dylan J. Foster, Andrej Risteski

    Abstract: We study tensor completion in the agnostic setting. In the classical tensor completion problem, we receive $n$ entries of an unknown rank-$r$ tensor and wish to exactly complete the remaining entries. In agnostic tensor completion, we make no assumption on the rank of the unknown tensor, but attempt to predict unknown entries as well as the best rank-$r$ tensor. For agnostic learning of third-or…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: To appear at COLT 2019

  35. arXiv:1902.04686  [pdf, ps, other]

    cs.LG math.OC stat.ML

    The Complexity of Making the Gradient Small in Stochastic Convex Optimization

    Authors: Dylan J. Foster, Ayush Sekhari, Ohad Shamir, Nathan Srebro, Karthik Sridharan, Blake Woodworth

    Abstract: We give nearly matching upper and lower bounds on the oracle complexity of finding $ε$-stationary points ($\| \nabla F(x) \| \leq ε$) in stochastic convex optimization. We jointly analyze the oracle complexity in both the local stochastic oracle model and the global oracle (or, statistical learning) model. This allows us to decompose the complexity of finding near-stationary points into optimizatio…

    Submitted 14 February, 2019; v1 submitted 12 February, 2019; originally announced February 2019.

  36. arXiv:1901.09036  [pdf, other]

    math.ST cs.LG econ.EM stat.ML

    Orthogonal Statistical Learning

    Authors: Dylan J. Foster, Vasilis Syrgkanis

    Abstract: We provide non-asymptotic excess risk guarantees for statistical learning in a setting where the population risk with respect to which we evaluate the target parameter depends on an unknown nuisance parameter that must be estimated from data. We analyze a two-stage sample splitting meta-algorithm that takes as input arbitrary estimation algorithms for the target parameter and nuisance parameter. W…

    Submitted 5 June, 2023; v1 submitted 24 January, 2019; originally announced January 2019.

    Comments: Reorganized, added experiments and additional examples

  37. arXiv:1810.11059  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Uniform Convergence of Gradients for Non-Convex Learning and Optimization

    Authors: Dylan J. Foster, Ayush Sekhari, Karthik Sridharan

    Abstract: We investigate 1) the rate at which refined properties of the empirical risk---in particular, gradients---converge to their population counterparts in standard non-convex learning tasks, and 2) the consequences of this convergence for optimization. Our analysis follows the tradition of norm-based capacity control. We propose vector-valued Rademacher complexities as a simple, composable, and user-f…

    Submitted 11 November, 2018; v1 submitted 25 October, 2018; originally announced October 2018.

    Comments: To appear in Neural Information Processing Systems (NIPS) 2018

  38. arXiv:1803.07617  [pdf, other]

    cs.LG math.OC stat.ML

    Online Learning: Sufficient Statistics and the Burkholder Method

    Authors: Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan

    Abstract: We uncover a fairly general principle in online learning: If regret can be (approximately) expressed as a function of certain "sufficient statistics" for the data sequence, then there exists a special Burkholder function that 1) can be used algorithmically to achieve the regret bound and 2) only depends on these sufficient statistics, not the entire data sequence, so that the online strategy is on…

    Submitted 20 March, 2018; originally announced March 2018.

  39. arXiv:1704.04010  [pdf, other]

    cs.LG math.OC stat.ML

    ZigZag: A new approach to adaptive online learning

    Authors: Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan

    Abstract: We develop a novel family of algorithms for the online learning setting with regret against any data sequence bounded by the empirical Rademacher complexity of that sequence. To develop a general theory of when this type of adaptive regret bound is achievable we establish a connection to the theory of decoupling inequalities for martingales in Banach spaces. When the hypothesis class is a set of l…

    Submitted 13 April, 2017; originally announced April 2017.

    Comments: 49 pages

  40. Knotted fields and explicit fibrations for lemniscate knots

    Authors: Benjamin Bode, Mark R Dennis, David Foster, Robert P King

    Abstract: We give an explicit construction of complex maps whose nodal lines have the form of lemniscate knots. We review the properties of lemniscate knots, defined as closures of braids where all strands follow the same transverse (1, $\ell$) Lissajous figure, and are therefore a subfamily of spiral knots generalising the torus knots. We then prove that such maps exist and are in fact fibrations with appro…

    Submitted 8 November, 2016; originally announced November 2016.

    Comments: 20 pages, 8 figures

  41. arXiv:1605.08839  [pdf, other]

    math.ST

    Kernel ridge vs. principal component regression: minimax bounds and adaptability of regularization operators

    Authors: Lee H. Dicker, Dean P. Foster, Daniel Hsu

    Abstract: Regularization is an essential element of virtually all kernel methods for nonparametric regression problems. A critical factor in the effectiveness of a given kernel method is the type of regularization that is employed. This article compares and contrasts members from a general class of regularization techniques, which notably includes ridge regression and principal component regression. We deri…

    Submitted 27 May, 2016; originally announced May 2016.

    Comments: 19 pages, 4 figures

  42. arXiv:1510.06319  [pdf, other]

    math.ST stat.ME

    A Risk Ratio Comparison of $l_0$ and $l_1$ Penalized Regression

    Authors: Kory D. Johnson, Dongyu Lin, Lyle H. Ungar, Dean P. Foster, Robert A. Stine

    Abstract: There has been an explosion of interest in using $l_1$-regularization in place of $l_0$-regularization for feature selection. We present theoretical results showing that while $l_1$-penalized linear regression never outperforms $l_0$-regularization by more than a constant factor, in some cases using an $l_1$ penalty is infinitely worse than using an $l_0$ penalty. We also show that the "optimal"…

    Submitted 21 October, 2015; originally announced October 2015.

  43. arXiv:1510.06301  [pdf, other]

    math.ST

    Submodularity in Statistics: Comparing the Success of Model Selection Methods

    Authors: Kory D. Johnson, Robert A. Stine, Dean P. Foster

    Abstract: We demonstrate the usefulness of submodularity in statistics as a characterization of the difficulty of the \emph{search} problem of feature selection. The search problem is the ability of a procedure to identify an informative set of features as opposed to the performance of the optimal set of features. Submodularity arises naturally in this setting due to its connection to combinatorial optimiza…

    Submitted 13 May, 2016; v1 submitted 21 October, 2015; originally announced October 2015.

  44. arXiv:1412.7946  [pdf, other]

    physics.comp-ph math.NA

    Domain Decomposition for Heterojunction Problems in Semiconductors

    Authors: Timothy Costa, David Foster, Malgorzata Peszynska

    Abstract: We present a domain decomposition approach for the simulation of charge transport in heterojunction semiconductors. The problem is characterized by a large variation of primary variables across an interface region of a size much smaller than the device scale, and requires a multiscale approach in which that region is modeled as an internal boundary. The model combines drift diffusion equations on…

    Submitted 26 December, 2014; originally announced December 2014.

  45. arXiv:1107.1744  [pdf, other]

    math.OC cs.LG eess.SY

    Stochastic convex optimization with bandit feedback

    Authors: Alekh Agarwal, Dean P. Foster, Daniel Hsu, Sham M. Kakade, Alexander Rakhlin

    Abstract: This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $\xset$ under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in \xset$. The quantity of interest is the regret of the algorithm, which is the sum of the function values at algorithm'…

    Submitted 8 October, 2011; v1 submitted 8 July, 2011; originally announced July 2011.