Showing 1–46 of 46 results for author: Jamieson, K

  1. arXiv:2506.04775  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Improved Regret Bounds for Linear Bandits with Heavy-Tailed Rewards

    Authors: Artin Tajdini, Jonathan Scarlett, Kevin Jamieson

    Abstract: We study stochastic linear bandits with heavy-tailed rewards, where the rewards have a finite $(1+ε)$-absolute central moment bounded by $\upsilon$ for some $ε\in (0,1]$. We improve both upper and lower bounds on the minimax regret compared to prior work. When $\upsilon = \mathcal{O}(1)$, the best prior known regret upper bound is $\tilde{\mathcal{O}}(d T^{\frac{1}{1+ε}})$. While a lower with the…

    Submitted 5 June, 2025; originally announced June 2025.

  2. arXiv:2412.02529  [pdf, other]

    q-bio.NC cs.LG stat.ML

    Active learning of neural population dynamics using two-photon holographic optogenetics

    Authors: Andrew Wagenmaker, Lu Mi, Marton Rozsa, Matthew S. Bull, Karel Svoboda, Kayvon Daie, Matthew D. Golub, Kevin Jamieson

    Abstract: Recent advances in techniques for monitoring and perturbing neural populations have greatly enhanced our ability to study circuits in the brain. In particular, two-photon holographic optogenetics now enables precise photostimulation of experimenter-specified groups of individual neurons, while simultaneous two-photon calcium imaging enables the measurement of ongoing and induced activity across th…

    Submitted 8 May, 2025; v1 submitted 3 December, 2024; originally announced December 2024.

    Comments: NeurIPS 2024

  3. arXiv:2410.20254  [pdf, other]

    cs.LG cs.RO stat.ML

    Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL

    Authors: Andrew Wagenmaker, Kevin Huang, Liyiming Ke, Byron Boots, Kevin Jamieson, Abhishek Gupta

    Abstract: In order to mitigate the sample complexity of real-world reinforcement learning, common practice is to first train a policy in a simulator where samples are cheap, and then deploy this policy in the real world, with the hope that it generalizes effectively. Such \emph{direct sim2real} transfer is not guaranteed to succeed, however, and in cases where it fails, it is unclear how to best utilize the…

    Submitted 26 October, 2024; originally announced October 2024.

    Comments: NeurIPS 2024

  4. arXiv:2312.08559  [pdf, other]

    cs.LG cs.CY stat.ML

    Fair Active Learning in Low-Data Regimes

    Authors: Romain Camilleri, Andrew Wagenmaker, Jamie Morgenstern, Lalit Jain, Kevin Jamieson

    Abstract: In critical machine learning applications, ensuring fairness is essential to avoid perpetuating social inequities. In this work, we address the challenges of reducing bias and improving accuracy in data-scarce environments, where the cost of collecting labeled data prohibits the use of large, labeled datasets. In such settings, active learning promises to maximize marginal accuracy gains of small…

    Submitted 13 December, 2023; originally announced December 2023.

  5. arXiv:2310.18465  [pdf, other]

    cs.LG stat.ML

    Nearly Minimax Optimal Submodular Maximization with Bandit Feedback

    Authors: Artin Tajdini, Lalit Jain, Kevin Jamieson

    Abstract: We consider maximizing an unknown monotonic, submodular set function $f: 2^{[n]} \rightarrow [0,1]$ with cardinality constraint under stochastic bandit feedback. At each time $t=1,\dots,T$ the learner chooses a set $S_t \subset [n]$ with $|S_t| \leq k$ and receives reward $f(S_t) + η_t$ where $η_t$ is mean-zero sub-Gaussian noise. The objective is to minimize the learner's regret with respect to a…

    Submitted 12 December, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

  6. arXiv:2310.06069  [pdf, other]

    stat.ML cs.LG

    Optimal Exploration is no harder than Thompson Sampling

    Authors: Zhaoqi Li, Kevin Jamieson, Lalit Jain

    Abstract: Given a set of arms $\mathcal{Z}\subset \mathbb{R}^d$ and an unknown parameter vector $θ_\ast\in\mathbb{R}^d$, the pure exploration linear bandit problem aims to return $\arg\max_{z\in \mathcal{Z}} z^{\top}θ_{\ast}$, with high probability through noisy measurements of $x^{\top}θ_{\ast}$ with $x\in \mathcal{X}\subset \mathbb{R}^d$. Existing (asymptotically) optimal methods require either a) potenti…

    Submitted 24 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

  7. arXiv:2307.15154  [pdf, other]

    cs.LG stat.ML

    A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

    Authors: Zhihan Xiong, Romain Camilleri, Maryam Fazel, Lalit Jain, Kevin Jamieson

    Abstract: We investigate the fixed-budget best-arm identification (BAI) problem for linear bandits in a potentially non-stationary environment. Given a finite arm set $\mathcal{X}\subset\mathbb{R}^d$, a fixed budget $T$, and an unpredictable sequence of parameters $\left\lbraceθ_t\right\rbrace_{t=1}^{T}$, an algorithm will aim to correctly identify the best arm…

    Submitted 15 February, 2024; v1 submitted 27 July, 2023; originally announced July 2023.

    Comments: 25 pages, 6 figures

  8. arXiv:2306.09210  [pdf, other]

    cs.LG cs.RO eess.SY math.OC stat.ML

    Optimal Exploration for Model-Based RL in Nonlinear Systems

    Authors: Andrew Wagenmaker, Guanya Shi, Kevin Jamieson

    Abstract: Learning to control unknown nonlinear dynamical systems is a fundamental problem in reinforcement learning and control theory. A commonly applied approach is to first explore the environment (exploration), learn an accurate model of it (system identification), and then compute an optimal controller with the minimum cost on this estimated system (policy optimization). While existing work has shown…

    Submitted 15 June, 2023; originally announced June 2023.

  9. arXiv:2207.02575  [pdf, other]

    cs.LG stat.ML

    Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design

    Authors: Andrew Wagenmaker, Kevin Jamieson

    Abstract: While much progress has been made in understanding the minimax sample complexity of reinforcement learning (RL) -- the complexity of learning on the "worst-case" instance -- such measures of complexity often do not capture the true difficulty of learning. In practice, on an "easy" instance, we might hope to achieve a complexity far better than that achievable on the worst-case instance. In this wo…

    Submitted 20 July, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

  10. arXiv:2207.02357  [pdf, ps, other]

    stat.ML cs.LG

    Instance-optimal PAC Algorithms for Contextual Bandits

    Authors: Zhaoqi Li, Lillian Ratliff, Houssam Nassif, Kevin Jamieson, Lalit Jain

    Abstract: In the stochastic contextual bandit setting, regret-minimizing algorithms have been extensively researched, but their instance-minimizing best-arm identification counterparts remain seldom studied. In this work, we focus on the stochastic bandit problem in the $(ε,δ)$-$\textit{PAC}$ setting: given a policy class $Π$ the goal of the learner is to return a policy $π\in Π$ whose expected reward is wi…

    Submitted 5 July, 2022; originally announced July 2022.

    Journal ref: Conference on Neural Information Processing Systems (NeurIPS'22), New Orleans, pp. 37590-37603, 2022

  11. arXiv:2206.11183  [pdf, other]

    cs.LG stat.ML

    Active Learning with Safety Constraints

    Authors: Romain Camilleri, Andrew Wagenmaker, Jamie Morgenstern, Lalit Jain, Kevin Jamieson

    Abstract: Active learning methods have shown great promise in reducing the number of samples necessary for learning. As automated learning systems are adopted into real-time, real-world decision-making pipelines, it is increasingly important that such algorithms are designed with safety in mind. In this work we investigate the complexity of learning the best safe decision in interactive environments. We red…

    Submitted 22 June, 2022; originally announced June 2022.

  12. arXiv:2201.11206  [pdf, other]

    cs.LG stat.ML

    Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes

    Authors: Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

    Abstract: Reward-free reinforcement learning (RL) considers the setting where the agent does not have access to a reward function during exploration, but must propose a near-optimal policy for an arbitrary reward function revealed only after exploring. In the tabular setting, it is well known that this is a more difficult problem than reward-aware (PAC) RL -- where the agent has access to the reward fun…

    Submitted 18 June, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

  13. arXiv:2112.03432  [pdf, other]

    cs.LG stat.ML

    First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach

    Authors: Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

    Abstract: Obtaining first-order regret bounds -- regret bounds scaling not as the worst-case but with some measure of the performance of the optimal policy on a given instance -- is a core question in sequential decision-making. While such bounds exist in many settings, they have proven elusive in reinforcement learning with large state spaces. In this work we address this gap, and show that it is possible…

    Submitted 20 October, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

  14. arXiv:2111.12151  [pdf, other]

    cs.LG stat.ML

    Best Arm Identification with Safety Constraints

    Authors: Zhenlin Wang, Andrew Wagenmaker, Kevin Jamieson

    Abstract: The best arm identification problem in the multi-armed bandit setting is an excellent model of many real-world decision-making problems, yet it fails to capture the fact that in the real-world, safety constraints often must be met while learning. In this work we study the question of best-arm identification in safety-critical settings, where the goal of the agent is to find the best safe option ou…

    Submitted 23 November, 2021; originally announced November 2021.

  15. arXiv:2111.04915  [pdf, other]

    cs.LG stat.ML

    Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers

    Authors: Julian Katz-Samuels, Blake Mason, Kevin Jamieson, Rob Nowak

    Abstract: We consider interactive learning in the realizable setting and develop a general framework to handle problems ranging from best arm identification to active classification. We begin our investigation with the observation that agnostic algorithms \emph{cannot} be minimax-optimal in the realizable setting. Hence, we design novel computationally efficient algorithms for the realizable setting that ma…

    Submitted 8 November, 2021; originally announced November 2021.

  16. arXiv:2111.01768  [pdf, other]

    stat.ML cs.LG

    Nearly Optimal Algorithms for Level Set Estimation

    Authors: Blake Mason, Romain Camilleri, Subhojyoti Mukherjee, Kevin Jamieson, Robert Nowak, Lalit Jain

    Abstract: The level set estimation problem seeks to find all points in a domain ${\cal X}$ where the value of an unknown function $f:{\cal X}\rightarrow \mathbb{R}$ exceeds a threshold $α$. The estimation is based on noisy function evaluations that may be acquired at sequentially and adaptively chosen locations in ${\cal X}$. The threshold value $α$ can either be \emph{explicit} and provided a priori, or \e…

    Submitted 2 November, 2021; originally announced November 2021.

    Comments: 9 pages + appendices. 6 Figures

  17. arXiv:2108.02717  [pdf, other]

    cs.LG stat.ML

    Beyond No Regret: Instance-Dependent PAC Reinforcement Learning

    Authors: Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

    Abstract: The theory of reinforcement learning has focused on two fundamental problems: achieving low regret, and identifying $ε$-optimal policies. While a simple reduction allows one to apply a low-regret algorithm to obtain an $ε$-optimal policy and achieve the worst-case optimal rate, it is unknown whether low-regret algorithms can obtain the instance-optimal rate for policy identification. We show this…

    Submitted 21 June, 2022; v1 submitted 5 August, 2021; originally announced August 2021.

  18. arXiv:2106.11220  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Corruption Robust Active Learning

    Authors: Yifang Chen, Simon S. Du, Kevin Jamieson

    Abstract: We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions. In this setting, every time before the learner observes a sample, the adversary decides whether to corrupt the label or not. First, we show that, in a benign corruption setting (which includes the misspecification setting as a special case), with a slight enlarge…

    Submitted 21 June, 2021; originally announced June 2021.

  19. arXiv:2105.06499  [pdf, other]

    cs.LG stat.ML

    Improved Algorithms for Agnostic Pool-based Active Classification

    Authors: Julian Katz-Samuels, Jifan Zhang, Lalit Jain, Kevin Jamieson

    Abstract: We consider active learning for binary classification in the agnostic pool-based setting. The vast majority of works in active learning in the agnostic setting are inspired by the CAL algorithm where each query is uniformly sampled from the disagreement region of the current version space. The sample complexity of such algorithms is described by a quantity known as the disagreement coefficient whi…

    Submitted 13 May, 2021; originally announced May 2021.

  20. arXiv:2102.05214  [pdf, other]

    cs.LG math.OC stat.ML

    Task-Optimal Exploration in Linear Dynamical Systems

    Authors: Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

    Abstract: Exploration in unknown environments is a fundamental problem in reinforcement learning and control. In this work, we study task-guided exploration and determine what precisely an agent must learn about their environment in order to complete a particular task. Formally, we study a broad class of decision-making problems in the setting of linear dynamical systems, a class that includes the linear qu…

    Submitted 9 July, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

  21. arXiv:2011.00576  [pdf, other]

    cs.LG stat.ML

    Experimental Design for Regret Minimization in Linear Bandits

    Authors: Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

    Abstract: In this paper we propose a novel experimental design-based algorithm to minimize regret in online stochastic linear and combinatorial bandits. While existing literature tends to focus on optimism-based algorithms--which have been shown to be suboptimal in many cases--our approach carefully plans which action to take by balancing the tradeoff between information gain and reward, overcoming the fail…

    Submitted 26 February, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

  22. arXiv:2008.06555  [pdf, other]

    stat.ML cs.LG

    A New Perspective on Pool-Based Active Classification and False-Discovery Control

    Authors: Lalit Jain, Kevin Jamieson

    Abstract: In many scientific settings there is a need for adaptive experimental design to guide the process of identifying regions of the search space that contain as many true positives as possible subject to a low rate of false discoveries (i.e. false alarms). Such regions of the search space could differ drastically from a predicted set that minimizes 0/1 error and accurate identification could require v…

    Submitted 14 August, 2020; originally announced August 2020.

    Journal ref: Published at NeurIPS 2019

  23. arXiv:2006.11685  [pdf, other]

    cs.LG stat.ML

    An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

    Authors: Julian Katz-Samuels, Lalit Jain, Zohar Karnin, Kevin Jamieson

    Abstract: This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings. Leveraging ideas from the theory of suprema of empirical processes, we provide an algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms. Unlike previous approaches which sample b…

    Submitted 20 June, 2020; originally announced June 2020.

  24. arXiv:2002.07297  [pdf, other]

    stat.ML cs.LG

    Estimating the number and effect sizes of non-null hypotheses

    Authors: Jennifer Brennan, Ramya Korlakai Vinayak, Kevin Jamieson

    Abstract: We study the problem of estimating the distribution of effect sizes (the mean of the test statistic under the alternate hypothesis) in a multiple testing setting. Knowing this distribution allows us to calculate the power (type II error) of any experimental design. We show that it is possible to estimate this distribution using an inexpensive pilot experiment, which takes significantly fewer sampl…

    Submitted 24 July, 2020; v1 submitted 17 February, 2020; originally announced February 2020.

    Comments: ICML 2020

  25. arXiv:2002.00495  [pdf, other]

    cs.LG eess.SY stat.ML

    Active Learning for Identification of Linear Dynamical Systems

    Authors: Andrew Wagenmaker, Kevin Jamieson

    Abstract: We propose an algorithm to actively estimate the parameters of a linear dynamical system. Given complete control over the system's input, our algorithm adaptively chooses the inputs to accelerate estimation. We show a finite time bound quantifying the estimation rate our algorithm attains and prove matching upper and lower bounds which guarantee its asymptotic optimality, up to constants. In addit…

    Submitted 22 June, 2020; v1 submitted 2 February, 2020; originally announced February 2020.

  26. arXiv:1906.08399  [pdf, other]

    stat.ML cs.LG

    Sequential Experimental Design for Transductive Linear Bandits

    Authors: Tanner Fiez, Lalit Jain, Kevin Jamieson, Lillian Ratliff

    Abstract: In this paper we introduce the transductive linear bandit problem: given a set of measurement vectors $\mathcal{X}\subset \mathbb{R}^d$, a set of items $\mathcal{Z}\subset \mathbb{R}^d$, a fixed confidence $δ$, and an unknown vector $θ^{\ast}\in \mathbb{R}^d$, the goal is to infer $\text{argmax}_{z\in \mathcal{Z}} z^\topθ^\ast$ with probability $1-δ$ by making as few sequentially chosen noisy meas…

    Submitted 19 June, 2019; originally announced June 2019.

  27. arXiv:1906.06594  [pdf, other]

    stat.ML cs.LG

    The True Sample Complexity of Identifying Good Arms

    Authors: Julian Katz-Samuels, Kevin Jamieson

    Abstract: We consider two multi-armed bandit problems with $n$ arms: (i) given an $ε> 0$, identify an arm with mean that is within $ε$ of the largest mean and (ii) given a threshold $μ_0$ and integer $k$, identify $k$ arms with means larger than $μ_0$. Existing lower bounds and algorithms for the PAC framework suggest that both of these problems require $Ω(n)$ samples. However, we argue that these definitio…

    Submitted 15 June, 2019; originally announced June 2019.

  28. arXiv:1905.03814  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs

    Authors: Max Simchowitz, Kevin Jamieson

    Abstract: This paper establishes that optimistic algorithms attain gap-dependent and non-asymptotic logarithmic regret for episodic MDPs. In contrast to prior work, our bounds do not suffer a dependence on diameter-like quantities or ergodicity, and smoothly interpolate between the gap dependent logarithmic-regret, and the $\widetilde{\mathcal{O}}(\sqrt{HSAT})$-minimax rate. The key technique in our analysi…

    Submitted 28 October, 2019; v1 submitted 9 May, 2019; originally announced May 2019.

  29. arXiv:1904.03257  [pdf, ps, other]

    cs.LG cs.DB cs.DC cs.SE stat.ML

    MLSys: The New Frontier of Machine Learning Systems

    Authors: Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, et al. (44 additional authors not shown)

    Abstract: Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically different development and deployment profile of modern ML methods, and the range of practical concerns that come with broader adoption. We propose to foster a ne…

    Submitted 1 December, 2019; v1 submitted 29 March, 2019; originally announced April 2019.

  30. arXiv:1903.05176  [pdf, other]

    cs.LG stat.ML

    Exploiting Reuse in Pipeline-Aware Hyperparameter Tuning

    Authors: Liam Li, Evan Sparks, Kevin Jamieson, Ameet Talwalkar

    Abstract: Hyperparameter tuning of multi-stage pipelines introduces a significant computational burden. Motivated by the observation that work can be reused across pipelines if the intermediate computations are the same, we propose a pipeline-aware approach to hyperparameter tuning. Our approach optimizes both the design and execution of pipelines to maximize reuse. We design pipelines amenable for reuse by…

    Submitted 12 March, 2019; originally announced March 2019.

  31. arXiv:1811.06149   

    stat.ML cs.LG

    Pure-Exploration for Infinite-Armed Bandits with General Arm Reservoirs

    Authors: Maryam Aziz, Kevin Jamieson, Javed Aslam

    Abstract: This paper considers a multi-armed bandit game where the number of arms is much larger than the maximum budget and is effectively infinite. We characterize necessary and sufficient conditions on the total budget for an algorithm to return an ε-good arm with probability at least 1 - δ. In such situations, the sample complexity depends on ε, δ and the so-called reservoir distribution ν from which th…

    Submitted 12 January, 2019; v1 submitted 14 November, 2018; originally announced November 2018.

    Comments: We found an irrecoverable error in one of the proofs

  32. arXiv:1810.05934  [pdf, other]

    cs.LG stat.ML

    A System for Massively Parallel Hyperparameter Tuning

    Authors: Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, Ameet Talwalkar

    Abstract: Modern learning models are characterized by large hyperparameter spaces and long training times. These properties, coupled with the rise of parallel computing and the growing demand to productionize machine learning workloads, motivate the need to develop mature hyperparameter optimization functionality in distributed computing settings. We address this challenge by first introducing a simple and…

    Submitted 15 March, 2020; v1 submitted 13 October, 2018; originally announced October 2018.

    Comments: v2: Corrected typo in Algorithm 1. v3: Added comparison to BOHB and parallel version of synchronous SHA. Add PBT to experiment in Section 4.3.1. v4: Added acknowledgements and slight edit to related work.

    Journal ref: Conference on Machine Learning and Systems 2020

  33. arXiv:1809.02235  [pdf, other]

    stat.ML cs.LG

    A Bandit Approach to Multiple Testing with False Discovery Control

    Authors: Kevin Jamieson, Lalit Jain

    Abstract: We propose an adaptive sampling approach for multiple testing which aims to maximize statistical power while ensuring anytime false discovery control. We consider $n$ distributions whose means are partitioned by whether they are below or equal to a baseline (nulls), versus above the baseline (actual positives). In addition, each distribution can be sequentially and repeatedly sampled. Inspired by…

    Submitted 16 July, 2019; v1 submitted 6 September, 2018; originally announced September 2018.

  34. arXiv:1808.04523  [pdf, other]

    cs.LG math.ST stat.ML

    Adaptive Sampling for Convex Regression

    Authors: Max Simchowitz, Kevin Jamieson, Jordan W. Suchow, Thomas L. Griffiths

    Abstract: In this paper, we introduce the first principled adaptive-sampling procedure for learning a convex function in the $L_\infty$ norm, a problem that arises often in the behavioral and social sciences. We present a function-specific measure of complexity and use it to prove that, for each convex function $f_{\star}$, our algorithm nearly attains the information-theoretically optimal, function-specifi…

    Submitted 26 August, 2018; v1 submitted 14 August, 2018; originally announced August 2018.

  35. arXiv:1706.05378  [pdf, other]

    stat.ML cs.LG stat.ME

    A framework for Multi-A(rmed)/B(andit) testing with online FDR control

    Authors: Fanny Yang, Aaditya Ramdas, Kevin Jamieson, Martin J. Wainwright

    Abstract: We propose an alternative framework to existing setups for controlling false alarms when multiple A/B tests are run over time. This setup arises in many practical applications, e.g. when pharmaceutical companies test new treatment options against control pills for different diseases, or when internet companies test their default webpages versus various alternatives over time. Our framework propose…

    Submitted 18 November, 2017; v1 submitted 16 June, 2017; originally announced June 2017.

    Comments: Published as a conference paper at NIPS 2017

  36. arXiv:1706.01566  [pdf, other]

    stat.ML cs.LG

    Open Loop Hyperparameter Optimization and Determinantal Point Processes

    Authors: Jesse Dodge, Kevin Jamieson, Noah A. Smith

    Abstract: Driven by the need for parallelizable hyperparameter optimization methods, this paper studies \emph{open loop} search methods: sequences that are predetermined and can be generated before a single configuration is evaluated. Examples include grid search, uniform random search, low discrepancy sequences, and other sampling distributions. In particular, we propose the use of $k$-determinantal point…

    Submitted 8 May, 2019; v1 submitted 5 June, 2017; originally announced June 2017.

  37. arXiv:1702.05186  [pdf, other]

    cs.LG stat.ML

    The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime

    Authors: Max Simchowitz, Kevin Jamieson, Benjamin Recht

    Abstract: We propose a novel technique for analyzing adaptive sampling called the {\em Simulator}. Our approach differs from the existing methods by considering not how much information could be gathered by any fixed sampling strategy, but how difficult it is to distinguish a good sampling strategy from a bad one given the limited amount of data collected up to any given time. This change of perspective all…

    Submitted 23 April, 2023; v1 submitted 16 February, 2017; originally announced February 2017.

  38. arXiv:1606.07081  [pdf, other]

    stat.ML cs.LG

    Finite Sample Prediction and Recovery Bounds for Ordinal Embedding

    Authors: Lalit Jain, Kevin Jamieson, Robert Nowak

    Abstract: The goal of ordinal embedding is to represent items as points in a low-dimensional Euclidean space given a set of constraints in the form of distance comparisons like "item $i$ is closer to item $j$ than item $k$". Ordinal constraints like this often come from human judgments. To account for errors and variation in judgments, we consider the noisy situation in which the given constraints are indep…

    Submitted 22 June, 2016; originally announced June 2016.

  39. arXiv:1603.06560  [pdf, other]

    cs.LG stat.ML

    Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization

    Authors: Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, Ameet Talwalkar

    Abstract: Performance of machine learning algorithms depends critically on identifying a good set of hyperparameters. While recent approaches use Bayesian optimization to adaptively select configurations, we focus on speeding up random search through adaptive resource allocation and early-stopping. We formulate hyperparameter optimization as a pure-exploration non-stochastic infinite-armed bandit problem wh…

    Submitted 18 June, 2018; v1 submitted 21 March, 2016; originally announced March 2016.

    Comments: Changes: Updated to JMLR version

    Journal ref: Journal of Machine Learning Research 18 (2018) 1-52

  40. arXiv:1603.02752  [pdf, ps, other]

    cs.LG stat.ML

    Best-of-K Bandits

    Authors: Max Simchowitz, Kevin Jamieson, Benjamin Recht

    Abstract: This paper studies the Best-of-K Bandit game: At each time the player chooses a subset S among all N-choose-K possible options and observes reward max(X(i) : i in S) where X is a random vector drawn from a joint distribution. The objective is to identify the subset that achieves the highest expected reward with high probability using as few queries as possible. We present distribution-dependent lo…

    Submitted 18 March, 2016; v1 submitted 8 March, 2016; originally announced March 2016.

  41. arXiv:1502.07943  [pdf, other]

    cs.LG stat.ML

    Non-stochastic Best Arm Identification and Hyperparameter Optimization

    Authors: Kevin Jamieson, Ameet Talwalkar

    Abstract: Motivated by the task of hyperparameter optimization, we introduce the non-stochastic best-arm identification problem. Within the multi-armed bandit literature, the cumulative regret objective enjoys algorithms and analyses for both the non-stochastic and stochastic settings while to the best of our knowledge, the best-arm identification framework has only been considered in the stochastic setting…

    Submitted 27 February, 2015; originally announced February 2015.

  42. arXiv:1502.00133  [pdf, other]

    stat.ML cs.LG

    Sparse Dueling Bandits

    Authors: Kevin Jamieson, Sumeet Katariya, Atul Deshpande, Robert Nowak

    Abstract: The dueling bandit problem is a variation of the classical multi-armed bandit in which the allowable actions are noisy comparisons between pairs of arms. This paper focuses on a new approach for finding the "best" arm according to the Borda criterion using noisy comparisons. We prove that in the absence of structural assumptions, the sample complexity of this problem is proportional to the sum of…

    Submitted 31 January, 2015; originally announced February 2015.

  43. arXiv:1312.7308  [pdf, other]

    stat.ML cs.LG

    lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits

    Authors: Kevin Jamieson, Matthew Malloy, Robert Nowak, Sébastien Bubeck

    Abstract: The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples. The procedure cannot be improved in the sense that the number of samples required to identify the best arm is within a constant factor of a lower bound based on the law of the iterated log…

    Submitted 27 December, 2013; originally announced December 2013.

  44. arXiv:1306.3917  [pdf, ps, other]

    stat.ML cs.LG

    On Finding the Largest Mean Among Many

    Authors: Kevin Jamieson, Matthew Malloy, Robert Nowak, Sebastien Bubeck

    Abstract: Sampling from distributions to find the one with the largest mean arises in a broad range of applications, and it can be mathematically modeled as a multi-armed bandit problem in which each distribution is associated with an arm. This paper studies the sample complexity of identifying the best arm (largest mean) in a multi-armed bandit problem. Motivated by large-scale applications, we are especia…

    Submitted 17 June, 2013; originally announced June 2013.

  45. arXiv:1209.2434  [pdf, ps, other]

    stat.ML cs.LG

    Query Complexity of Derivative-Free Optimization

    Authors: Kevin G. Jamieson, Robert D. Nowak, Benjamin Recht

    Abstract: This paper provides lower bounds on the convergence rate of Derivative Free Optimization (DFO) with noisy function evaluations, exposing a fundamental and unavoidable gap between the performance of algorithms with access to gradients and those with access to only function evaluations. However, there are situations in which DFO is unavoidable, and for such situations we propose a new DFO algorithm…

    Submitted 11 September, 2012; originally announced September 2012.

  46. arXiv:1109.3701  [pdf, other]

    cs.LG cs.IT stat.ML

    Active Ranking using Pairwise Comparisons

    Authors: Kevin G. Jamieson, Robert D. Nowak

    Abstract: This paper examines the problem of ranking a collection of objects using pairwise comparisons (rankings of two objects). In general, the ranking of $n$ objects can be identified by standard sorting methods using $n \log_2 n$ pairwise comparisons. We are interested in natural situations in which relationships among the objects may allow for ranking using far fewer pairwise comparisons. Specifically,…

    Submitted 9 December, 2011; v1 submitted 16 September, 2011; originally announced September 2011.

    Comments: 17 pages, an extended version of our NIPS 2011 paper. The new version revises the argument of the robust section and slightly modifies the result there to give it more impact