Showing 1–26 of 26 results for author: Seldin, Y

  1. arXiv:2405.14681  [pdf, other]

    cs.LG stat.ML

    Recursive PAC-Bayes: A Frequentist Approach to Sequential Prior Updates with No Information Loss

    Authors: Yi-Shan Wu, Yijie Zhang, Badr-Eddine Chérief-Abdellatif, Yevgeny Seldin

    Abstract: PAC-Bayesian analysis is a frequentist framework for incorporating prior knowledge into learning. It was inspired by Bayesian learning, which allows sequential data processing and naturally turns posteriors from one processing step into priors for the next. However, despite two and a half decades of research, the ability to update priors sequentially without losing confidence information along the…

    Submitted 8 April, 2025; v1 submitted 23 May, 2024; originally announced May 2024.

  2. arXiv:2308.10675  [pdf, ps, other]

    cs.LG stat.ML

    A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays

    Authors: Saeed Masoudian, Julian Zimmert, Yevgeny Seldin

    Abstract: We propose a new best-of-both-worlds algorithm for bandits with variably delayed feedback. In contrast to prior work, which required prior knowledge of the maximal delay $d_{\mathrm{max}}$ and had a linear dependence of the regret on it, our algorithm can tolerate arbitrary excessive delays up to order $T$ (where $T$ is the time horizon). The algorithm is based on three technical innovations, whic…

    Submitted 27 May, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

  3. arXiv:2305.19036  [pdf, other]

    cs.LG

    Delayed Bandits: When Do Intermediate Observations Help?

    Authors: Emmanuel Esposito, Saeed Masoudian, Hao Qiu, Dirk van der Hoeven, Nicolò Cesa-Bianchi, Yevgeny Seldin

    Abstract: We study a $K$-armed bandit with delayed feedback and intermediate observations. We consider a model where intermediate observations have a form of a finite state, which is observed immediately after taking an action, whereas the loss is observed after an adversarially chosen delay. We show that the regime of the mapping of states to losses determines the complexity of the problem, irrespective of…

    Submitted 30 May, 2023; originally announced May 2023.

  4. arXiv:2206.14906  [pdf, ps, other]

    cs.LG stat.ML

    A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback

    Authors: Saeed Masoudian, Julian Zimmert, Yevgeny Seldin

    Abstract: We present a modified tuning of the algorithm of Zimmert and Seldin [2020] for adversarial multiarmed bandits with delayed feedback, which in addition to the minimax optimal adversarial regret guarantee shown by Zimmert and Seldin simultaneously achieves a near-optimal regret guarantee in the stochastic setting with fixed delays. Specifically, the adversarial regret guarantee is…

    Submitted 29 June, 2022; originally announced June 2022.

  5. arXiv:2206.00706  [pdf, other]

    stat.ML cs.LG

    Split-kl and PAC-Bayes-split-kl Inequalities for Ternary Random Variables

    Authors: Yi-Shan Wu, Yevgeny Seldin

    Abstract: We present a new concentration of measure inequality for sums of independent bounded random variables, which we name a split-kl inequality. The inequality is particularly well-suited for ternary random variables, which naturally show up in a variety of problems, including analysis of excess losses in classification, analysis of weighted majority votes, and learning with abstention. We demonstrate…

    Submitted 17 January, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: aligned with the camera-ready version published at NeurIPS 2022

  6. arXiv:2206.00557  [pdf, ps, other]

    cs.LG

    A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs

    Authors: Chloé Rouyer, Dirk van der Hoeven, Nicolò Cesa-Bianchi, Yevgeny Seldin

    Abstract: We consider online learning with feedback graphs, a sequential decision-making framework where the learner's feedback is determined by a directed graph over the action set. We present a computationally efficient algorithm for learning in this framework that simultaneously achieves near-optimal regret bounds in both stochastic and adversarial environments. The bound against oblivious adversaries is…

    Submitted 1 June, 2022; originally announced June 2022.

  7. arXiv:2106.13624  [pdf, other]

    cs.LG stat.ML

    Chebyshev-Cantelli PAC-Bayes-Bennett Inequality for the Weighted Majority Vote

    Authors: Yi-Shan Wu, Andrés R. Masegosa, Stephan S. Lorenzen, Christian Igel, Yevgeny Seldin

    Abstract: We present a new second-order oracle bound for the expected risk of a weighted majority vote. The bound is based on a novel parametric form of the Chebyshev-Cantelli inequality (a.k.a. one-sided Chebyshev's), which is amenable to efficient minimization. The new form resolves the optimization challenge faced by prior oracle bounds based on the Chebyshev-Cantelli inequality, the C-bounds [Germain e…

    Submitted 17 January, 2023; v1 submitted 25 June, 2021; originally announced June 2021.

    Comments: aligned with the camera-ready version published at NeurIPS 2021

  8. arXiv:2103.12487  [pdf, ps, other]

    cs.LG stat.ML

    Improved Analysis of the Tsallis-INF Algorithm in Stochastically Constrained Adversarial Bandits and Stochastic Bandits with Adversarial Corruptions

    Authors: Saeed Masoudian, Yevgeny Seldin

    Abstract: We derive improved regret bounds for the Tsallis-INF algorithm of Zimmert and Seldin (2021). We show that in adversarial regimes with a $(Δ,C,T)$ self-bounding constraint the algorithm achieves…

    Submitted 13 September, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: Published Version in COLT 2021

    Journal ref: Conference on Learning Theory 134 (2021) 3330-3350

  9. arXiv:2102.09864  [pdf, other]

    cs.LG stat.ML

    An Algorithm for Stochastic and Adversarial Bandits with Switching Costs

    Authors: Chloé Rouyer, Yevgeny Seldin, Nicolò Cesa-Bianchi

    Abstract: We propose an algorithm for stochastic and adversarial multiarmed bandits with switching costs, where the algorithm pays a price $λ$ every time it switches the arm being played. Our algorithm is based on adaptation of the Tsallis-INF algorithm of Zimmert and Seldin (2021) and requires no prior knowledge of the regime or time horizon. In the oblivious adversarial setting it achieves the minimax opt…

    Submitted 19 February, 2021; originally announced February 2021.

  10. arXiv:2007.13532  [pdf, other]

    cs.LG stat.ML

    Second Order PAC-Bayesian Bounds for the Weighted Majority Vote

    Authors: Andrés R. Masegosa, Stephan S. Lorenzen, Christian Igel, Yevgeny Seldin

    Abstract: We present a novel analysis of the expected risk of weighted majority vote in multiclass classification. The analysis takes correlation of predictions by ensemble members into account and provides a bound that is amenable to efficient minimization, which yields improved weighting for the majority vote. We also provide a specialized version of our bound for binary classification, which allows to ex…

    Submitted 17 December, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

  11. arXiv:1910.06054  [pdf, ps, other]

    cs.LG stat.ML

    An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays

    Authors: Julian Zimmert, Yevgeny Seldin

    Abstract: We propose a new algorithm for adversarial multi-armed bandits with unrestricted delays. The algorithm is based on a novel hybrid regularizer applied in the Follow the Regularized Leader (FTRL) framework. It achieves $\mathcal{O}(\sqrt{kn}+\sqrt{D\log(k)})$ regret guarantee, where $k$ is the number of arms, $n$ is the number of rounds, and $D$ is the total delay. The result matches the lower bound…

    Submitted 16 June, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

  12. arXiv:1906.00670  [pdf, other]

    cs.LG stat.ML

    Nonstochastic Multiarmed Bandits with Unrestricted Delays

    Authors: Tobias Sommer Thune, Nicolò Cesa-Bianchi, Yevgeny Seldin

    Abstract: We investigate multiarmed bandits with delayed feedback, where the delays need neither be identical nor bounded. We first prove that "delayed" Exp3 achieves the $O(\sqrt{(KT + D)\ln K} )$ regret bound conjectured by Cesa-Bianchi et al. [2019] in the case of variable, but bounded delays. Here, $K$ is the number of actions and $D$ is the total delay over $T$ rounds. We then introduce a new algorithm…

    Submitted 19 November, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: 9 pages, NeurIPS camera-ready

  13. arXiv:1810.09746  [pdf, ps, other]

    cs.LG stat.ML

    On PAC-Bayesian Bounds for Random Forests

    Authors: Stephan Sloth Lorenzen, Christian Igel, Yevgeny Seldin

    Abstract: Existing guarantees in terms of rigorous upper bounds on the generalization error for the original random forest algorithm, one of the most frequently used machine learning methods, are unsatisfying. We discuss and evaluate various PAC-Bayesian approaches to derive such bounds. The bounds do not require additional hold-out data, because the out-of-bag samples from the bagging in the training proce…

    Submitted 6 March, 2019; v1 submitted 23 October, 2018; originally announced October 2018.

  14. arXiv:1807.07623  [pdf, other]

    cs.LG stat.ML

    Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits

    Authors: Julian Zimmert, Yevgeny Seldin

    Abstract: We derive an algorithm that achieves the optimal (within constants) pseudo-regret in both adversarial and stochastic multi-armed bandits without prior knowledge of the regime and time horizon. The algorithm is based on online mirror descent (OMD) with Tsallis entropy regularization with power $α=1/2$ and reduced-variance loss estimators. More generally, we define an adversarial regime with a self-…

    Submitted 2 March, 2022; v1 submitted 19 July, 2018; originally announced July 2018.

  15. arXiv:1807.01488  [pdf, ps, other]

    cs.LG stat.ML

    Factored Bandits

    Authors: Julian Zimmert, Yevgeny Seldin

    Abstract: We introduce the factored bandits model, which is a framework for learning with limited (bandit) feedback, where actions can be decomposed into a Cartesian product of atomic actions. Factored bandits incorporate rank-1 bandits as a special case, but significantly relax the assumptions on the form of the reward function. We provide an anytime algorithm for stochastic factored bandits and up to cons…

    Submitted 29 October, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

  16. arXiv:1807.00636  [pdf, other]

    cs.LG stat.ML

    Adaptation to Easy Data in Prediction with Limited Advice

    Authors: Tobias Sommer Thune, Yevgeny Seldin

    Abstract: We derive an online learning algorithm with improved regret guarantees for 'easy' loss sequences. We consider two types of 'easiness': (a) stochastic loss sequences and (b) adversarial loss sequences with small effective range of the losses. While a number of algorithms have been proposed for exploiting small effective range in the full information setting, Gerchinovitz and Lattimore [2016] have s…

    Submitted 27 August, 2019; v1 submitted 2 July, 2018; originally announced July 2018.

    Comments: Fixed a mistake in the proof and statement of Theorem 3

  17. arXiv:1702.06103  [pdf, ps, other]

    cs.LG stat.ML

    An Improved Parametrization and Analysis of the EXP3++ Algorithm for Stochastic and Adversarial Bandits

    Authors: Yevgeny Seldin, Gábor Lugosi

    Abstract: We present a new strategy for gap estimation in randomized algorithms for multiarmed bandits and combine it with the EXP3++ algorithm of Seldin and Slivkins (2014). In the stochastic regime the strategy reduces dependence of regret on a time horizon from $(\ln t)^3$ to $(\ln t)^2$ and eliminates an additive factor of order $Δe^{1/Δ^2}$, where $Δ$ is the minimal gap of a problem instance. In the ad…

    Submitted 9 May, 2017; v1 submitted 20 February, 2017; originally announced February 2017.

  18. arXiv:1608.06253  [pdf, other]

    cs.IR cs.LG stat.ML

    Multi-Dueling Bandits and Their Application to Online Ranker Evaluation

    Authors: Brian Brost, Yevgeny Seldin, Ingemar J. Cox, Christina Lioma

    Abstract: New ranking algorithms are continually being developed and refined, necessitating the development of efficient methods for evaluating these rankers. Online ranker evaluation focuses on the challenge of efficiently determining, from implicit user feedback, which ranker out of a finite set of rankers is the best. Online ranker evaluation can be modeled by dueling bandits, a mathematical model for…

    Submitted 22 August, 2016; originally announced August 2016.

  19. arXiv:1608.05610  [pdf, other]

    cs.LG stat.ML

    A Strongly Quasiconvex PAC-Bayesian Bound

    Authors: Niklas Thiemann, Christian Igel, Olivier Wintenberger, Yevgeny Seldin

    Abstract: We propose a new PAC-Bayesian bound and a way of constructing a hypothesis space, so that the bound is convex in the posterior distribution and also convex in a trade-off parameter between empirical performance of the posterior distribution and its complexity. The complexity is measured by the Kullback-Leibler divergence to a prior. We derive an alternating procedure for minimizing the bound. We s…

    Submitted 24 August, 2017; v1 submitted 19 August, 2016; originally announced August 2016.

  20. arXiv:1608.00788  [pdf, other]

    cs.IR

    An Improved Multileaving Algorithm for Online Ranker Evaluation

    Authors: Brian Brost, Ingemar J. Cox, Yevgeny Seldin, Christina Lioma

    Abstract: Online ranker evaluation is a key challenge in information retrieval. An important task in the online evaluation of rankers is using implicit user feedback for inferring preferences between rankers. Interleaving methods have been found to be efficient and sensitive, i.e. they can quickly detect even small differences in quality. It has recently been shown that multileaving methods exhibit similar…

    Submitted 2 August, 2016; originally announced August 2016.

  21. arXiv:1304.3708  [pdf, ps, other]

    cs.LG stat.ML

    Advice-Efficient Prediction with Expert Advice

    Authors: Yevgeny Seldin, Peter Bartlett, Koby Crammer

    Abstract: Advice-efficient prediction with expert advice (in analogy to label-efficient prediction) is a variant of prediction with expert advice game, where on each round of the game we are allowed to ask for advice of a limited number $M$ out of $N$ experts. This setting is especially interesting when asking for advice of every expert on every round is expensive. We present an algorithm for advice-efficie…

    Submitted 12 April, 2013; originally announced April 2013.

  22. arXiv:1110.6886  [pdf, other]

    cs.LG cs.IT stat.ML

    PAC-Bayesian Inequalities for Martingales

    Authors: Yevgeny Seldin, François Laviolette, Nicolò Cesa-Bianchi, John Shawe-Taylor, Peter Auer

    Abstract: We present a set of high-probability inequalities that control the concentration of weighted averages of multiple (possibly uncountably many) simultaneously evolving and interdependent martingales. Our results extend the PAC-Bayesian analysis in learning theory from the i.i.d. setting to martingales opening the way for its application to importance weighted sampling, reinforcement learning, and ot…

    Submitted 30 July, 2012; v1 submitted 31 October, 2011; originally announced October 2011.

  23. arXiv:1110.6755  [pdf, other]

    cs.LG

    PAC-Bayes-Bernstein Inequality for Martingales and its Application to Multiarmed Bandits

    Authors: Yevgeny Seldin, Nicolò Cesa-Bianchi, Peter Auer, François Laviolette, John Shawe-Taylor

    Abstract: We develop a new tool for data-dependent analysis of the exploration-exploitation trade-off in learning under limited feedback. Our tool is based on two main ingredients. The first ingredient is a new concentration inequality that makes it possible to control the concentration of weighted averages of multiple (possibly uncountably many) simultaneously evolving and interdependent martingales. The s…

    Submitted 30 January, 2012; v1 submitted 31 October, 2011; originally announced October 2011.

  24. arXiv:1105.4585  [pdf, ps, other]

    cs.LG stat.ML

    PAC-Bayesian Analysis of the Exploration-Exploitation Trade-off

    Authors: Yevgeny Seldin, Nicolò Cesa-Bianchi, François Laviolette, Peter Auer, John Shawe-Taylor, Jan Peters

    Abstract: We develop a coherent framework for integrative simultaneous analysis of the exploration-exploitation and model order selection trade-offs. We improve over our preceding results on the same subject (Seldin et al., 2011) by combining PAC-Bayesian analysis with Bernstein-type inequality for martingales. Such a combination is also of independent interest for studies of multiple simultaneously evolvin…

    Submitted 23 May, 2011; originally announced May 2011.

    Comments: On-line Trading of Exploration and Exploitation 2 - ICML-2011 workshop. http://explo.cs.ucl.ac.uk/workshop/

  25. arXiv:1105.2416  [pdf, ps, other]

    cs.LG stat.ML

    PAC-Bayesian Analysis of Martingales and Multiarmed Bandits

    Authors: Yevgeny Seldin, François Laviolette, John Shawe-Taylor, Jan Peters, Peter Auer

    Abstract: We present two alternative ways to apply PAC-Bayesian analysis to sequences of dependent random variables. The first is based on a new lemma that enables to bound expectations of convex functions of certain dependent random variables by expectations of the same functions of independent Bernoulli random variables. This lemma provides an alternative tool to Hoeffding-Azuma inequality to bound concen…

    Submitted 19 May, 2011; v1 submitted 12 May, 2011; originally announced May 2011.

  26. arXiv:1009.0499  [pdf, other]

    cs.LG cs.DS stat.ML

    A PAC-Bayesian Analysis of Graph Clustering and Pairwise Clustering

    Authors: Yevgeny Seldin

    Abstract: We formulate weighted graph clustering as a prediction problem: given a subset of edge weights we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering as well as comparison of graph clustering with other possible ways to model the graph. We adapt the PAC-Bayesian anal…

    Submitted 2 September, 2010; originally announced September 2010.