Showing 1–10 of 10 results for author: Cappé, O

  1. arXiv:2210.05222  [pdf, other]

    cs.AI math.ST

    Stochastic Direct Search Method for Blind Resource Allocation

    Authors: Juliette Achddou, Olivier Cappe, Aurélien Garivier

    Abstract: Motivated by programmatic advertising optimization, we consider the task of sequentially allocating budget across a set of resources. At every time step, a feasible allocation is chosen and only a corresponding random return is observed. The goal is to maximize the cumulative expected sum of returns. This is a realistic model for budget allocation across subdivisions of marketing campaigns, with t…

    Submitted 1 October, 2024; v1 submitted 11 October, 2022; originally announced October 2022.

    Journal ref: Transactions on Machine Learning Research Journal, 2024

  2. arXiv:1606.02448  [pdf, other]

    cs.LG math.ST

    Multiple-Play Bandits in the Position-Based Model

    Authors: Paul Lagrée, Claire Vernade, Olivier Cappé

    Abstract: Sequentially learning to place items in multi-position displays or lists is a task that can be cast into the multiple-play semi-bandit setting. However, a major concern in this context is when the system cannot decide whether the user feedback for each item is actually exploitable. Indeed, much of the content may have been simply ignored by the user. The present work proposes to exploit available…

    Submitted 8 June, 2016; originally announced June 2016.

  3. arXiv:1405.3224  [pdf, other]

    math.ST cs.LG stat.ML

    On the Complexity of A/B Testing

    Authors: Emilie Kaufmann, Olivier Cappé, Aurélien Garivier

    Abstract: A/B testing refers to the task of determining the best option among two alternatives that yield random outcomes. We provide distribution-dependent lower bounds for the performance of A/B testing that improve over the results currently available both in the fixed-confidence (or delta-PAC) and fixed-budget settings. When the distribution of the outcomes are Gaussian, we prove that the complexity…

    Submitted 24 February, 2015; v1 submitted 13 May, 2014; originally announced May 2014.

    Journal ref: Conference on Learning Theory, Jun 2014, Barcelona, Spain. JMLR: Workshop and Conference Proceedings, 35, pp.461-481

  4. arXiv:1210.2601  [pdf, ps, other]

    stat.CO math.PR math.ST stat.ME

    Adaptive MCMC with online relabeling

    Authors: Rémi Bardenet, Olivier Cappé, Gersende Fort, Balázs Kégl

    Abstract: When targeting a distribution that is artificially invariant under some permutations, Markov chain Monte Carlo (MCMC) algorithms face the label-switching problem, rendering marginal inference particularly cumbersome. Such a situation arises, for example, in the Bayesian analysis of finite mixture models. Adaptive MCMC algorithms such as adaptive Metropolis (AM), which self-calibrates its proposal…

    Submitted 27 July, 2015; v1 submitted 9 October, 2012; originally announced October 2012.

    Comments: Published at http://dx.doi.org/10.3150/13-BEJ578 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

    Report number: IMS-BEJ-BEJ578

    Journal ref: Bernoulli 2015, Vol. 21, No. 3, 1304-1340

  5. arXiv:1210.1136  [pdf, ps, other]

    math.PR math.ST

    Kullback-Leibler upper confidence bounds for optimal sequential allocation

    Authors: Olivier Cappé, Aurélien Garivier, Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz

    Abstract: We consider optimal sequential allocation in the context of the so-called stochastic multi-armed bandit model. We describe a generic index policy, in the sense of Gittins [J. R. Stat. Soc. Ser. B Stat. Methodol. 41 (1979) 148-177], based on upper confidence bounds of the arm payoffs computed using the Kullback-Leibler divergence. We consider two classes of distributions for which instances of this…

    Submitted 26 August, 2013; v1 submitted 3 October, 2012; originally announced October 2012.

    Comments: Published at http://dx.doi.org/10.1214/13-AOS1119 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1119

    Journal ref: Annals of Statistics 2013, Vol. 41, No. 3, 1516-1541

  6. arXiv:1107.1971  [pdf, ps, other]

    math.ST

    Homogeneity and change-point detection tests for multivariate data using rank statistics

    Authors: Alexandre Lung-Yut-Fong, Céline Lévy-Leduc, Olivier Cappé

    Abstract: Detecting and locating changes in highly multivariate data is a major concern in several current statistical applications. In this context, the first contribution of the paper is a novel non-parametric two-sample homogeneity test for multivariate data based on the well-known Wilcoxon rank statistic. The proposed two-sample homogeneity test statistic can be extended to deal with ordinal or censored…

    Submitted 9 February, 2012; v1 submitted 11 July, 2011; originally announced July 2011.

    Comments: 30 pages, submitted

  7. arXiv:1102.2490  [pdf, ps, other]

    math.ST cs.LG eess.SY math.OC

    The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond

    Authors: Aurélien Garivier, Olivier Cappé

    Abstract: This paper presents a finite-time analysis of the KL-UCB algorithm, an online, horizon-free index policy for stochastic bandit problems. We prove two distinct results: first, for arbitrary bounded rewards, the KL-UCB algorithm satisfies a uniformly better regret bound than UCB or UCB2; second, in the special case of Bernoulli rewards, it reaches the lower bound of Lai and Robbins. Furthermore, we…

    Submitted 29 August, 2013; v1 submitted 12 February, 2011; originally announced February 2011.

    Comments: 18 pages, 3 figures; Conf. Comput. Learning Theory (COLT) 2011 in Budapest, Hungary

    MSC Class: 93E35

    Journal ref: Conference On Learning Theory n°24 Jul. 2011 pp.359-376

  8. arXiv:1004.5229  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Optimism in Reinforcement Learning and Kullback-Leibler Divergence

    Authors: Sarah Filippi, Olivier Cappé, Aurélien Garivier

    Abstract: We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. In MDPs, optimism can be implemented by carrying out extended value iterations under a constraint of consistency with the estimated model transition probabilities. The UCRL2 algorithm by Auer, Jaksch and Ortner (2009), which follows this strategy, has recen…

    Submitted 13 October, 2010; v1 submitted 29 April, 2010; originally announced April 2010.

    Comments: This work has been accepted and presented at Allerton 2010: 48th Annual Allerton Conference on Communication, Control, and Computing, Monticello, Illinois, United States (2010)

  9. arXiv:0909.5524  [pdf, ps, other]

    stat.AP cs.NI math.ST

    Distributed detection/localization of change-points in high-dimensional network traffic data

    Authors: Alexandre Lung-Yut-Fong, Céline Lévy-Leduc, Olivier Cappé

    Abstract: We propose a novel approach for distributed statistical detection of change-points in high-volume network traffic. We consider more specifically the task of detecting and identifying the targets of Distributed Denial of Service (DDoS) attacks. The proposed algorithm, called DTopRank, performs distributed network anomaly detection by aggregating the partial information gathered in a set of network…

    Submitted 20 September, 2011; v1 submitted 30 September, 2009; originally announced September 2009.

    Comments: Statistics and Computing (2011) 1-12

  10. Sequential Monte Carlo smoothing with application to parameter estimation in non-linear state space models

    Authors: Jimmy Olsson, Olivier Cappé, Randal Douc, Eric Moulines

    Abstract: This paper concerns the use of sequential Monte Carlo methods (SMC) for smoothing in general state space models. A well-known problem when applying the standard SMC technique in the smoothing mode is that the resampling mechanism introduces degeneracy of the approximation in the path space. However, when performing maximum likelihood estimation via the EM algorithm, all functionals involved are…

    Submitted 6 March, 2008; v1 submitted 19 September, 2006; originally announced September 2006.

    Comments: Published at http://dx.doi.org/10.3150/07-BEJ6150 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

    Journal ref: Bernoulli 14, 1 (2008) 155-179