Search | arXiv e-print repository

arXiv:2411.02023 [pdf, other]

Optimal Classification under Performative Distribution Shift

Authors: Edwige Cyffers, Muni Sreenivas Pydi, Jamal Atif, Olivier Cappé

Abstract: Performative learning addresses the increasingly pervasive situations in which algorithmic decisions may induce changes in the data distribution as a consequence of their public deployment. We propose a novel view in which these performative effects are modelled as push-forward measures. This general framework encompasses existing models and enables novel performative gradient estimation methods,… ▽ More Performative learning addresses the increasingly pervasive situations in which algorithmic decisions may induce changes in the data distribution as a consequence of their public deployment. We propose a novel view in which these performative effects are modelled as push-forward measures. This general framework encompasses existing models and enables novel performative gradient estimation methods, leading to more efficient and scalable learning strategies. For distribution shifts, unlike previous models which require full specification of the data distribution, we only assume knowledge of the shift operator that represents the performative changes. This approach can also be integrated into various change-of-variablebased models, such as VAEs or normalizing flows. Focusing on classification with a linear-in-parameters performative effect, we prove the convexity of the performative risk under a new set of assumptions. Notably, we do not limit the strength of performative effects but rather their direction, requiring only that classification becomes harder when deploying more accurate models. In this case, we also establish a connection with adversarially robust classification by reformulating the minimization of the performative risk as a min-max variational problem. Finally, we illustrate our approach on synthetic and real datasets. △ Less

Submitted 4 November, 2024; originally announced November 2024.

Comments: 38th Conference on Neural Information Processing Systems, Dec 2024, Vancouver (Canada), Canada

arXiv:2110.15573 [pdf, other]

A/B/n Testing with Control in the Presence of Subpopulations

Authors: Yoan Russac, Christina Katsimerou, Dennis Bohle, Olivier Cappé, Aurélien Garivier, Wouter Koolen

Abstract: Motivated by A/B/n testing applications, we consider a finite set of distributions (called \emph{arms}), one of which is treated as a \emph{control}. We assume that the population is stratified into homogeneous subpopulations. At every time step, a subpopulation is sampled and an arm is chosen: the resulting observation is an independent draw from the arm conditioned on the subpopulation. The qual… ▽ More Motivated by A/B/n testing applications, we consider a finite set of distributions (called \emph{arms}), one of which is treated as a \emph{control}. We assume that the population is stratified into homogeneous subpopulations. At every time step, a subpopulation is sampled and an arm is chosen: the resulting observation is an independent draw from the arm conditioned on the subpopulation. The quality of each arm is assessed through a weighted combination of its subpopulation means. We propose a strategy for sequentially choosing one arm per time step so as to discover as fast as possible which arms, if any, have higher weighted expectation than the control. This strategy is shown to be asymptotically optimal in the following sense: if $τ_δ$ is the first time when the strategy ensures that it is able to output the correct answer with probability at least $1-δ$, then $\mathbb{E}[τ_δ]$ grows linearly with $\log(1/δ)$ at the exact optimal rate. This rate is identified in the paper in three different settings: (1) when the experimenter does not observe the subpopulation information, (2) when the subpopulation of each sample is observed but not chosen, and (3) when the experimenter can select the subpopulation from which each response is sampled. We illustrate the efficiency of the proposed strategy with numerical simulations on synthetic and real data collected from an A/B/n experiment. △ Less

Submitted 29 October, 2021; originally announced October 2021.

Journal ref: NeurIPS 2021, Dec 2021, Virtual, France

arXiv:2107.01835 [pdf, other]

Fast Rate Learning in Stochastic First Price Bidding

Authors: Juliette Achddou, Olivier Cappé, Aurélien Garivier

Abstract: First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising. As far as learning is concerned, first-price auctions are more challenging because the optimal bidding strategy does not only depend on the value of the item but also requires some knowledge of the other bids. They have already given rise to several works in sequen… ▽ More First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising. As far as learning is concerned, first-price auctions are more challenging because the optimal bidding strategy does not only depend on the value of the item but also requires some knowledge of the other bids. They have already given rise to several works in sequential learning, many of which consider models for which the value of the buyer or the opponents' maximal bid is chosen in an adversarial manner. Even in the simplest settings, this gives rise to algorithms whose regret grows as $\sqrt{T}$ with respect to the time horizon $T$. Focusing on the case where the buyer plays against a stationary stochastic environment, we show how to achieve significantly lower regret: when the opponents' maximal bid distribution is known we provide an algorithm whose regret can be as low as $\log^2(T)$; in the case where the distribution must be learnt sequentially, a generalization of this algorithm can achieve $T^{1/3+ ε}$ regret, for any $ε>0$. To obtain these results, we introduce two novel ideas that can be of interest in their own right. First, by transposing results obtained in the posted price setting, we provide conditions under which the first-price biding utility is locally quadratic around its optimum. Second, we leverage the observation that, on small sub-intervals, the concentration of the variations of the empirical distribution function may be controlled more accurately than by using the classical Dvoretzky-Kiefer-Wolfowitz inequality. Numerical simulations confirm that our algorithms converge much faster than alternatives proposed in the literature for various bid distributions, including for bids collected on an actual programmatic advertising platform. △ Less

Submitted 22 November, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

Journal ref: ACML 2021 - Proceedings of Machine Learning Research 157, 2021, Nov 2021, SIngapore, Singapore

arXiv:2011.05072 [pdf, other]

Efficient Algorithms for Stochastic Repeated Second-price Auctions

Authors: Juliette Achddou, Olivier Cappé, Aurélien Garivier

Abstract: Developing efficient sequential bidding strategies for repeated auctions is an important practical challenge in various marketing tasks. In this setting, the bidding agent obtains information, on both the value of the item at sale and the behavior of the other bidders, only when she wins the auction. Standard bandit theory does not apply to this problem due to the presence of action-dependent cen… ▽ More Developing efficient sequential bidding strategies for repeated auctions is an important practical challenge in various marketing tasks. In this setting, the bidding agent obtains information, on both the value of the item at sale and the behavior of the other bidders, only when she wins the auction. Standard bandit theory does not apply to this problem due to the presence of action-dependent censoring. In this work, we consider second-price auctions and propose novel, efficient UCB-like algorithms for this task. These algorithms are analyzed in the stochastic setting, assuming regularity of the distribution of the opponents' bids. We provide regret upper bounds that quantify the improvement over the baseline algorithm proposed in the literature. The improvement is particularly significant in cases when the value of the auctioned item is low, yielding a spectacular reduction in the order of the worst-case regret. We further provide the first parametric lower bound for this problem that applies to generic UCB-like strategies. As an alternative, we propose more explainable strategies which are reminiscent of the Explore Then Commit bandit algorithm. We provide a critical analysis of this class of strategies, showing both important advantages and limitations. In particular, we provide a minimax lower bound and propose a nearly minimax-optimal instance of this class. △ Less

Submitted 26 February, 2021; v1 submitted 10 November, 2020; originally announced November 2020.

Journal ref: ALT 2021, Mar 2021, Paris, France

arXiv:2011.00819 [pdf, other]

Self-Concordant Analysis of Generalized Linear Bandits with Forgetting

Authors: Yoan Russac, Louis Faury, Olivier Cappé, Aurélien Garivier

Abstract: Contextual sequential decision problems with categorical or numerical observations are ubiquitous and Generalized Linear Bandits (GLB) offer a solid theoretical framework to address them. In contrast to the case of linear bandits, existing algorithms for GLB have two drawbacks undermining their applicability. First, they rely on excessively pessimistic concentration bounds due to the non-linear na… ▽ More Contextual sequential decision problems with categorical or numerical observations are ubiquitous and Generalized Linear Bandits (GLB) offer a solid theoretical framework to address them. In contrast to the case of linear bandits, existing algorithms for GLB have two drawbacks undermining their applicability. First, they rely on excessively pessimistic concentration bounds due to the non-linear nature of the model. Second, they require either non-convex projection steps or burn-in phases to enforce boundedness of the estimators. Both of these issues are worsened when considering non-stationary models, in which the GLB parameter may vary with time. In this work, we focus on self-concordant GLB (which include logistic and Poisson regression) with forgetting achieved either by the use of a sliding window or exponential weights. We propose a novel confidence-based algorithm for the maximum-likehood estimator with forgetting and analyze its perfomance in abruptly changing environments. These results as well as the accompanying numerical simulations highlight the potential of the proposed approach to address non-stationarity in GLB. △ Less

Submitted 4 March, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

Journal ref: AISTATS 2021 - International Conference on Artificial Intelligence and Statistics, Apr 2021, San Diego / Virtual, United States

arXiv:2006.12843 [pdf, other]

doi 10.1109/TSP.2021.3060000

A Comparative Study of Gamma Markov Chains for Temporal Non-Negative Matrix Factorization

Authors: Louis Filstroff, Olivier Gouvert, Cédric Févotte, Olivier Cappé

Abstract: Non-negative matrix factorization (NMF) has become a well-established class of methods for the analysis of non-negative data. In particular, a lot of effort has been devoted to probabilistic NMF, namely estimation or inference tasks in probabilistic models describing the data, based for example on Poisson or exponential likelihoods. When dealing with time series data, several works have proposed t… ▽ More Non-negative matrix factorization (NMF) has become a well-established class of methods for the analysis of non-negative data. In particular, a lot of effort has been devoted to probabilistic NMF, namely estimation or inference tasks in probabilistic models describing the data, based for example on Poisson or exponential likelihoods. When dealing with time series data, several works have proposed to model the evolution of the activation coefficients as a non-negative Markov chain, most of the time in relation with the Gamma distribution, giving rise to so-called temporal NMF models. In this paper, we review four Gamma Markov chains of the NMF literature, and show that they all share the same drawback: the absence of a well-defined stationary distribution. We then introduce a fifth process, an overlooked model of the time series literature named BGAR(1), which overcomes this limitation. These temporal NMF models are then compared in a MAP framework on a prediction task, in the context of the Poisson likelihood. △ Less

Submitted 25 February, 2021; v1 submitted 23 June, 2020; originally announced June 2020.

Comments: Code available at https://github.com/lfilstro/TemporalNMF

arXiv:2003.10113 [pdf, other]

Algorithms for Non-Stationary Generalized Linear Bandits

Authors: Yoan Russac, Olivier Cappé, Aurélien Garivier

Abstract: The statistical framework of Generalized Linear Models (GLM) can be applied to sequential problems involving categorical or ordinal rewards associated, for instance, with clicks, likes or ratings. In the example of binary rewards, logistic regression is well-known to be preferable to the use of standard linear modeling. Previous works have shown how to deal with GLMs in contextual online learning… ▽ More The statistical framework of Generalized Linear Models (GLM) can be applied to sequential problems involving categorical or ordinal rewards associated, for instance, with clicks, likes or ratings. In the example of binary rewards, logistic regression is well-known to be preferable to the use of standard linear modeling. Previous works have shown how to deal with GLMs in contextual online learning with bandit feedback when the environment is assumed to be stationary. In this paper, we relax this latter assumption and propose two upper confidence bound based algorithms that make use of either a sliding window or a discounted maximum-likelihood estimator. We provide theoretical guarantees on the behavior of these algorithms for general context sequences and in the presence of abrupt changes. These results take the form of high probability upper bounds for the dynamic regret that are of order d^2/3 G^1/3 T^2/3 , where d, T and G are respectively the dimension of the unknown parameter, the number of rounds and the number of breakpoints up to time T. The empirical performance of the algorithms is illustrated in simulated environments. △ Less

Submitted 23 March, 2020; originally announced March 2020.

arXiv:1909.09146 [pdf, other]

Weighted Linear Bandits for Non-Stationary Environments

Authors: Yoan Russac, Claire Vernade, Olivier Cappé

Abstract: We consider a stochastic linear bandit model in which the available actions correspond to arbitrary context vectors whose associated rewards follow a non-stationary linear regression model. In this setting, the unknown regression parameter is allowed to vary in time. To address this problem, we propose D-LinUCB, a novel optimistic algorithm based on discounted linear regression, where exponential… ▽ More We consider a stochastic linear bandit model in which the available actions correspond to arbitrary context vectors whose associated rewards follow a non-stationary linear regression model. In this setting, the unknown regression parameter is allowed to vary in time. To address this problem, we propose D-LinUCB, a novel optimistic algorithm based on discounted linear regression, where exponential weights are used to smoothly forget the past. This involves studying the deviations of the sequential weighted least-squares estimator under generic assumptions. As a by-product, we obtain novel deviation results that can be used beyond non-stationary environments. We provide theoretical guarantees on the behavior of D-LinUCB in both slowly-varying and abruptly-changing environments. We obtain an upper bound on the dynamic regret that is of order d^{2/3} B\_T^{1/3}T^{2/3}, where B\_T is a measure of non-stationarity (d and T being, respectively, dimension and horizon). This rate is known to be optimal. We also illustrate the empirical performance of D-LinUCB and compare it with recently proposed alternatives in simulated environments. △ Less

Submitted 20 March, 2020; v1 submitted 19 September, 2019; originally announced September 2019.

Journal ref: NeurIPS 2019 - 33rd Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada

arXiv:1509.09130 [pdf, ps, other]

Learning From Missing Data Using Selection Bias in Movie Recommendation

Authors: Claire Vernade, Olivier Cappé

Abstract: Recommending items to users is a challenging task due to the large amount of missing information. In many cases, the data solely consist of ratings or tags voluntarily contributed by each user on a very limited subset of the available items, so that most of the data of potential interest is actually missing. Current approaches to recommendation usually assume that the unobserved data is missing at… ▽ More Recommending items to users is a challenging task due to the large amount of missing information. In many cases, the data solely consist of ratings or tags voluntarily contributed by each user on a very limited subset of the available items, so that most of the data of potential interest is actually missing. Current approaches to recommendation usually assume that the unobserved data is missing at random. In this contribution, we provide statistical evidence that existing movie recommendation datasets reveal a significant positive association between the rating of items and the propensity to select these items. We propose a computationally efficient variational approach that makes it possible to exploit this selection bias so as to improve the estimation of ratings from small populations of users. Results obtained with this approach applied to neighborhood-based collaborative filtering illustrate its potential for improving the reliability of the recommendation. △ Less

Submitted 30 September, 2015; originally announced September 2015.

arXiv:1407.4443 [pdf, other]

On the Complexity of Best Arm Identification in Multi-Armed Bandit Models

Authors: Emilie Kaufmann, Olivier Cappé, Aurélien Garivier

Abstract: The stochastic multi-armed bandit model is a simple abstraction that has proven useful in many different contexts in statistics and machine learning. Whereas the achievable limit in terms of regret minimization is now well known, our aim is to contribute to a better understanding of the performance in terms of identifying the m best arms. We introduce generic notions of complexity for the two domi… ▽ More The stochastic multi-armed bandit model is a simple abstraction that has proven useful in many different contexts in statistics and machine learning. Whereas the achievable limit in terms of regret minimization is now well known, our aim is to contribute to a better understanding of the performance in terms of identifying the m best arms. We introduce generic notions of complexity for the two dominant frameworks considered in the literature: fixed-budget and fixed-confidence settings. In the fixed-confidence setting, we provide the first known distribution-dependent lower bound on the complexity that involves information-theoretic quantities and holds when m is larger than 1 under general assumptions. In the specific case of two armed-bandits, we derive refined lower bounds in both the fixed-confidence and fixed-budget settings, along with matching algorithms for Gaussian and Bernoulli bandit models. These results show in particular that the complexity of the fixed-budget setting may be smaller than the complexity of the fixed-confidence setting, contradicting the familiar behavior observed when testing fully specified alternatives. In addition, we also provide improved sequential stopping rules that have guaranteed error probabilities and shorter average running times. The proofs rely on two technical results that are of independent interest : a deviation lemma for self-normalized sums (Lemma 19) and a novel change of measure inequality for bandit models (Lemma 1). △ Less

Submitted 14 November, 2016; v1 submitted 16 July, 2014; originally announced July 2014.

Comments: arXiv admin note: text overlap with arXiv:1405.3224

Journal ref: Journal of Machine Learning Research, Journal of Machine Learning Research, 2016, 17, pp.1-42

arXiv:1405.3224 [pdf, other]

On the Complexity of A/B Testing

Authors: Emilie Kaufmann, Olivier Cappé, Aurélien Garivier

Abstract: A/B testing refers to the task of determining the best option among two alternatives that yield random outcomes. We provide distribution-dependent lower bounds for the performance of A/B testing that improve over the results currently available both in the fixed-confidence (or delta-PAC) and fixed-budget settings. When the distribution of the outcomes are Gaussian, we prove that the complexity… ▽ More A/B testing refers to the task of determining the best option among two alternatives that yield random outcomes. We provide distribution-dependent lower bounds for the performance of A/B testing that improve over the results currently available both in the fixed-confidence (or delta-PAC) and fixed-budget settings. When the distribution of the outcomes are Gaussian, we prove that the complexity of the fixed-confidence and fixed-budget settings are equivalent, and that uniform sampling of both alternatives is optimal only in the case of equal variances. In the common variance case, we also provide a stopping rule that terminates faster than existing fixed-confidence algorithms. In the case of Bernoulli distributions, we show that the complexity of fixed-budget setting is smaller than that of fixed-confidence setting and that uniform sampling of both alternatives -though not optimal- is advisable in practice when combined with an appropriate stopping criterion. △ Less

Submitted 24 February, 2015; v1 submitted 13 May, 2014; originally announced May 2014.

Journal ref: Conference on Learning Theory, Jun 2014, Barcelona, Spain. JMLR: Workshop and Conference Proceedings, 35, pp.461-481

arXiv:1210.2601 [pdf, ps, other]

doi 10.3150/13-BEJ578

Adaptive MCMC with online relabeling

Authors: Rémi Bardenet, Olivier Cappé, Gersende Fort, Balázs Kégl

Abstract: When targeting a distribution that is artificially invariant under some permutations, Markov chain Monte Carlo (MCMC) algorithms face the label-switching problem, rendering marginal inference particularly cumbersome. Such a situation arises, for example, in the Bayesian analysis of finite mixture models. Adaptive MCMC algorithms such as adaptive Metropolis (AM), which self-calibrates its proposal… ▽ More When targeting a distribution that is artificially invariant under some permutations, Markov chain Monte Carlo (MCMC) algorithms face the label-switching problem, rendering marginal inference particularly cumbersome. Such a situation arises, for example, in the Bayesian analysis of finite mixture models. Adaptive MCMC algorithms such as adaptive Metropolis (AM), which self-calibrates its proposal distribution using an online estimate of the covariance matrix of the target, are no exception. To address the label-switching issue, relabeling algorithms associate a permutation to each MCMC sample, trying to obtain reasonable marginals. In the case of adaptive Metropolis (Bernoulli 7 (2001) 223-242), an online relabeling strategy is required. This paper is devoted to the AMOR algorithm, a provably consistent variant of AM that can cope with the label-switching problem. The idea is to nest relabeling steps within the MCMC algorithm based on the estimation of a single covariance matrix that is used both for adapting the covariance of the proposal distribution in the Metropolis algorithm step and for online relabeling. We compare the behavior of AMOR to similar relabeling methods. In the case of compactly supported target distributions, we prove a strong law of large numbers for AMOR and its ergodicity. These are the first results on the consistency of an online relabeling algorithm to our knowledge. The proof underlines latent relations between relabeling and vector quantization. △ Less

Submitted 27 July, 2015; v1 submitted 9 October, 2012; originally announced October 2012.

Comments: Published at http://dx.doi.org/10.3150/13-BEJ578 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ578

Journal ref: Bernoulli 2015, Vol. 21, No. 3, 1304-1340

arXiv:1102.1796 [pdf, ps, other]

Robust Retrospective Multiple Change-point Estimation for Multivariate Data

Authors: Alexandre Lung-Yut-Fong, Céline Lévy-Leduc, Olivier Cappé

Abstract: We propose a non-parametric statistical procedure for detecting multiple change-points in multidimensional signals. The method is based on a test statistic that generalizes the well-known Kruskal-Wallis procedure to the multivariate setting. The proposed approach does not require any knowledge about the distribution of the observations and is parameter-free. It is computationally efficient thanks… ▽ More We propose a non-parametric statistical procedure for detecting multiple change-points in multidimensional signals. The method is based on a test statistic that generalizes the well-known Kruskal-Wallis procedure to the multivariate setting. The proposed approach does not require any knowledge about the distribution of the observations and is parameter-free. It is computationally efficient thanks to the use of dynamic programming and can also be applied when the number of change-points is unknown. The method is shown through simulations to be more robust than alternatives, particularly when faced with atypical distributions (e.g., with outliers), high noise levels and/or high-dimensional data. △ Less

Submitted 10 February, 2011; v1 submitted 9 February, 2011; originally announced February 2011.

Comments: submitted to IEEE Workshop on Statistical Signal Processing 2011

arXiv:1011.1745 [pdf, ps, other]

Online Expectation-Maximisation

Authors: Olivier Cappé

Abstract: Tutorial chapter on the Online EM algorithm to appear in the volume 'Mixtures' edited by Kerrie Mengersen, Mike Titterington and Christian P. Robert. Tutorial chapter on the Online EM algorithm to appear in the volume 'Mixtures' edited by Kerrie Mengersen, Mike Titterington and Christian P. Robert. △ Less

Submitted 8 November, 2010; originally announced November 2010.

arXiv:1004.5229 [pdf, ps, other]

doi 10.1109/ALLERTON.2010.5706896

Optimism in Reinforcement Learning and Kullback-Leibler Divergence

Authors: Sarah Filippi, Olivier Cappé, Aurélien Garivier

Abstract: We consider model-based reinforcement learning in finite Markov De- cision Processes (MDPs), focussing on so-called optimistic strategies. In MDPs, optimism can be implemented by carrying out extended value it- erations under a constraint of consistency with the estimated model tran- sition probabilities. The UCRL2 algorithm by Auer, Jaksch and Ortner (2009), which follows this strategy, has recen… ▽ More We consider model-based reinforcement learning in finite Markov De- cision Processes (MDPs), focussing on so-called optimistic strategies. In MDPs, optimism can be implemented by carrying out extended value it- erations under a constraint of consistency with the estimated model tran- sition probabilities. The UCRL2 algorithm by Auer, Jaksch and Ortner (2009), which follows this strategy, has recently been shown to guarantee near-optimal regret bounds. In this paper, we strongly argue in favor of using the Kullback-Leibler (KL) divergence for this purpose. By studying the linear maximization problem under KL constraints, we provide an ef- ficient algorithm, termed KL-UCRL, for solving KL-optimistic extended value iteration. Using recent deviation bounds on the KL divergence, we prove that KL-UCRL provides the same guarantees as UCRL2 in terms of regret. However, numerical experiments on classical benchmarks show a significantly improved behavior, particularly when the MDP has reduced connectivity. To support this observation, we provide elements of com- parison between the two algorithms based on geometric considerations. △ Less

Submitted 13 October, 2010; v1 submitted 29 April, 2010; originally announced April 2010.

Comments: This work has been accepted and presented at ALLERTON 2010; Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on, Monticello (Illinois) : États-Unis (2010)

arXiv:0909.5524 [pdf, ps, other]

doi 10.1007/s11222-011-9240-5

Distributed detection/localization of change-points in high-dimensional network traffic data

Authors: Alexandre Lung-Yut-Fong, Céline Lévy-Leduc, Olivier Cappé

Abstract: We propose a novel approach for distributed statistical detection of change-points in high-volume network traffic. We consider more specifically the task of detecting and identifying the targets of Distributed Denial of Service (DDoS) attacks. The proposed algorithm, called DTopRank, performs distributed network anomaly detection by aggregating the partial information gathered in a set of network… ▽ More We propose a novel approach for distributed statistical detection of change-points in high-volume network traffic. We consider more specifically the task of detecting and identifying the targets of Distributed Denial of Service (DDoS) attacks. The proposed algorithm, called DTopRank, performs distributed network anomaly detection by aggregating the partial information gathered in a set of network monitors. In order to address massive data while limiting the communication overhead within the network, the approach combines record filtering at the monitor level and a nonparametric rank test for doubly censored time series at the central decision site. The performance of the DTopRank algorithm is illustrated both on synthetic data as well as from a traffic trace provided by a major Internet service provider. △ Less

Submitted 20 September, 2011; v1 submitted 30 September, 2009; originally announced September 2009.

Comments: Statistics and Computing (2011) 1-12

arXiv:0908.2359 [pdf, ps, other]

Online EM Algorithm for Hidden Markov Models

Authors: Olivier Cappé

Abstract: Online (also called "recursive" or "adaptive") estimation of fixed model parameters in hidden Markov models is a topic of much interest in times series modelling. In this work, we propose an online parameter estimation algorithm that combines two key ideas. The first one, which is deeply rooted in the Expectation-Maximization (EM) methodology consists in reparameterizing the problem using complete… ▽ More Online (also called "recursive" or "adaptive") estimation of fixed model parameters in hidden Markov models is a topic of much interest in times series modelling. In this work, we propose an online parameter estimation algorithm that combines two key ideas. The first one, which is deeply rooted in the Expectation-Maximization (EM) methodology consists in reparameterizing the problem using complete-data sufficient statistics. The second ingredient consists in exploiting a purely recursive form of smoothing in HMMs based on an auxiliary recursion. Although the proposed online EM algorithm resembles a classical stochastic approximation (or Robbins-Monro) algorithm, it is sufficiently different to resist conventional analysis of convergence. We thus provide limited results which identify the potential limiting points of the recursion as well as the large-sample behavior of the quantities involved in the algorithm. The performance of the proposed algorithm is numerically evaluated through simulations in the case of a noisily observed Markov chain. In this case, the algorithm reaches estimation results that are comparable to that of the maximum likelihood estimator for large sample sizes. △ Less

Submitted 15 February, 2011; v1 submitted 17 August, 2009; originally announced August 2009.

Comments: Revised version, to appear in J. Comput. Graph. Statist

arXiv:0908.0319 [pdf, ps, other]

Regret Bounds for Opportunistic Channel Access

Authors: Sarah Filippi, Olivier Cappé, Aurélien Garivier

Abstract: We consider the task of opportunistic channel access in a primary system composed of independent Gilbert-Elliot channels where the secondary (or opportunistic) user does not dispose of a priori information regarding the statistical characteristics of the system. It is shown that this problem may be cast into the framework of model-based learning in a specific class of Partially Observed Markov D… ▽ More We consider the task of opportunistic channel access in a primary system composed of independent Gilbert-Elliot channels where the secondary (or opportunistic) user does not dispose of a priori information regarding the statistical characteristics of the system. It is shown that this problem may be cast into the framework of model-based learning in a specific class of Partially Observed Markov Decision Processes (POMDPs) for which we introduce an algorithm aimed at striking an optimal tradeoff between the exploration (or estimation) and exploitation requirements. We provide finite horizon regret bounds for this algorithm as well as a numerical evaluation of its performance in the single channel model as well as in the case of stochastically identical channels. △ Less

Submitted 3 August, 2009; originally announced August 2009.

arXiv:0903.0837 [pdf, ps, other]

doi 10.1103/PhysRevD.80.023507

Estimation of cosmological parameters using adaptive importance sampling

Authors: Darren Wraith, Martin Kilbinger, Karim Benabed, Olivier Cappé, Jean-François Cardoso, Gersende Fort, Simon Prunet, Christian P. Robert

Abstract: We present a Bayesian sampling algorithm called adaptive importance sampling or Population Monte Carlo (PMC), whose computational workload is easily parallelizable and thus has the potential to considerably reduce the wall-clock time required for sampling, along with providing other benefits. To assess the performance of the approach for cosmological problems, we use simulated and actual data co… ▽ More We present a Bayesian sampling algorithm called adaptive importance sampling or Population Monte Carlo (PMC), whose computational workload is easily parallelizable and thus has the potential to considerably reduce the wall-clock time required for sampling, along with providing other benefits. To assess the performance of the approach for cosmological problems, we use simulated and actual data consisting of CMB anisotropies, supernovae of type Ia, and weak cosmological lensing, and provide a comparison of results to those obtained using state-of-the-art Markov Chain Monte Carlo (MCMC). For both types of data sets, we find comparable parameter estimates for PMC and MCMC, with the advantage of a significantly lower computational time for PMC. In the case of WMAP5 data, for example, the wall-clock time reduces from several days for MCMC to a few hours using PMC on a cluster of processors. Other benefits of the PMC approach, along with potential difficulties in using the approach, are analysed and discussed. △ Less

Submitted 4 March, 2009; originally announced March 2009.

Comments: 17 pages, 11 figures

Journal ref: Phys.Rev.D80:023507,2009

arXiv:0712.4273 [pdf, ps, other]

doi 10.1111/j.1467-9868.2009.00698.x

Online EM Algorithm for Latent Data Models

Authors: Olivier Cappé, Eric Moulines

Abstract: In this contribution, we propose a generic online (also sometimes called adaptive or recursive) version of the Expectation-Maximisation (EM) algorithm applicable to latent variable models of independent observations. Compared to the algorithm of Titterington (1984), this approach is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete data… ▽ More In this contribution, we propose a generic online (also sometimes called adaptive or recursive) version of the Expectation-Maximisation (EM) algorithm applicable to latent variable models of independent observations. Compared to the algorithm of Titterington (1984), this approach is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete data distribution. The resulting algorithm is usually simpler and is shown to achieve convergence to the stationary points of the Kullback-Leibler divergence between the marginal distribution of the observation and the model distribution at the optimal rate, i.e., that of the maximum likelihood estimator. In addition, the proposed approach is also suitable for conditional (or regression) models, as illustrated in the case of the mixture of linear regressions model. △ Less

Submitted 1 March, 2017; v1 submitted 27 December, 2007; originally announced December 2007.

Comments: Version that includes the corrigendum published in volume 73, part 5 (2011), of the Journal of the Royal Statistical Society, Series B + the correction of a typo in Eqs. (32-33)

Journal ref: Journal of the Royal Statistical Society: Series B, Royal Statistical Society, 2009, 71 (3), pp.593-613

arXiv:0710.4242 [pdf, ps, other]

doi 10.1007/s11222-008-9059-x

Adaptive Importance Sampling in General Mixture Classes

Authors: Olivier Cappé, Randal Douc, Arnaud Guillin, Jean-Michel Marin, Christian P. Robert

Abstract: In this paper, we propose an adaptive algorithm that iteratively updates both the weights and component parameters of a mixture importance sampling density so as to optimise the importance sampling performances, as measured by an entropy criterion. The method is shown to be applicable to a wide class of importance sampling densities, which includes in particular mixtures of multivariate Student… ▽ More In this paper, we propose an adaptive algorithm that iteratively updates both the weights and component parameters of a mixture importance sampling density so as to optimise the importance sampling performances, as measured by an entropy criterion. The method is shown to be applicable to a wide class of importance sampling densities, which includes in particular mixtures of multivariate Student t distributions. The performances of the proposed scheme are studied on both artificial and real examples, highlighting in particular the benefit of a novel Rao-Blackwellisation device which can be easily incorporated in the updating scheme. △ Less

Submitted 30 May, 2008; v1 submitted 23 October, 2007; originally announced October 2007.

Comments: Removed misleading comment in Section 2

Journal ref: Statistics and Computing 18, 4 (2008) 447-459

Showing 1–21 of 21 results for author: Cappé, O