On the bias, risk and consistency of sample means in multi-armed bandits

Shin, Jaehyeok; Ramdas, Aaditya; Rinaldo, Alessandro

arXiv:1902.00746 (math)

[Submitted on 2 Feb 2019 (v1), last revised 30 Apr 2021 (this version, v3)]

Abstract: The sample mean is among the most well-studied estimators in statistics, having many desirable properties such as unbiasedness and consistency. However, when analyzing data collected using a multi-armed bandit (MAB) experiment, the sample mean is biased and much remains to be understood about its properties. For example, when is it consistent, how large is its bias, and can we bound its mean squared error? This paper delivers a thorough and systematic treatment of the bias, risk and consistency of MAB sample means. Specifically, we identify four distinct sources of selection bias (sampling, stopping, choosing and rewinding) and analyze them both separately and together. We further demonstrate that a new notion of "effective sample size" can be used to bound the risk of the sample mean under suitable loss functions. We present several carefully designed examples to provide intuition on the different sources of selection bias we study. Our treatment is nonparametric and algorithm-agnostic, meaning that it is not tied to a specific algorithm or goal. In a nutshell, our proofs combine variational representations of information-theoretic divergences with new martingale concentration inequalities.
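
As a rough illustration of the sampling bias the abstract mentions, the sketch below simulates a two-armed bandit and averages the final sample mean of one arm over many independent runs. Everything specific here is an assumption made for illustration and is not taken from the paper: the greedy sampling rule, the N(0, 1) rewards on both arms, the horizon, the number of runs, and the function names `greedy_run` and `estimate_bias`.

```python
import random


def greedy_run(horizon, rng):
    """Run one two-armed bandit with N(0, 1) rewards on both arms.

    Returns the final sample mean of arm 0.
    """
    sums = [0.0, 0.0]
    counts = [0, 0]
    for t in range(horizon):
        if t < 2:
            arm = t  # pull each arm once to initialize
        else:
            means = [sums[k] / counts[k] for k in range(2)]
            arm = 0 if means[0] >= means[1] else 1  # greedy choice (assumed rule)
        reward = rng.gauss(0.0, 1.0)  # true mean is 0 for both arms
        sums[arm] += reward
        counts[arm] += 1
    return sums[0] / counts[0]


def estimate_bias(horizon=10, n_runs=20000, seed=0):
    """Monte Carlo estimate of the bias of arm 0's sample mean."""
    rng = random.Random(seed)
    total = sum(greedy_run(horizon, rng) for _ in range(n_runs))
    # The true mean is 0, so the average sample mean estimates the bias.
    return total / n_runs


if __name__ == "__main__":
    print(f"adaptive sampling (horizon=10): {estimate_bias():.3f}")
    print(f"fixed design (horizon=2):       {estimate_bias(horizon=2):.3f}")
```

With `horizon=2` each arm is pulled exactly once, so no adaptive sampling takes place and the estimate should sit near zero. With a longer horizon the greedy rule stops pulling an arm after an unlucky draw, which in this toy setup drags the average sample mean below the true mean.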

Comments:	48 pages
Subjects:	Statistics Theory (math.ST); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1902.00746 [math.ST]
	(or arXiv:1902.00746v3 [math.ST] for this version)
	https://doi.org/10.48550/arXiv.1902.00746

Submission history

From: Jaehyeok Shin
[v1] Sat, 2 Feb 2019 16:23:08 UTC (812 KB)
[v2] Fri, 11 Oct 2019 07:11:31 UTC (591 KB)
[v3] Fri, 30 Apr 2021 03:41:45 UTC (58 KB)
