Contextual bandits with surrogate losses: Margin bounds and efficient algorithms

Foster, Dylan J.; Krishnamurthy, Akshay

arXiv:1806.10745v1 (cs)

[Submitted on 28 Jun 2018 (this version), latest version 4 Nov 2018 (v2)]

Abstract: We introduce a new family of margin-based regret guarantees for adversarial contextual bandit learning. Our results are based on multiclass surrogate losses. Using the ramp loss, we derive a universal margin-based regret bound in terms of the sequential metric entropy for a benchmark class of real-valued regression functions. The new margin bound serves as a complete contextual bandit analogue of the classical margin bound from statistical learning. The result applies to large nonparametric classes, improving on the best known results for Lipschitz contextual bandits (Cesa-Bianchi et al., 2017) and, as a special case, generalizes the dimension-independent Banditron regret bound (Kakade et al., 2008) to arbitrary linear classes with smooth norms.
On the algorithmic side, we use the hinge loss to derive an efficient algorithm with a $\sqrt{dT}$-type mistake bound against benchmark policies induced by $d$-dimensional regression functions. This provides the first hinge loss-based solution to the open problem of Abernethy and Rakhlin (2009). With an additional i.i.d. assumption we give a simple oracle-efficient algorithm whose regret matches our generic metric entropy-based bound for sufficiently complex nonparametric classes.
Under realizability assumptions our results also yield classical regret bounds.
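
For readers unfamiliar with the surrogate losses named in the abstract, the following is a minimal sketch of the textbook binary-margin forms of the hinge and ramp losses with margin parameter gamma. It is an illustration under stated assumptions, not the multiclass parametrization used in the paper: the function names, the `margin` argument (a signed score, positive when the prediction is correct), and the default `gamma` are all hypothetical choices made here.

```python
def zero_one_loss(margin: float) -> float:
    """Return 1.0 when the signed margin is non-positive (a mistake), else 0.0."""
    return 1.0 if margin <= 0 else 0.0


def hinge_loss(margin: float, gamma: float = 1.0) -> float:
    """Textbook hinge loss at scale gamma: max(0, 1 - margin / gamma).

    Convex and unbounded above as the margin becomes very negative.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return max(0.0, 1.0 - margin / gamma)


def ramp_loss(margin: float, gamma: float = 1.0) -> float:
    """Textbook ramp loss at scale gamma: the hinge loss clipped to [0, 1].

    Non-convex, but bounded, and equal to 0 once margin >= gamma.
    """
    return min(1.0, hinge_loss(margin, gamma))


if __name__ == "__main__":
    # Illustrative margins only; these are not values from the paper.
    for m in (-1.0, 0.0, 0.5, 1.0, 2.0):
        print(m, zero_one_loss(m), ramp_loss(m), hinge_loss(m))
```

In this form both surrogates upper-bound the zero-one loss, and the ramp loss never exceeds the hinge loss. That ordering is why the ramp loss suits a margin-type bound while the convex hinge loss is the more convenient of the two for an efficient algorithm.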

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1806.10745 [cs.LG]
	(or arXiv:1806.10745v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1806.10745

Submission history

From: Dylan Foster
[v1] Thu, 28 Jun 2018 02:50:38 UTC (121 KB)
[v2] Sun, 4 Nov 2018 12:50:58 UTC (151 KB)
