Showing 1–1 of 1 results for author: Wali, D

  1. arXiv:1905.09898  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Feedback graph regret bounds for Thompson Sampling and UCB

    Authors: Thodoris Lykouris, Eva Tardos, Drishti Wali

    Abstract: We study the stochastic multi-armed bandit problem with the graph-based feedback structure introduced by Mannor and Shamir. We analyze the performance of the two most prominent stochastic bandit algorithms, Thompson Sampling and Upper Confidence Bound (UCB), in the graph-based feedback setting. We show that these algorithms achieve regret guarantees that combine the graph structure and the gaps be…

    Submitted 14 February, 2020; v1 submitted 23 May, 2019; originally announced May 2019.

    Comments: Appeared in ALT 2020