Federated Bandit: A Gossiping Approach

Zhu, Zhaowei; Zhu, Jingxuan; Liu, Ji; Liu, Yang

doi:10.1145/3447380

arXiv:2010.12763 (cs)

[Submitted on 24 Oct 2020 (v1), last revised 7 Apr 2021 (this version, v2)]

Authors:Zhaowei Zhu, Jingxuan Zhu, Ji Liu, Yang Liu

Abstract: In this paper, we study \emph{Federated Bandit}, a decentralized Multi-Armed Bandit problem with a set of $N$ agents who can communicate their local data only with their neighbors in a connected graph $G$. Each agent makes a sequence of decisions, each selecting an arm from $M$ candidates, yet has access only to local and potentially biased feedback on the true reward of each action taken. Learning only locally will lead agents to sub-optimal actions, while converging to a no-regret strategy requires a collection of distributed data. Motivated by the proposal of federated learning, we aim for a solution in which agents never share their local observations with a central entity and are allowed to share only a private copy of their own information with their neighbors. We first propose a decentralized bandit algorithm, Gossip_UCB, which couples variants of the classical gossiping algorithm and the celebrated Upper Confidence Bound (UCB) bandit algorithm. We show that Gossip_UCB successfully adapts local bandit learning into a global gossiping process for sharing information among connected agents, and achieves guaranteed regret on the order of $O(\max\{ \texttt{poly}(N,M) \log T, \texttt{poly}(N,M)\log_{\lambda_2^{-1}} N\})$ for all $N$ agents, where $\lambda_2\in(0,1)$ is the second largest eigenvalue of the expected gossip matrix, which is a function of $G$. We then propose Fed_UCB, a differentially private version of Gossip_UCB, in which the agents preserve $\epsilon$-differential privacy of their local data while achieving $O(\max \{\frac{\texttt{poly}(N,M)}{\epsilon}\log^{2.5} T, \texttt{poly}(N,M) (\log_{\lambda_2^{-1}} N + \log T) \})$ regret.
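
To make the coupling of gossiping and UCB described in the abstract more concrete, here is a minimal Python sketch. It is not the authors' Gossip_UCB: the update rule, the confidence radius, and every name in it (`simulate`, `ucb_index`, `local_means`, `edges`) are illustrative assumptions. It also assumes that the true reward of an arm is the average of the agents' local means, that one randomly chosen edge gossips per round, and that rewards carry Gaussian noise. Each agent mixes its estimate with a neighbor's, then adds the change in its own local sample mean, so the network-wide sum of the gossiped estimates keeps tracking the sum of the local sample means.

```python
import math
import random


def ucb_index(mean, count, t):
    """Classical UCB1-style index; an unpulled arm gets +inf so it is tried first."""
    if count == 0:
        return math.inf
    return mean + math.sqrt(2.0 * math.log(t) / count)


def simulate(local_means, edges, horizon, seed=0, noise=0.1):
    """Toy gossip + UCB loop (illustrative assumption, not the paper's algorithm).

    local_means[a][k] is agent a's possibly biased mean reward for arm k.
    edges is a list of (i, j) pairs of a connected graph.
    Returns (counts, sample_mean, gossiped), each indexed as [agent][arm].
    """
    rng = random.Random(seed)
    n, m = len(local_means), len(local_means[0])
    counts = [[0] * m for _ in range(n)]
    sample_mean = [[0.0] * m for _ in range(n)]  # purely local estimates
    gossiped = [[0.0] * m for _ in range(n)]     # estimates mixed with neighbors

    for t in range(1, horizon + 1):
        # Gossip step: one random edge wakes up and its endpoints average.
        i, j = rng.choice(edges)
        for k in range(m):
            avg = (gossiped[i][k] + gossiped[j][k]) / 2.0
            gossiped[i][k] = gossiped[j][k] = avg

        # Bandit step: every agent pulls the arm with the largest index.
        for a in range(n):
            k = max(range(m),
                    key=lambda arm: ucb_index(gossiped[a][arm], counts[a][arm], t))
            reward = rng.gauss(local_means[a][k], noise)
            old = sample_mean[a][k]
            counts[a][k] += 1
            sample_mean[a][k] = old + (reward - old) / counts[a][k]
            # Inject the local change so the gossiped sum tracks the local sum.
            gossiped[a][k] += sample_mean[a][k] - old

    return counts, sample_mean, gossiped


# Hypothetical instance: 4 agents on a ring, 3 arms. Arm 1 has the highest
# average mean (0.6), yet agents 0 and 2 locally prefer arms 0 and 2.
local_means = [
    [0.9, 0.5, 0.1],
    [0.1, 0.6, 0.2],
    [0.2, 0.7, 0.9],
    [0.0, 0.6, 0.0],
]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
counts, sample_mean, gossiped = simulate(local_means, edges, horizon=2000)
for agent, pulls in enumerate(counts):
    print(f"agent {agent} pulls per arm: {pulls}")
```

The only property this sketch guarantees by construction is conservation: for each arm, the gossiped estimates summed over agents equal the local sample means summed over agents. How quickly individual agents approach the network average depends on the graph, which is the role $\lambda_2$ plays in the regret bound quoted in the abstract.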

Comments:	Accepted by ACM SIGMETRICS 2021
Subjects:	Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Cite as:	arXiv:2010.12763 [cs.LG]
	(or arXiv:2010.12763v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2010.12763
Journal reference:	Proc. ACM Meas. Anal. Comput. Syst., Vol. 5, No. 1, Article 2. Publication date: March 2021
Related DOI:	https://doi.org/10.1145/3447380

Submission history

From: Zhaowei Zhu
[v1] Sat, 24 Oct 2020 03:44:25 UTC (2,939 KB)
[v2] Wed, 7 Apr 2021 04:59:14 UTC (1,187 KB)
