Decentralized Hyper-Gradient Computation over Time-Varying Directed Networks

Terashita, Naoyuki; Hara, Satoshi

arXiv:2210.02129 (stat)

[Submitted on 5 Oct 2022 (v1), last revised 13 Jun 2023 (this version, v3)]

Abstract: This paper addresses the communication issues that arise when estimating hyper-gradients in decentralized federated learning (FL). Hyper-gradients in decentralized FL quantify how the performance of the globally shared optimal model is influenced by perturbations in clients' hyper-parameters. In prior work, clients trace this influence by communicating Hessian matrices over a static undirected network, which results in (i) excessive communication costs and (ii) an inability to use more efficient and robust networks, namely time-varying directed networks. To solve these issues, we introduce an alternative optimality condition for FL that uses an averaging operation on model parameters and gradients. We then employ Push-Sum, a consensus optimization technique for time-varying directed networks, as the averaging operation. As a result, the hyper-gradient estimator derived from our optimality condition enjoys two desirable properties: (i) it requires only Push-Sum communication of vectors, and (ii) it can operate over time-varying directed networks. We confirm the convergence of our estimator to the true hyper-gradient both theoretically and empirically, and we further demonstrate that it enables two novel applications: decentralized influence estimation and personalization over time-varying networks.
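
For readers unfamiliar with the averaging operation named in the abstract, the following is a minimal sketch of the generic Push-Sum protocol on scalar values. It is not the paper's hyper-gradient estimator. The function names `push_sum_average` and `alternating_graphs`, the four-node topology, and the round count are all assumptions chosen for illustration. Each node keeps a numerator and a weight, splits both equally among itself and its current out-neighbors, and reports their ratio as its estimate of the network-wide average.

```python
def push_sum_average(values, out_neighbors_at, rounds):
    """Estimate the average of `values` at every node with Push-Sum.

    values: one float per node.
    out_neighbors_at: hypothetical helper, maps round t to a list whose
        i-th entry lists the out-neighbors of node i in that round.
    """
    n = len(values)
    x = list(values)   # running numerators
    w = [1.0] * n      # running weights (denominators)
    for t in range(rounds):
        out = out_neighbors_at(t)
        new_x = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            # A node always keeps one share for itself, so weights stay positive.
            targets = set(out[i]) | {i}
            share = 1.0 / len(targets)
            for j in targets:
                new_x[j] += share * x[i]
                new_w[j] += share * w[i]
        x, w = new_x, new_w
    # The ratio corrects the bias caused by unequal in/out degrees.
    return [xi / wi for xi, wi in zip(x, w)]


def alternating_graphs(t):
    # Assumed time-varying directed topology on 4 nodes (illustration only).
    if t % 2 == 0:
        return [[1, 2], [2], [3], [0]]
    return [[3], [0], [0, 1], [2]]


values = [1.0, 2.0, 3.0, 14.0]
estimates = push_sum_average(values, alternating_graphs, rounds=200)
```

In this toy setup every entry of `estimates` should be close to the true average of 5.0, even though the links are one-directional and change every round. Only the pairs of numerators and weights are exchanged, which mirrors the abstract's point that vector-valued Push-Sum messages suffice.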

Comments:	Under review
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2210.02129 [stat.ML]
	(or arXiv:2210.02129v3 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2210.02129

Submission history

From: Naoyuki Terashita
[v1] Wed, 5 Oct 2022 10:23:45 UTC (581 KB)
[v2] Tue, 31 Jan 2023 08:55:24 UTC (585 KB)
[v3] Tue, 13 Jun 2023 05:04:17 UTC (204 KB)