Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data

Morafah, Mahdi; Reisser, Matthias; Lin, Bill; Louizos, Christos

arXiv:2405.07925 (cs)

[Submitted on 13 May 2024]

Abstract: The proliferation of edge devices has brought Federated Learning (FL) to the forefront as a promising paradigm for decentralized, collaborative model training that preserves the privacy of clients' data. However, FL suffers from a significant performance reduction and poor convergence when confronted with Non-Independent and Identically Distributed (Non-IID) data distributions among participating clients. While previous efforts, such as client drift mitigation and advanced server-side model fusion techniques, have shown some success in addressing this challenge, they often overlook the root cause of the performance reduction: clients lack local data that accurately mirrors the global data distribution. In this paper, we introduce Gen-FedSD, a novel approach that harnesses the capability of state-of-the-art text-to-image foundation models to bridge the significant Non-IID performance gaps in FL. In Gen-FedSD, each client constructs textual prompts for each class label and leverages an off-the-shelf, state-of-the-art pre-trained Stable Diffusion model to synthesize high-quality data samples. The generated synthetic data is tailored to each client's unique local data gaps and distribution disparities, effectively making the final augmented local data IID. Through extensive experimentation, we demonstrate that Gen-FedSD achieves state-of-the-art performance and significant communication cost savings across various datasets and Non-IID settings.
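
The abstract does not say how many synthetic samples each client generates per class or how the prompts are worded. As an illustration only, the following minimal sketch assumes a uniform target (every class is topped up to the client's largest local class count) and a hypothetical prompt template. The function names `synthetic_budget` and `build_prompts` are invented for this example and are not taken from the paper, and the Stable Diffusion generation call itself is omitted.

```python
from collections import Counter


def synthetic_budget(local_labels, all_classes):
    """Return, per class, how many synthetic samples would balance the client.

    Assumption (not from the paper): the target is a uniform distribution in
    which every class reaches the client's largest local class count.
    """
    counts = Counter(local_labels)
    target = max(counts.values(), default=0)
    return {c: target - counts.get(c, 0) for c in all_classes}


def build_prompts(budget, template="a photo of a {label}"):
    """Build one text prompt per class that still needs synthetic samples.

    The template is a hypothetical placeholder, not the authors' prompt.
    """
    return {
        label: template.format(label=label)
        for label, needed in budget.items()
        if needed > 0
    }


# A skewed client: many cats, few dogs, no ships.
client_labels = ["cat"] * 50 + ["dog"] * 5
classes = ["cat", "dog", "ship"]

budget = synthetic_budget(client_labels, classes)
prompts = build_prompts(budget)

print(budget)
print(prompts)
```

Each prompt would then be passed to a text-to-image model as many times as its class budget requires, so that the augmented local dataset has equal class counts under the uniform-target assumption.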

Comments:	International Workshop on Federated Foundation Models for the Web 2024 (FL@FM-TheWebConf'24)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:2405.07925 [cs.LG]
	(or arXiv:2405.07925v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2405.07925

Submission history

From: Mahdi Morafah
[v1] Mon, 13 May 2024 16:57:48 UTC (7,291 KB)