Skip to main content

Showing 1–12 of 12 results for author: Iacob, A

Searching in archive cs. Search in all archives.
.
  1. arXiv:2505.22549  [pdf, other

    cs.LG

    DES-LOC: Desynced Low Communication Adaptive Optimizers for Training Foundation Models

    Authors: Alex Iacob, Lorenzo Sani, Mher Safaryan, Paris Giampouras, Samuel Horváth, Andrej Jovanovic, Meghdad Kurmanji, Preslav Aleksandrov, William F. Shen, Xinchi Qiu, Nicholas D. Lane

    Abstract: Scaling foundation model training with Distributed Data Parallel (DDP) methods is bandwidth-limited. Existing infrequent communication methods like Local SGD were designed to synchronize only model parameters and cannot be trivially applied to adaptive optimizers due to additional optimizer states. Current approaches extending Local SGD either lack convergence guarantees or require synchronizing a… ▽ More

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: Keywords: Distributed Training, Foundation Models, Large Language Models, Optimizers, Communication Efficiency, Federated Learning, Distributed Systems, Optimization Theory, Scaling, Robustness. Preprint, under review at NeurIPS

  2. arXiv:2504.05153  [pdf, other

    cs.LG

    SparsyFed: Sparse Adaptive Federated Training

    Authors: Adriano Guastella, Lorenzo Sani, Alex Iacob, Alessio Mora, Paolo Bellavista, Nicholas D. Lane

    Abstract: Sparse training is often adopted in cross-device federated learning (FL) environments where constrained devices collaboratively train a machine learning model on private data by exchanging pseudo-gradients across heterogeneous networks. Although sparse training methods can reduce communication overhead and computational burden in FL, they are often not used in practice for the following key reason… ▽ More

    Submitted 7 April, 2025; originally announced April 2025.

    Comments: Published as a conference paper at ICLR 2025

  3. arXiv:2502.07218  [pdf, other

    cs.LG cs.AI

    LUNAR: LLM Unlearning via Neural Activation Redirection

    Authors: William F. Shen, Xinchi Qiu, Meghdad Kurmanji, Alex Iacob, Lorenzo Sani, Yihong Chen, Nicola Cancedda, Nicholas D. Lane

    Abstract: Large Language Models (LLMs) benefit from training on ever larger amounts of textual data, but as a result, they increasingly incur the risk of leaking private information. The ability to selectively remove knowledge from LLMs is, therefore, a highly desirable capability. In this paper, we propose LUNAR, a novel unlearning methodology grounded in the Linear Representation Hypothesis. LUNAR operate… ▽ More

    Submitted 10 February, 2025; originally announced February 2025.

  4. arXiv:2411.02908  [pdf, other

    cs.LG cs.DC

    Photon: Federated LLM Pre-Training

    Authors: Lorenzo Sani, Alex Iacob, Zeyu Cao, Royson Lee, Bill Marino, Yan Gao, Dongqi Cai, Zexi Li, Wanru Zhao, Xinchi Qiu, Nicholas D. Lane

    Abstract: Scaling large language models (LLMs) demands extensive data and computing resources, which are traditionally constrained to data centers by the high-bandwidth requirements of distributed training. Low-bandwidth methods like federated learning (FL) could enable collaborative training of larger models across weakly-connected GPUs if they can effectively be used for pre-training. To achieve this, we… ▽ More

    Submitted 5 November, 2024; originally announced November 2024.

    Comments: 13 pages, 9 appendix pages, 10 figures, 3 algorithms, 8 tables

  5. arXiv:2410.05021  [pdf, other

    cs.LG cs.CL

    DEPT: Decoupled Embeddings for Pre-training Language Models

    Authors: Alex Iacob, Lorenzo Sani, Meghdad Kurmanji, William F. Shen, Xinchi Qiu, Dongqi Cai, Yan Gao, Nicholas D. Lane

    Abstract: Language Model pre-training uses broad data mixtures to enhance performance across domains and languages. However, training on such heterogeneous text corpora requires extensive and expensive efforts. Since these data sources vary significantly in lexical, syntactic, and semantic aspects, they cause negative interference or the ``curse of multilinguality''. To address these challenges we propose a… ▽ More

    Submitted 7 April, 2025; v1 submitted 7 October, 2024; originally announced October 2024.

    Comments: Published as a conference paper at ICLR 2025

  6. arXiv:2405.14446  [pdf, other

    cs.LG cs.AI cs.CL cs.DC

    Worldwide Federated Training of Language Models

    Authors: Alex Iacob, Lorenzo Sani, Bill Marino, Preslav Aleksandrov, William F. Shen, Nicholas Donald Lane

    Abstract: The reliance of language model training on massive amounts of computation and vast datasets scraped from potentially low-quality, copyrighted, or sensitive data has come into question practically, legally, and ethically. Federated learning provides a plausible alternative by enabling previously untapped data to be voluntarily gathered from collaborating organizations. However, when scaled globally… ▽ More

    Submitted 27 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 19 pages, 8 figures, Under Review

    ACM Class: I.2.7

  7. arXiv:2405.10853  [pdf, other

    cs.LG cs.AI cs.DC

    The Future of Large Language Model Pre-training is Federated

    Authors: Lorenzo Sani, Alex Iacob, Zeyu Cao, Bill Marino, Yan Gao, Tomas Paulik, Wanru Zhao, William F. Shen, Preslav Aleksandrov, Xinchi Qiu, Nicholas D. Lane

    Abstract: Generative pre-trained large language models (LLMs) have demonstrated impressive performance over a wide range of tasks, thanks to the unprecedented amount of data they have been trained on. As established scaling laws indicate, LLMs' future performance improvement depends on the amount of computing and data sources they can leverage for pre-training. Federated learning (FL) has the potential to u… ▽ More

    Submitted 14 October, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 24 pages, 15 figures, pre-print

  8. arXiv:2402.10191  [pdf, other

    cs.LG

    FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients

    Authors: Xinchi Qiu, Yan Gao, Lorenzo Sani, Heng Pan, Wanru Zhao, Pedro P. B. Gusmao, Mina Alibeigi, Alex Iacob, Nicholas D. Lane

    Abstract: Federated learning (FL) is a distributed learning paradigm that facilitates collaborative training of a shared global model across devices while keeping data localized. The deployment of FL in numerous real-world applications faces delays, primarily due to the prevalent reliance on supervised tasks. Generating detailed labels at edge devices, if feasible, is demanding, given resource constraints a… ▽ More

    Submitted 15 February, 2024; originally announced February 2024.

  9. arXiv:2306.17453  [pdf, other

    cs.DC

    Pollen: High-throughput Federated Learning Simulation via Resource-Aware Client Placement

    Authors: Lorenzo Sani, Pedro Porto Buarque de Gusmão, Alex Iacob, Wanru Zhao, Xinchi Qiu, Yan Gao, Javier Fernandez-Marques, Nicholas Donald Lane

    Abstract: Federated Learning (FL) is a privacy-focused machine learning paradigm that collaboratively trains models directly on edge devices. Simulation plays an essential role in FL adoption, helping develop novel aggregation and client sampling strategies. However, current simulators cannot emulate large-scale systems in a time-efficient manner, which limits their utility and casts doubts on generalizabil… ▽ More

    Submitted 20 May, 2024; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: 22 pages, 22 figures, 9 tables, under review

  10. arXiv:2305.12134  [pdf, other

    cs.LG cs.AI

    Privacy in Multimodal Federated Human Activity Recognition

    Authors: Alex Iacob, Pedro P. B. Gusmão, Nicholas D. Lane, Armand K. Koupai, Mohammud J. Bocus, Raúl Santos-Rodríguez, Robert J. Piechocki, Ryan McConville

    Abstract: Human Activity Recognition (HAR) training data is often privacy-sensitive or held by non-cooperative entities. Federated Learning (FL) addresses such concerns by training ML models on edge clients. This work studies the impact of privacy in federated HAR at a user, environment, and sensor level. We show that the performance of FL for HAR depends on the assumed privacy level of the FL system and pr… ▽ More

    Submitted 2 June, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: In 3rd On-Device Intelligence Workshop at MLSys 2023, 8 pages

  11. Can Fair Federated Learning reduce the need for Personalisation?

    Authors: Alex Iacob, Pedro P. B. Gusmão, Nicholas D. Lane

    Abstract: Federated Learning (FL) enables training ML models on edge clients without sharing data. However, the federated model's performance on local data varies, disincentivising the participation of clients who benefit little from FL. Fair FL reduces accuracy disparity by focusing on clients with higher losses while personalisation locally fine-tunes the model. Personalisation provides a participation in… ▽ More

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: In 3rd Workshop on Machine Learning and Systems (EuroMLSys 2023), 9 pages

  12. arXiv:2201.05231  [pdf, other

    cs.LG cs.SI

    Contextual Bandits for Advertising Campaigns: A Diffusion-Model Independent Approach (Extended Version)

    Authors: Alexandra Iacob, Bogdan Cautis, Silviu Maniu

    Abstract: Motivated by scenarios of information diffusion and advertising in social media, we study an influence maximization problem in which little is assumed to be known about the diffusion network or about the model that determines how information may propagate. In such a highly uncertain environment, one can focus on multi-round diffusion campaigns, with the objective to maximize the number of distinct… ▽ More

    Submitted 13 January, 2022; originally announced January 2022.

    Comments: Extended version of conference article in SIAM International Conference on Data Mining (SDM) 2022. 14 pages, 2 figures, 4 tables