Showing 1–19 of 19 results for author: Stephan, J

Searching in archive cs.
  1. arXiv:2505.24802  [pdf, ps, other]

    cs.LG

    ByzFL: Research Framework for Robust Federated Learning

    Authors: Marc González, Rachid Guerraoui, Rafael Pinot, Geovani Rizk, John Stephan, François Taïani

    Abstract: We present ByzFL, an open-source Python library for developing and benchmarking robust federated learning (FL) algorithms. ByzFL provides a unified and extensible framework that includes implementations of state-of-the-art robust aggregators, a suite of configurable attacks, and tools for simulating a variety of FL scenarios, including heterogeneous data distributions, multiple training algorithms…

    Submitted 30 May, 2025; originally announced May 2025.

  2. arXiv:2505.01874  [pdf, ps, other]

    cs.LG cs.CR cs.DC

    Towards Trustworthy Federated Learning with Untrusted Participants

    Authors: Youssef Allouah, Rachid Guerraoui, John Stephan

    Abstract: Resilience against malicious participants and data privacy are essential for trustworthy federated learning, yet achieving both with good utility typically requires the strong assumption of a trusted central server. This paper shows that a significantly weaker assumption suffices: each pair of participants shares a randomness seed unknown to others. In a setting where malicious participants may co…

    Submitted 4 June, 2025; v1 submitted 3 May, 2025; originally announced May 2025.

    Comments: ICML 2025 conference paper

  3. arXiv:2501.03126  [pdf, other]

    cs.DC

    CrowdProve: Community Proving for ZK Rollups

    Authors: John Stephan, Matej Pavlovic, Antonio Locascio, Benjamin Livshits

    Abstract: Zero-Knowledge (ZK) rollups have become a popular solution for scaling blockchain systems, offering improved transaction throughput and reduced costs by aggregating Layer 2 transactions and submitting them as a single batch to a Layer 1 blockchain. However, the computational burden of generating validity proofs, a key feature of ZK rollups, presents significant challenges in terms of performance a…

    Submitted 6 January, 2025; originally announced January 2025.

  4. arXiv:2405.14670  [pdf, other]

    cs.LG

    Overcoming the Challenges of Batch Normalization in Federated Learning

    Authors: Rachid Guerraoui, Rafael Pinot, Geovani Rizk, John Stephan, François Taiani

    Abstract: Batch normalization has proven to be a very beneficial mechanism to accelerate the training and improve the accuracy of deep neural networks in centralized environments. Yet, the scheme faces significant challenges in federated learning, especially under high data heterogeneity. Essentially, the main challenges arise from external covariate shifts and inconsistent statistics across clients. We int…

    Submitted 23 May, 2024; originally announced May 2024.

  5. arXiv:2405.14432  [pdf, other]

    cs.LG

    Adaptive Gradient Clipping for Robust Federated Learning

    Authors: Youssef Allouah, Rachid Guerraoui, Nirupam Gupta, Ahmed Jellouli, Geovani Rizk, John Stephan

    Abstract: Robust federated learning aims to maintain reliable performance despite the presence of adversarial or misbehaving workers. While state-of-the-art (SOTA) robust distributed gradient descent (Robust-DGD) methods were proven theoretically optimal, their empirical success has often relied on pre-aggregation gradient clipping. However, existing static clipping strategies yield inconsistent results: en…

    Submitted 9 May, 2025; v1 submitted 23 May, 2024; originally announced May 2024.

  6. arXiv:2312.15282  [pdf, other]

    stat.ML cs.LG

    Causal Forecasting for Pricing

    Authors: Douglas Schultz, Johannes Stephan, Julian Sieber, Trudie Yeh, Manuel Kunz, Patrick Doupe, Tim Januschowski

    Abstract: This paper proposes a novel method for demand forecasting in a pricing context. Here, modeling the causal relationship between price as an input variable to demand is crucial because retailers aim to set prices in a (profit) optimal manner in a downstream decision making problem. Our methods bring together the Double Machine Learning methodology for causal inference and state-of-the-art transforme…

    Submitted 30 January, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

  7. arXiv:2312.14712  [pdf, other]

    cs.LG cs.CR cs.DC

    Robustness, Efficiency, or Privacy: Pick Two in Machine Learning

    Authors: Youssef Allouah, Rachid Guerraoui, John Stephan

    Abstract: The success of machine learning (ML) applications relies on vast datasets and distributed architectures which, as they grow, present major challenges. In real-world scenarios, where data often contains sensitive information, issues like data poisoning and hardware failures are common. Ensuring privacy and robustness is vital for the broad adoption of ML in public life. This paper examines the cost…

    Submitted 11 March, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  8. arXiv:2309.05395  [pdf, other]

    cs.LG cs.CR cs.DC

    SABLE: Secure And Byzantine robust LEarning

    Authors: Antoine Choffrut, Rachid Guerraoui, Rafael Pinot, Renaud Sirdey, John Stephan, Martin Zuber

    Abstract: Due to the widespread availability of data, machine learning (ML) algorithms are increasingly being implemented in distributed topologies, wherein various nodes collaborate to train ML models via the coordination of a central server. However, distributed learning approaches face significant vulnerabilities, primarily stemming from two potential threats. Firstly, the presence of Byzantine nodes pos…

    Submitted 14 December, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

  9. arXiv:2305.14406  [pdf, other]

    cs.LG cs.AI

    Deep Learning based Forecasting: a case study from the online fashion industry

    Authors: Manuel Kunz, Stefan Birr, Mones Raslan, Lei Ma, Zhen Li, Adele Gouttes, Mateusz Koren, Tofigh Naghibi, Johannes Stephan, Mariia Bulycheva, Matthias Grzeschik, Armin Kekić, Michael Narodovitch, Kashif Rasul, Julian Sieber, Tim Januschowski

    Abstract: Demand forecasting in the online fashion industry is particularly amenable to global, data-driven forecasting models because of the industry's set of particular challenges. These include the volume of data, the irregularity, the high amount of turn-over in the catalog and the fixed inventory assumption. While standard deep learning forecasting approaches cater for many of these, the fixed invento…

    Submitted 23 May, 2023; originally announced May 2023.

  10. arXiv:2302.04787  [pdf, other]

    cs.LG cs.CR cs.DC

    On the Privacy-Robustness-Utility Trilemma in Distributed Learning

    Authors: Youssef Allouah, Rachid Guerraoui, Nirupam Gupta, Rafael Pinot, John Stephan

    Abstract: The ubiquity of distributed machine learning (ML) in sensitive public domain applications calls for algorithms that protect data privacy, while being robust to faults and adversarial behaviors. Although privacy and robustness have been extensively studied independently in distributed ML, their synthesis remains poorly understood. We present the first tight analysis of the error incurred by any alg…

    Submitted 29 May, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted paper at ICML

  11. arXiv:2302.01772  [pdf, other]

    cs.LG cs.DC

    Fixing by Mixing: A Recipe for Optimal Byzantine ML under Heterogeneity

    Authors: Youssef Allouah, Sadegh Farhadkhani, Rachid Guerraoui, Nirupam Gupta, Rafael Pinot, John Stephan

    Abstract: Byzantine machine learning (ML) aims to ensure the resilience of distributed learning algorithms to misbehaving (or Byzantine) machines. Although this problem received significant attention, prior works often assume the data held by the machines to be homogeneous, which is seldom true in practical settings. Data heterogeneity makes Byzantine ML considerably more challenging, since a Byzantine mach…

    Submitted 3 February, 2023; originally announced February 2023.

    Comments: Accepted paper at AISTATS 2023

  12. arXiv:2209.15259  [pdf, ps, other]

    cs.LG cs.AI cs.CR

    On the Impossible Safety of Large AI Models

    Authors: El-Mahdi El-Mhamdi, Sadegh Farhadkhani, Rachid Guerraoui, Nirupam Gupta, Lê-Nguyên Hoang, Rafael Pinot, Sébastien Rouault, John Stephan

    Abstract: Large AI Models (LAIMs), of which large language models are the most prominent recent example, showcase some impressive performance. However, they have been empirically found to pose serious security issues. This paper systematizes our knowledge about the fundamental impossibility of building arbitrarily accurate and secure machine learning models. More precisely, we identify key challenging featur…

    Submitted 9 May, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: 40 pages

  13. arXiv:2209.10931  [pdf, other]

    cs.LG cs.DC

    Robust Collaborative Learning with Linear Gradient Overhead

    Authors: Sadegh Farhadkhani, Rachid Guerraoui, Nirupam Gupta, Lê Nguyên Hoang, Rafael Pinot, John Stephan

    Abstract: Collaborative learning algorithms, such as distributed SGD (or D-SGD), are prone to faulty machines that may deviate from their prescribed algorithm because of software or hardware bugs, poisoned data or malicious behaviors. While many solutions have been proposed to enhance the robustness of D-SGD to such machines, previous works either resort to strong assumptions (trusted server, homogeneous da…

    Submitted 3 June, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

    Comments: Accepted paper at ICML 2023

  14. Application Experiences on a GPU-Accelerated Arm-based HPC Testbed

    Authors: Wael Elwasif, William Godoy, Nick Hagerty, J. Austin Harris, Oscar Hernandez, Balint Joo, Paul Kent, Damien Lebrun-Grandie, Elijah Maccarthy, Veronica G. Melesse Vergara, Bronson Messer, Ross Miller, Sarp Opal, Sergei Bastrakov, Michael Bussmann, Alexander Debus, Klaus Steinger, Jan Stephan, Rene Widera, Spencer H. Bryngelson, Henry Le Berre, Anand Radhakrishnan, Jefferey Young, Sunita Chandrasekaran, Florina Ciorba, et al. (6 additional authors not shown)

    Abstract: This paper assesses and reports the experience of ten teams working to port, validate, and benchmark several High Performance Computing applications on a novel GPU-accelerated Arm testbed system. The testbed consists of eight NVIDIA Arm HPC Developer Kit systems built by GIGABYTE, each one equipped with a server-class Arm CPU from Ampere Computing and A100 data center GPU from NVIDIA Corp. The syst…

    Submitted 19 December, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

    Journal ref: Proceedings of the HPC Asia 2023 Workshops, pg 35-49

  15. arXiv:2205.12173  [pdf, other]

    cs.LG cs.DC

    Byzantine Machine Learning Made Easy by Resilient Averaging of Momentums

    Authors: Sadegh Farhadkhani, Rachid Guerraoui, Nirupam Gupta, Rafael Pinot, John Stephan

    Abstract: Byzantine resilience emerged as a prominent topic within the distributed machine learning community. Essentially, the goal is to enhance distributed optimization algorithms, such as distributed SGD, in a way that guarantees convergence despite the presence of some misbehaving (a.k.a., Byzantine) workers. Although a myriad of techniques addressing the problem have been proposed, the field arg…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: Accepted at ICML 2022

  16. Challenges Porting a C++ Template-Metaprogramming Abstraction Layer to Directive-based Offloading

    Authors: Jeffrey Kelling, Sergei Bastrakov, Alexander Debus, Thomas Kluge, Matt Leinhauser, Richard Pausch, Klaus Steiniger, Jan Stephan, René Widera, Jeff Young, Michael Bussmann, Sunita Chandrasekaran, Guido Juckeland

    Abstract: HPC systems employ a growing variety of compute accelerators with different architectures and from different vendors. Large scientific applications are required to run efficiently across these systems but need to retain a single code-base in order to not stifle development. Directive-based offloading programming models set out to provide the required portability, but, to existing codes, they thems…

    Submitted 24 January, 2022; v1 submitted 16 October, 2021; originally announced October 2021.

    Comments: 20 pages, 1 figure, 3 tables, WACCPD@SC21

    ACM Class: D.1.3; D.2.1; D.3.3

  17. arXiv:2110.03991  [pdf, other]

    cs.LG cs.CR

    Combining Differential Privacy and Byzantine Resilience in Distributed SGD

    Authors: Rachid Guerraoui, Nirupam Gupta, Rafael Pinot, Sebastien Rouault, John Stephan

    Abstract: Privacy and Byzantine resilience (BR) are two crucial requirements of modern-day distributed machine learning. The two concepts have been extensively studied individually but the question of how to combine them effectively remains unanswered. This paper contributes to addressing this question by studying the extent to which the distributed SGD algorithm, in the standard parameter-server architectu…

    Submitted 5 October, 2023; v1 submitted 8 October, 2021; originally announced October 2021.

  18. arXiv:2107.03743  [pdf, other]

    cs.LG cs.AI

    Probabilistic Time Series Forecasting with Implicit Quantile Networks

    Authors: Adèle Gouttes, Kashif Rasul, Mateusz Koren, Johannes Stephan, Tofigh Naghibi

    Abstract: Here, we propose a general method for probabilistic time series forecasting. We combine an autoregressive recurrent neural network to model temporal dynamics with Implicit Quantile Networks to learn a large class of distributions over a time-series target. When compared to other probabilistic neural forecasting models on real- and simulated data, our approach is favorable in terms of point-wise pr…

    Submitted 8 July, 2021; originally announced July 2021.

    Comments: Accepted at the ICML 2021 Time Series Workshop

  19. arXiv:2102.08166  [pdf, other]

    cs.LG cs.CR cs.DC

    Differential Privacy and Byzantine Resilience in SGD: Do They Add Up?

    Authors: Rachid Guerraoui, Nirupam Gupta, Rafaël Pinot, Sébastien Rouault, John Stephan

    Abstract: This paper addresses the problem of combining Byzantine resilience with privacy in machine learning (ML). Specifically, we study if a distributed implementation of the renowned Stochastic Gradient Descent (SGD) learning algorithm is feasible with both differential privacy (DP) and $(α,f)$-Byzantine resilience. To the best of our knowledge, this is the first work to tackle this problem from a theor…

    Submitted 24 June, 2021; v1 submitted 16 February, 2021; originally announced February 2021.