-
Tolerating Disasters with Hierarchical Consensus
Authors:
Wassim Yahyaoui,
Joachim Bruneau-Queyreix,
Jérémie Decouchant,
Marcus Völp
Abstract:
Geo-replication provides disaster recovery after catastrophic accidental failures or attacks, such as fires, blackouts or denial-of-service attacks to a data center or region. Naturally distributed data structures, such as Blockchains, when well designed, are immune against such disruptions, but they also benefit from leveraging locality. In this work, we consolidate the performance of geo-replicated consensus by leveraging novel insights about hierarchical consensus and a construction methodology that allows creating novel protocols from existing building blocks. In particular, we show that cluster confirmation, paired with subgroup rotation, allows protocols to safely operate through situations where all members of the global consensus group are Byzantine. We demonstrate our compositional construction by combining the recent HotStuff and Damysus protocols into a hierarchical geo-replicated blockchain with global durability guarantees. We present a compositionality proof and demonstrate the correctness of our protocol, including its ability to tolerate cluster crashes. Our protocol, ORION, achieves 20% higher throughput than GeoBFT, the latest hierarchical Byzantine Fault-Tolerant (BFT) protocol.
Submitted 30 April, 2025;
originally announced April 2025.
-
Taming Double-Spending in Offline Payments with Reputation-Weighted Loan Networks
Authors:
Nektarios Evangelou,
Rowdy Chotkan,
Bulat Nasrulin,
Jérémie Decouchant
Abstract:
Blockchain solutions typically assume a synchronous network to ensure consistency and achieve consensus. In contrast, offline transaction systems aim to enable users to agree on and execute transactions without assuming bounded communication delays when interacting with the blockchain. Most existing offline payment schemes depend on trusted hardware wallets that are assumed to be secure and tamper-proof. This work introduces Overdraft, a novel offline payment system that shifts the reliance from hardware to users themselves. Overdraft allows potential payment receivers to assess the likelihood of being paid, allowing them to accept transactions with confidence or deny them. Overdraft achieves this by maintaining a loan network that is weighted by online reputation. This loan network contains time-limited agreements where users pledge to cover another user's payment if necessary, for example when a payer lacks sufficient funds at the moment of commitment. Offline users rely on the last known view of the loan network -- which they had access to when last online -- to determine whether to participate in an offline transaction. This view is used to estimate the probability of eventual payment, possibly using multiple loans. Once online again, users commit their transactions to the blockchain, with any conflicts being resolved deterministically. Overdraft incorporates incentives for users and is designed to be resilient against Sybil attacks. As a proof of concept, we implemented Overdraft as an Ethereum Solidity smart contract and deployed it on the Sepolia testnet to evaluate its performance.
Submitted 7 April, 2025;
originally announced April 2025.
-
SoK: Microservice Architectures from a Dependability Perspective
Authors:
Dāvis Kažemaks,
Jérémie Decouchant
Abstract:
The microservice software architecture leverages the idea of splitting large monolithic applications into multiple smaller services that interact using lightweight communication schemes. While the microservice architecture has proven its ability to support modern business applications, it also introduces new possible weak points in a system. Some scientific literature surveys have already addressed fault tolerance or security concerns but most of them lack analysis on the fault and vulnerability coverage that is introduced by microservice architectures. We explore the known faults and vulnerabilities that microservice architecture might suffer from, and the recent scientific literature that addresses them. We emphasize runtime detection and recovery mechanisms instead of offline prevention and mitigation mechanisms to limit the scope of this document.
Submitted 5 March, 2025;
originally announced March 2025.
-
Optimizing Streamlined Blockchain Consensus with Generalized Weighted Voting and Enhanced Leader Rotation
Authors:
Diana Micloiu,
Rowdy Chotkan,
Jérémie Decouchant
Abstract:
Streamlined Byzantine Fault Tolerant (BFT) protocols, such as HotStuff [PODC'19], and weighted voting represent two possible strategies to improve consensus in the distributed systems world. Several studies have been conducted on both techniques, but the research on combining the two is scarce. To cover this knowledge gap, we introduce a weighted voting approach on HotStuff, along with two optimisations targeting weight assignment distribution and leader rotation in the underlying state replication protocol. Moreover, the weighted protocols developed rely on studies proving the effectiveness of a specific voting power assignment based on discrete values. We generalise this approach by presenting a novel continuous weighting scheme applied to the HotStuff protocol to highlight the effectiveness of this technique in faulty scenarios. We prove the significant latency reduction impact of weighted voting on streamlined protocols and advocate for further research.
Submitted 29 October, 2024;
originally announced October 2024.
-
Partition Detection in Byzantine Networks
Authors:
Yérom-David Bromberg,
Jérémie Decouchant,
Manon Sourisseau,
François Taïani
Abstract:
Detecting and handling network partitions is a fundamental requirement of distributed systems. Although existing partition detection methods in arbitrary graphs tolerate unreliable networks, they either assume that all nodes are correct or that a limited number of nodes might crash. In particular, Byzantine behaviors are out of the scope of these algorithms despite Byzantine fault tolerance being an active research topic for important problems such as consensus. Moreover, Byzantine-tolerant protocols, such as broadcast or consensus, always rely on the assumption of connected networks. This paper addresses the problem of detecting partitions in Byzantine networks (without connectivity assumption). We present a novel algorithm, which we call NECTAR, that safely detects partitioned and possibly partitionable networks and prove its correctness. NECTAR allows all correct nodes to detect whether a network could suffer from Byzantine nodes. We evaluate NECTAR's performance and compare it to two existing baselines using up to 100 nodes running real code, on various realistic topologies. Our results confirm that NECTAR maintains a 100% accuracy while the accuracy of the various existing baselines decreases by at least 40% as soon as one participant is Byzantine. Although NECTAR's network cost increases with the number of nodes and decreases with the network's diameter, it does not go above around 500 KB in the worst cases.
Submitted 27 August, 2024;
originally announced August 2024.
-
Reliable Communication in Hybrid Authentication and Trust Models
Authors:
Rowdy Chotkan,
Bart Cox,
Vincent Rahli,
Jérémie Decouchant
Abstract:
Reliable communication is a fundamental distributed communication abstraction that allows any two nodes of a network to communicate with each other. It is necessary for more powerful communication primitives, such as broadcast and consensus. Using different authentication models, two classical protocols implement reliable communication in unknown and sufficiently connected networks. In the first one, network links are authenticated, and processes rely on dissemination paths to authenticate messages. In the second one, processes generate digital signatures that are flooded in the network. This work considers the hybrid system model that combines authenticated links and authenticated processes. We additionally aim to leverage the possible presence of trusted nodes and trusted components in networks, which have been assumed in the scientific literature and in practice. We first extend the two classical reliable communication protocols to leverage trusted nodes. We then propose DualRC, a novel algorithm that enables reliable communication in the hybrid authentication model by manipulating both dissemination paths and digital signatures, and leverages the possible presence of trusted nodes (e.g., network gateways) and trusted components (e.g., Intel SGX enclaves). We provide correctness verification algorithms to assess whether our algorithms implement reliable communication for all nodes on a given network.
Submitted 15 August, 2024;
originally announced August 2024.
-
Training Diffusion Models with Federated Learning
Authors:
Matthijs de Goede,
Bart Cox,
Jérémie Decouchant
Abstract:
The training of diffusion-based models for image generation is predominantly controlled by a select few Big Tech companies, raising concerns about privacy, copyright, and data authority due to their lack of transparency regarding training data. To address this issue, we propose a federated diffusion model scheme that enables the independent and collaborative training of diffusion models without exposing local data. Our approach adapts the Federated Averaging (FedAvg) algorithm to train a Denoising Diffusion Model (DDPM). Through a novel utilization of the underlying UNet backbone, we achieve a significant reduction of up to 74% in the number of parameters exchanged during training, compared to the naive FedAvg approach, whilst simultaneously maintaining image quality comparable to the centralized setting, as evaluated by the FID score.
Submitted 18 June, 2024;
originally announced June 2024.
-
Parameterizing Federated Continual Learning for Reproducible Research
Authors:
Bart Cox,
Jeroen Galjaard,
Aditya Shankar,
Jérémie Decouchant,
Lydia Y. Chen
Abstract:
Federated Learning (FL) systems evolve in heterogeneous and ever-evolving environments that challenge their performance. Under real deployments, the learning tasks of clients can also evolve with time, which calls for the integration of methodologies such as Continual Learning. To enable research reproducibility, we propose a set of experimental best practices that precisely capture and emulate complex learning scenarios. Our framework, Freddie, is the first entirely configurable framework for Federated Continual Learning (FCL), and it can be seamlessly deployed on a large number of machines thanks to the use of Kubernetes and containerization. We demonstrate the effectiveness of Freddie on two use cases, (i) large-scale FL on CIFAR100 and (ii) heterogeneous task sequence on FCL, which highlight unaddressed performance challenges in FCL scenarios.
Submitted 4 June, 2024;
originally announced June 2024.
-
Asynchronous Multi-Server Federated Learning for Geo-Distributed Clients
Authors:
Yuncong Zuo,
Bart Cox,
Lydia Y. Chen,
Jérémie Decouchant
Abstract:
Federated learning (FL) systems enable multiple clients to train a machine learning model iteratively through synchronously exchanging the intermediate model weights with a single server. The scalability of such FL systems can be limited by two factors: server idle time due to synchronous communication and the risk of a single server becoming the bottleneck. In this paper, we propose a new FL architecture, to our knowledge, the first multi-server FL system that is entirely asynchronous, and therefore addresses these two limitations simultaneously. Our solution keeps both servers and clients continuously active. As in previous multi-server methods, clients interact solely with their nearest server, ensuring efficient update integration into the model. Differently, however, servers also periodically update each other asynchronously, and never postpone interactions with clients. We compare our solution to three representative baselines - FedAvg, FedAsync and HierFAVG - on the MNIST and CIFAR-10 image classification datasets and on the WikiText-2 language modeling dataset. Our solution converges to similar or higher accuracy levels than previous baselines and requires 61% less time to do so in geo-distributed settings.
Submitted 20 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Asynchronous Byzantine Federated Learning
Authors:
Bart Cox,
Abele Mălan,
Lydia Y. Chen,
Jérémie Decouchant
Abstract:
Federated learning (FL) enables a set of geographically distributed clients to collectively train a model through a server. Classically, the training process is synchronous, but can be made asynchronous to maintain its speed in the presence of slow clients and in heterogeneous networks. The vast majority of Byzantine fault-tolerant FL systems, however, rely on a synchronous training process. Our solution is one of the first Byzantine-resilient and asynchronous FL algorithms that does not require an auxiliary server dataset and is not delayed by stragglers, which are shortcomings of previous works. Intuitively, the server in our solution waits to receive a minimum number of updates from clients on its latest model to safely update it, and is later able to safely leverage the updates that late clients might send. We compare the performance of our solution with state-of-the-art algorithms on both image and text datasets under gradient inversion, perturbation, and backdoor attacks. Our results indicate that our solution trains a model faster than previous synchronous FL solutions, and maintains a higher accuracy, up to 1.54x and up to 1.75x for perturbation and gradient inversion attacks respectively, in the presence of Byzantine clients than previous asynchronous FL solutions.
Submitted 20 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Share Secrets for Privacy: Confidential Forecasting with Vertical Federated Learning
Authors:
Aditya Shankar,
Jérémie Decouchant,
Dimitra Gkorou,
Rihan Hai,
Lydia Y. Chen
Abstract:
Vertical federated learning (VFL) is a promising area for time series forecasting in many applications, such as healthcare and manufacturing. Critical challenges to address include data privacy and over-fitting on small and noisy datasets during both training and inference. Additionally, such forecasting models must scale well with the number of parties while ensuring strong convergence and low tuning complexity. We address these challenges and propose "Secret-shared Time Series Forecasting with VFL" (STV), a novel framework with the following key features: i) a privacy-preserving algorithm for forecasting with SARIMAX and autoregressive trees on vertically-partitioned data; ii) decentralised forecasting using secret sharing and multi-party computation; and iii) novel N-party algorithms for matrix multiplication and inverse operations for exact parameter optimization, giving strong convergence with minimal tuning complexity. We evaluate on six representative datasets from public and industry-specific contexts. Results demonstrate that STV's forecasting accuracy is comparable to that of centralized approaches. Our exact optimization outperforms centralized methods, including state-of-the-art diffusion models and long-short-term memory, by 23.81% on forecasting accuracy. We also evaluate scalability by examining the communication costs of exact and iterative optimization to navigate the choice between the two. STV's code and supplementary material are available online: https://github.com/adis98/STV.
Submitted 11 June, 2025; v1 submitted 31 May, 2024;
originally announced May 2024.
-
CCBNet: Confidential Collaborative Bayesian Networks Inference
Authors:
Abele Mălan,
Jérémie Decouchant,
Thiago Guzella,
Lydia Chen
Abstract:
Effective large-scale process optimization in manufacturing industries requires close cooperation between different human expert parties who encode their knowledge of related domains as Bayesian network models. For instance, Bayesian networks for domains such as lithography equipment, processes, and auxiliary tools must be conjointly used to effectively identify process optimizations in the semiconductor industry. However, business confidentiality across domains hinders such collaboration, and encourages alternatives to centralized inference. We propose CCBNet, the first Confidentiality-preserving Collaborative Bayesian Network inference framework. CCBNet leverages secret sharing to securely perform analysis on the combined knowledge of party models by joining two novel subprotocols: (i) CABN, which augments probability distributions for features across parties by modeling them into secret shares of their normalized combination; and (ii) SAVE, which aggregates party inference result shares through distributed variable elimination. We extensively evaluate CCBNet via 9 public Bayesian networks. Our results show that CCBNet achieves predictive quality that is similar to the ones of centralized methods while preserving model confidentiality. We further demonstrate that CCBNet scales to challenging manufacturing use cases that involve 16-128 parties in large networks of 223-1003 features, and decreases, on average, computational overhead by 23%, while communicating 71k values per request. Finally, we showcase possible attacks and mitigations for partially reconstructing party networks in the two subprotocols.
Submitted 23 May, 2024;
originally announced May 2024.
-
TabVFL: Improving Latent Representation in Vertical Federated Learning
Authors:
Mohamed Rashad,
Zilong Zhao,
Jeremie Decouchant,
Lydia Y. Chen
Abstract:
Autoencoders are popular neural networks that are able to compress high dimensional data to extract relevant latent information. TabNet is a state-of-the-art neural network model designed for tabular data that utilizes an autoencoder architecture for training. Vertical Federated Learning (VFL) is an emerging distributed machine learning paradigm that allows multiple parties to train a model collaboratively on vertically partitioned data while maintaining data privacy. The existing design of training autoencoders in VFL is to train a separate autoencoder in each participant and aggregate the latent representation later. This design could potentially break important correlations between feature data of participating parties, as each autoencoder is trained on locally available features while disregarding the features of others. In addition, traditional autoencoders are not specifically designed for tabular data, which is ubiquitous in VFL settings. Moreover, the impact of client failures during training on the model robustness is under-researched in the VFL scene. In this paper, we propose TabVFL, a distributed framework designed to improve latent representation learning using the joint features of participants. The framework (i) preserves privacy by mitigating potential data leakage with the addition of a fully-connected layer, (ii) conserves feature correlations by learning one latent representation vector, and (iii) provides enhanced robustness against client failures during the training phase. Extensive experiments on five classification datasets show that TabVFL can outperform the prior work design, with a 26.12% improvement in F1-score.
Submitted 25 June, 2024; v1 submitted 27 April, 2024;
originally announced April 2024.
-
Liveness Checking of the HotStuff Protocol Family
Authors:
Jérémie Decouchant,
Burcu Kulahcioglu Ozkan,
Yanzhuo Zhou
Abstract:
Byzantine consensus protocols aim at maintaining safety guarantees under any network synchrony model and at providing liveness in partially or fully synchronous networks. However, several Byzantine consensus protocols have been shown to violate liveness properties under certain scenarios. Existing testing methods for checking the liveness of consensus protocols check for time-bounded liveness violations, which generate a large number of false positives. In this work, for the first time, we check the liveness of Byzantine consensus protocols using the temperature and lasso detection methods, which require the definition of ad-hoc system state abstractions. We focus on the HotStuff protocol family that has been recently developed for blockchain consensus. In this family, the HotStuff protocol is both safe and live under the partial synchrony assumption, while the 2-Phase HotStuff and Sync HotStuff protocols are known to violate liveness in subtle fault scenarios. We implemented our liveness checking methods on top of the Twins automated unit test generator to test the HotStuff protocol family. Our results indicate that our methods successfully detect all known liveness violations and produce fewer false positives than the traditional time-bounded liveness checks.
Submitted 13 October, 2023;
originally announced October 2023.
-
Intrusion Resilience Systems for Modern Vehicles
Authors:
Ali Shoker,
Vincent Rahli,
Jeremie Decouchant,
Paulo Esteves-Verissimo
Abstract:
Current vehicular Intrusion Detection and Prevention Systems either incur high false-positive rates or do not capture zero-day vulnerabilities, leading to safety-critical risks. In addition, prevention is limited to a few primitive options like dropping network packets or extreme options, e.g., ECU Bus-off state. To fill this gap, we introduce the concept of vehicular Intrusion Resilience Systems (IRS) that ensures the resilience of critical applications despite assumed faults or zero-day attacks, as long as threat assumptions are met. IRS enables running a vehicular application in a replicated way, i.e., as a Replicated State Machine, over several ECUs, and then requiring the replicated processes to reach a form of Byzantine agreement before changing their local state. Our study rides the mutation of modern vehicular environments, which are closing the gap between simple and resource-constrained "real-time and embedded systems", and complex and powerful "information technology" ones. It shows that current vehicle (e.g., Zonal) architectures and networks are becoming plausible for such modular fault and intrusion tolerance solutions, deemed too heavy in the past. Our evaluation on a simulated Automotive Ethernet network running two state-of-the-art agreement protocols (Damysus and HotStuff) shows that the achieved latency and throughput are feasible for many Automotive applications.
Submitted 9 July, 2023;
originally announced July 2023.
-
LØ: An Accountable Mempool for MEV Resistance
Authors:
Bulat Nasrulin,
Georgy Ishmaev,
Jérémie Decouchant,
Johan Pouwelse
Abstract:
Possible manipulation of user transactions by miners in permissionless blockchain systems is a growing concern. This pervasive and systemic issue, known as Miner Extractable Value (MEV), incurs high costs on users of decentralised applications. Furthermore, transaction manipulations create other issues in blockchain systems such as congestion, higher fees, and system instability. Detecting transaction manipulations is difficult, even though it is known that they originate from the pre-consensus phase of transaction selection for block building, at the base layer of blockchain protocols. In this paper we summarize known transaction manipulation attacks. We then present LØ, an accountable base layer protocol specifically designed to detect and mitigate transaction manipulations. LØ is built around accurate detection of transaction manipulations and assignment of blame at the granularity of a single mining node. LØ forces miners to log all the transactions they receive into a secure mempool data structure and to process them in a verifiable manner. Overall, LØ quickly and efficiently detects reordering, injection or censorship attempts. Our performance evaluation shows that LØ is also practical and only introduces a marginal performance overhead.
Submitted 5 July, 2023;
originally announced July 2023.
-
Leveraging the Verifier's Dilemma to Double Spend in Bitcoin
Authors:
Tong Cao,
Jérémie Decouchant,
Jiangshan Yu
Abstract:
We describe and analyze perishing mining, a novel block-withholding mining strategy that lures profit-driven miners away from doing useful work on the public chain by releasing block headers from a privately maintained chain. We then introduce the dual private chain (DPC) attack, where an adversary that aims at double spending increases its success rate by intermittently dedicating part of its hash power to perishing mining. We detail the DPC attack's Markov decision process and evaluate its double spending success rate using Monte Carlo simulations. We show that the DPC attack lowers Bitcoin's security bound in the presence of profit-driven miners that do not wait to validate the transactions of a block before mining on it.
Submitted 8 February, 2023; v1 submitted 25 October, 2022;
originally announced October 2022.
-
Aergia: Leveraging Heterogeneity in Federated Learning Systems
Authors:
Bart Cox,
Lydia Y. Chen,
Jérémie Decouchant
Abstract:
Federated Learning (FL) is a popular approach for distributed deep learning that prevents the pooling of large amounts of data in a central server. FL relies on clients to update a global model using their local datasets. Classical FL algorithms use a central federator that, for each training round, waits for all clients to send their model updates before aggregating them. In practical deployments, clients might have different computing powers and network capabilities, which might lead slow clients to become performance bottlenecks. Previous works have suggested using a deadline for each learning round so that the federator ignores the late updates of slow clients, or so that clients send partially trained models before the deadline. To speed up the training process, we instead propose Aergia, a novel approach where slow clients (i) freeze the part of their model that is the most computationally intensive to train; (ii) train the unfrozen part of their model; and (iii) offload the training of the frozen part of their model to a faster client that trains it using its own dataset. The offloading decisions are orchestrated by the federator based on the training speed that clients report and on the similarities between their datasets, which are privately evaluated thanks to a trusted execution environment. We show through extensive experiments that Aergia maintains high accuracy and significantly reduces the training time under heterogeneous settings by up to 27% and 53% compared to FedAvg and TiFL, respectively.
Submitted 12 October, 2022;
originally announced October 2022.
-
Secure and Distributed Assessment of Privacy-Preserving Releases of GWAS
Authors:
Túlio Pascoal,
Jérémie Decouchant,
Marcus Völp
Abstract:
Genome-wide association studies (GWAS) identify correlations between genetic variants and an observable characteristic such as a disease. Previous works presented privacy-preserving distributed algorithms for a federation of genome data holders that spans multiple institutional and legislative domains to securely compute GWAS results. However, these algorithms have limited applicability, since they still require a centralized instance to decide whether GWAS results can be safely disclosed, which is in violation of privacy regulations, such as GDPR. In this work, we introduce GenDPR, a distributed middleware that leverages Trusted Execution Environments (TEEs) to securely determine a subset of the potential GWAS statistics that can be safely released. GenDPR achieves the same accuracy as centralized solutions, but requires transferring significantly less data because TEEs only exchange intermediary results but no genomes. Additionally, GenDPR can be configured to tolerate all-but-one honest-but-curious federation members colluding with the aim to expose genomes of correct members.
Submitted 20 September, 2022; v1 submitted 31 August, 2022;
originally announced August 2022.
-
MUDGUARD: Taming Malicious Majorities in Federated Learning using Privacy-Preserving Byzantine-Robust Clustering
Authors:
Rui Wang,
Xingkai Wang,
Huanhuan Chen,
Jérémie Decouchant,
Stjepan Picek,
Nikolaos Laoutaris,
Kaitai Liang
Abstract:
Byzantine-robust Federated Learning (FL) aims to counter malicious clients and train an accurate global model while maintaining an extremely low attack success rate. Most existing systems, however, are only robust when most of the clients are honest. FLTrust (NDSS '21) and Zeno++ (ICML '20) do not make such an honest majority assumption but can only be applied to scenarios where the server is provided with an auxiliary dataset used to filter malicious updates. FLAME (USENIX '22) and EIFFeL (CCS '22) maintain the semi-honest majority assumption to guarantee robustness and the confidentiality of updates. It is therefore currently impossible to ensure Byzantine robustness and confidentiality of updates without assuming a semi-honest majority. To tackle this problem, we propose a novel Byzantine-robust and privacy-preserving FL system, called MUDGUARD, that can operate under a malicious minority or majority on both the server and client sides. Based on DBSCAN, we design a new method for extracting features from model updates via pairwise adjusted cosine similarity to boost the accuracy of the resulting clustering. To thwart attacks from a malicious majority, we develop a method called Model Segmentation, which aggregates only the updates from within a cluster and sends the corresponding model only to the clients of that cluster. The fundamental idea is that even if malicious clients are in the majority, their poisoned updates cannot harm benign clients if they are confined within the malicious cluster. We also leverage multiple cryptographic tools to conduct clustering without sacrificing training correctness and update confidentiality. We present a detailed security proof and empirical evaluation along with a convergence analysis for MUDGUARD.
Submitted 14 November, 2023; v1 submitted 22 August, 2022;
originally announced August 2022.
-
I-GWAS: Privacy-Preserving Interdependent Genome-Wide Association Studies
Authors:
Túlio Pascoal,
Jérémie Decouchant,
Antoine Boutet,
Marcus Völp
Abstract:
Genome-wide Association Studies (GWASes) identify genomic variations that are statistically associated with a trait, such as a disease, in a group of individuals. Unfortunately, careless sharing of GWAS statistics might give rise to privacy attacks. Several works attempted to reconcile secure processing with privacy-preserving releases of GWASes. However, we highlight that these approaches remain vulnerable if GWASes utilize overlapping sets of individuals and genomic variations. In such conditions, we show that even when relying on state-of-the-art techniques for protecting releases, an adversary could reconstruct the genomic variations of up to 28.6% of participants, and that the released statistics of up to 92.3% of the genomic variations would enable membership inference attacks. We introduce I-GWAS, a novel framework that securely computes and releases the results of multiple possibly interdependent GWASes. I-GWAS continuously releases privacy-preserving and noise-free GWAS results as new genomes become available.
Submitted 20 September, 2022; v1 submitted 17 August, 2022;
originally announced August 2022.
-
Distributed Attestation Revocation in Self-Sovereign Identity
Authors:
Rowdy Chotkan,
Jérémie Decouchant,
Johan Pouwelse
Abstract:
Self-Sovereign Identity (SSI) aspires to create a standardised identity layer for the Internet by placing citizens at the centre of their data, thereby weakening the grip of big tech on current digital identities. However, as millions of both physical and digital identities are lost annually, SSIs must also be revocable to prevent misuse. Previous attempts at designing a revocation mechanism typically violate the principles of SSI by relying on central trusted components. This lack of a distributed revocation mechanism hampers the development of SSI. In this paper, we address this limitation and present the first fully distributed SSI revocation mechanism that does not rely on specialised trusted nodes. Our novel gossip-based propagation algorithm disseminates revocations throughout the network and provides nodes with a proof of revocation that enables offline verification of revocations. We demonstrate through simulations that our protocol adequately scales to national levels.
Submitted 12 August, 2022; v1 submitted 10 August, 2022;
originally announced August 2022.
-
SyncPCN/PSyncPCN: Payment Channel Networks without Blockchain Synchrony
Authors:
Oğuzhan Ersoy,
Jérémie Decouchant,
Satwik Prabhu Kimble,
Stefanie Roos
Abstract:
Payment channel networks (PCNs) enhance the scalability of blockchains by allowing parties to conduct transactions off-chain, i.e., without broadcasting every transaction to all blockchain participants. To conduct transactions, a sender and a receiver can either establish a direct payment channel with a funding blockchain transaction or leverage existing channels in a multi-hop payment. The security of PCNs usually relies on the synchrony of the underlying blockchain, i.e., evidence of misbehavior needs to be published on the blockchain within a time limit. Alternative payment channel proposals that do not require blockchain synchrony rely on quorum certificates and use a committee to register the transactions of a channel. However, these proposals do not support multi-hop payments, a limitation we aim to overcome. In this paper, we demonstrate that it is in fact impossible to design a multi-hop payment protocol with both network asynchrony and faulty channels, i.e., channels that may not correctly follow the protocol. We then detail two committee-based multi-hop payment protocols that respectively assume synchronous communications and possibly faulty channels, or asynchronous communication and correct channels. The first protocol relies on possibly faulty committees instead of the blockchain to resolve channel disputes, and enforces privacy properties within a synchronous network. The second one relies on committees that contain at most f faulty members out of 3f+1 and successively delegate to each other the role of eventually completing a multi-hop payment. We show that both protocols satisfy the security requirements of a multi-hop payment and compare their communication complexity and latency.
Submitted 4 August, 2022; v1 submitted 23 July, 2022;
originally announced July 2022.
-
AGIC: Approximate Gradient Inversion Attack on Federated Learning
Authors:
Jin Xu,
Chi Hong,
Jiyue Huang,
Lydia Y. Chen,
Jérémie Decouchant
Abstract:
Federated learning is a private-by-design distributed learning paradigm where clients train local models on their own data before a central server aggregates their local updates to compute a global model. Depending on the aggregation method used, the local updates are either the gradients or the weights of local learning models. Recent reconstruction attacks apply a gradient inversion optimization on the gradient update of a single mini-batch to reconstruct the private data used by clients during training. As the state-of-the-art reconstruction attacks solely focus on a single update, realistic adversarial scenarios are overlooked, such as observation across multiple updates and updates trained from multiple mini-batches. A few studies consider a more challenging adversarial scenario where only model updates based on multiple mini-batches are observable, and resort to computationally expensive simulation to untangle the underlying samples for each local step. In this paper, we propose AGIC, a novel Approximate Gradient Inversion Attack that efficiently and effectively reconstructs images from either model or gradient updates, and across multiple epochs. In a nutshell, AGIC (i) approximates gradient updates of used training samples from model updates to avoid costly simulation procedures, (ii) leverages gradient/model updates collected from multiple epochs, and (iii) assigns increasing weights to layers with respect to the neural network structure for reconstruction quality. We extensively evaluate AGIC on three datasets: CIFAR-10, CIFAR-100 and ImageNet. Our results show that AGIC increases the peak signal-to-noise ratio (PSNR) by up to 50% compared to two representative state-of-the-art gradient inversion attacks. Furthermore, AGIC is faster than the state-of-the-art simulation-based attack, e.g., it is 5x faster when attacking FedAvg with 8 local steps in between model updates.
Submitted 14 July, 2022; v1 submitted 28 April, 2022;
originally announced April 2022.
-
Federated Geometric Monte Carlo Clustering to Counter Non-IID Datasets
Authors:
Federico Lucchetti,
Jérémie Decouchant,
Maria Fernandes,
Lydia Y. Chen,
Marcus Völp
Abstract:
Federated learning allows clients to collaboratively train models on datasets that are acquired in different locations and that cannot be exchanged because of their size or regulations. Such collected data is increasingly non-independent and non-identically distributed (non-IID), negatively affecting training accuracy. Previous works tried to mitigate the effects of non-IID datasets on training accuracy, focusing mainly on non-IID labels; however, practical datasets often also contain non-IID features. To address both non-IID labels and features, we propose FedGMCC, a novel framework where a central server aggregates client models that it can cluster together. FedGMCC clustering relies on a Monte Carlo procedure that samples the output space of client models, infers their position in the weight space on a loss manifold and computes their geometric connection via an affine curve parametrization. FedGMCC aggregates connected models along their path connectivity to produce a richer global model, incorporating knowledge of all connected client models. FedGMCC outperforms FedAvg and FedProx in terms of convergence rates on the EMNIST62 dataset and a genomic sequence classification dataset (by up to +63%). FedGMCC yields an improved accuracy (+4%) on the genomic dataset with respect to CFL, in high non-IID feature space settings and label incongruency.
Submitted 23 April, 2022;
originally announced April 2022.
-
Practical Byzantine Reliable Broadcast on Partially Connected Networks (Extended version)
Authors:
Silvia Bonomi,
Jérémie Decouchant,
Giovanni Farina,
Vincent Rahli,
Sébastien Tixeuil
Abstract:
In this paper, we consider the Byzantine reliable broadcast problem on authenticated and partially connected networks. The state-of-the-art method to solve this problem consists in combining two algorithms from the literature. Handling asynchrony and faulty senders is typically done thanks to Gabriel Bracha's authenticated double-echo broadcast protocol, which assumes an asynchronous fully connected network. Danny Dolev's algorithm can then be used to provide reliable communications between processes in the global fault model, where up to f processes among N can be faulty in a communication network that is at least 2f+1-connected. Following recent works that showed that Dolev's protocol can be made more practical thanks to several optimizations, we show that the state-of-the-art methods to solve our problem can be optimized thanks to layer-specific and cross-layer optimizations. Our simulations with the Omnet++ network simulator show that these optimizations can be efficiently combined to decrease the total amount of information transmitted or the protocol's latency (e.g., respectively, -25% and -50% with a 16B payload, N=31 and f=4) compared to the state-of-the-art combination of Bracha's and Dolev's protocols.
Submitted 26 February, 2024; v1 submitted 8 April, 2021;
originally announced April 2021.
-
PISTIS: An Event-Triggered Real-Time Byzantine-Resilient Protocol Suite
Authors:
David Kozhaya,
Jeremie Decouchant,
Vincent Rahli,
Paulo Esteves-Verissimo
Abstract:
The accelerated digitalisation of society, along with technological evolution, has extended the geographical span of cyber-physical systems. Two main threats have made the reliable and real-time control of these systems challenging: (i) uncertainty in the communication infrastructure induced by scale, and heterogeneity of the environment and devices; and (ii) targeted attacks maliciously worsening the impact of the above-mentioned communication uncertainties, disrupting the correctness of real-time applications. This paper addresses those challenges by showing how to build distributed protocols that provide both real-time operation with practical performance and scalability in the presence of network faults and attacks, in probabilistic synchronous environments. We provide a suite of real-time Byzantine protocols, which we prove correct, starting from a reliable broadcast protocol, called PISTIS, up to atomic broadcast and consensus. This suite simplifies the construction of powerful distributed and decentralized monitoring and control applications, including state-machine replication. Extensive empirical simulations showcase PISTIS's robustness, latency, and scalability. For example, PISTIS can withstand message loss (and delay) rates up to 50% in systems with 49 nodes and provides bounded delivery latencies in the order of a few milliseconds.
Submitted 18 March, 2021; v1 submitted 21 July, 2020;
originally announced July 2020.
-
PriLok: Citizen-protecting distributed epidemic tracing
Authors:
Paulo Esteves-Verissimo,
Jérémie Decouchant,
Marcus Völp,
Alireza Esfahani,
Rafal Graczyk
Abstract:
Contact tracing is an important instrument for national health services to fight epidemics. As part of the COVID-19 situation, many proposals have been made for scaling up contact tracing capacities with the help of smartphone applications, an important but highly critical endeavor due to the privacy risks involved in such solutions. Extending our previously expressed concern, we clearly articulate in this article the functional and non-functional requirements that any solution has to meet when striving to serve not mere collections of individuals but the whole of a nation, as required in the face of such potentially dangerous epidemics. We present a critical information infrastructure, PriLok, a fully open preliminary architecture proposal and design draft for privacy-preserving contact tracing, which we believe can be constructed in a way that fulfills the former requirements. Our architecture leverages the existing regulated mobile communication infrastructure and builds upon the concept of "checks and balances", requiring a majority of independent players to agree to effect any operation on it, thus preventing abuse of the highly sensitive information that must be collected and processed for efficient contact tracing. This is enforced with a largely decentralised layout and highly resilient state-of-the-art technology, which we explain in the paper. We finish with a security, dependability and resilience analysis showing how the architecture meets the defined requirements, even while the infrastructure is under attack.
Submitted 1 June, 2020; v1 submitted 9 May, 2020;
originally announced May 2020.
-
RT-ByzCast: Byzantine-Resilient Real-Time Reliable Broadcast
Authors:
David Kozhaya,
Jérémie Decouchant,
Paulo Esteves-Verissimo
Abstract:
Today's cyber-physical systems face various impediments to achieving their intended goals; namely, communication uncertainties and faults, related to the increased integration of networked and wireless devices, hinder the synchronism needed to meet real-time deadlines. Moreover, being critical, these systems are also exposed to significant security threats. This threat combination increases the risk of physical damage. This paper addresses these problems by studying how to build the first real-time Byzantine reliable broadcast protocol (RTBRB) tolerating network uncertainties, faults, and attacks. Previous literature describes either real-time reliable broadcast protocols, or asynchronous (non-real-time) Byzantine ones.
We first prove that it is impossible to implement RTBRB using traditional distributed computing paradigms, e.g., where the error/failure detection mechanisms of processes are decoupled from the broadcast algorithm itself, even with the help of the most powerful failure detectors. We circumvent this impossibility by proposing RT-ByzCast, an algorithm based on aggregating digital signatures in a sliding time-window and on empowering processes with self-crashing capabilities to mask and bound losses. We show that RT-ByzCast (i) operates in real-time by proving that messages broadcast by correct processes are delivered within a known bounded delay, and (ii) is reliable by demonstrating that correct processes using our algorithm crash themselves with a negligible probability, even with message loss rates as high as 60%.
Submitted 3 July, 2018;
originally announced July 2018.