
Recipe: Hardware-Accelerated Replication Protocols

Authors: Dimitra Giantsidi, Emmanouil Giortamis, Julian Pritzi, Maurice Bailleu, Manos Kapritsos, Pramod Bhatotia

Abstract: Replication protocols are essential for distributed systems, ensuring consistency, reliability, and fault tolerance. Traditional Crash Fault Tolerant (CFT) protocols, which assume a fail-stop model, are inadequate for untrusted cloud environments where adversaries or software bugs can cause Byzantine behavior. Byzantine Fault Tolerant (BFT) protocols address these threats but face significant performance, resource overheads, and scalability challenges. This paper introduces Recipe, a novel approach to transforming CFT protocols to operate securely in Byzantine settings without altering their core logic. Recipe rethinks CFT protocols in the context of modern cloud hardware, including many-core servers, RDMA-capable networks, and Trusted Execution Environments (TEEs). The approach leverages these advancements to enhance the security and performance of replication protocols in untrusted cloud environments. Recipe implements two practical security mechanisms, i.e., transferable authentication and non-equivocation, using TEEs and high-performance networking stacks (e.g., RDMA, DPDK). These mechanisms ensure that any CFT protocol can be transformed into a BFT protocol, guaranteeing authenticity and non-equivocation. The Recipe protocol consists of five key components: transferable authentication, initialization, normal operation, view change, and recovery phases. The protocol's correctness is formally verified using Tamarin, a symbolic model checker.
Recipe is implemented as a library and applied to transform four widely used CFT protocols (Raft, Chain Replication, ABD, and AllConcur) for Byzantine settings. The results demonstrate up to 24x higher throughput compared to PBFT and 5.9x better performance than state-of-the-art BFT protocols. Additionally, Recipe requires fewer replicas and offers confidentiality, a feature absent in traditional BFT protocols.

Submitted 13 February, 2025; originally announced February 2025.

arXiv:2312.11029

Picsou: Enabling Efficient Cross-Consensus Communication

Authors: Reginald Frank, Micah Murray, Suyash Gupta, Ethan Xu, Natacha Crooks, Manos Kapritsos

Abstract: Replicated state machines (RSMs) cannot effectively communicate today as there is no formal framework or efficient protocol to do so. To address this issue, we introduce a new primitive, the Cross-Cluster Consistent Broadcast (C3B) and present PICSOU, a practical C3B implementation. PICSOU draws inspiration from networking and TCP to allow two RSMs to communicate with constant metadata overhead in the failure-free case and a minimal number of message resends in the case of failures. PICSOU is flexible and allows both crash fault-tolerant and Byzantine fault-tolerant protocols to communicate. At the heart of PICSOU's good performance and generality lies a novel technique we call QUACKs (quorum acknowledgements) that allow nodes in each RSM to precisely determine when messages have definitely been received, or definitely been lost. Our results are promising: we obtain up to 24x better performance than existing all-to-all solutions.

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2202.13833

Formally verified asymptotic consensus in robust networks

Authors: Mohit Tekriwal, Avi Tachna-Fram, Jean-Baptiste Jeannin, Manos Kapritsos, Dimitra Panagou

Abstract: Distributed architectures are used to improve performance and reliability of various systems. Examples include drone swarms and load-balancing servers. An important capability of a distributed architecture is the ability to reach consensus among all its nodes. Several consensus algorithms have been proposed, and many of these algorithms come with intricate proofs of correctness that are not mechanically checked. In the controls community, algorithms often achieve consensus asymptotically, e.g., for problems such as the design of human control systems, or the analysis of natural systems like bird flocking. This is in contrast to exact consensus algorithms such as Paxos, which have received much more recent attention in the formal methods community. This paper presents the first formal proof of an asymptotic consensus algorithm, and addresses various challenges in its formalization. Using the Coq proof assistant, we verify the correctness of a widely used consensus algorithm in the distributed controls community, the Weighted-Mean Subsequence Reduced (W-MSR) algorithm. We formalize the necessary and sufficient conditions required to achieve resilient asymptotic consensus under the assumed attacker model. During the formalization, we clarify several imprecisions in the paper proof, including an imprecision on quantifiers in the main theorem.

Submitted 31 January, 2024; v1 submitted 28 February, 2022; originally announced February 2022.

Comments: This paper has been accepted for publication at the TACAS 2024 conference

arXiv:2006.01885

On the Significance of Consecutive Ballots in Paxos

Authors: Eli Goldweber, Nuda Zhang, Manos Kapritsos

Abstract: In this paper we examine the Paxos protocol and demonstrate how the discrete numbering of ballots can be leveraged to weaken the conditions for learning. Specifically, we define the notion of consecutive ballots and use this to define Consecutive Quorums. Consecutive Quorums weakens the learning criterion such that a learner does not need matching $accept$ messages sent in the $same \; ballot$ from a majority of acceptors to learn a value. We prove that this modification preserves the original safety and liveness guarantees of Paxos. We define $Consecutive \; Paxos$ which encapsulates the properties of discrete consecutive ballots. To establish the correctness of these results, we, in addition to a paper proof, formally verify the correctness of a State Machine Replication Library built on top of an optimized version of Multi-Paxos modified to reflect $Consecutive \; Paxos$.

Submitted 2 June, 2020; originally announced June 2020.

Comments: 19 Pages

ACM Class: C.4.1

Showing 1–4 of 4 results for author: Kapritsos, M