-
Spider: A BFT Architecture for Geo-Replicated Cloud Services
Authors:
Michael Eischer,
Tobias Distler
Abstract:
Traditionally, Byzantine fault tolerance (BFT) in geo-replicated systems is achieved by executing complex agreement protocols over large-distance communication links, and therefore typically incurs high response times. In this paper we address this problem with Spider, a resilient and modular BFT replication architecture for geo-distributed systems that leverages characteristic features of today's public-cloud infrastructures to minimize both complexity and latency. Spider is composed of multiple largely independent replica groups, each of which is distributed across different availability zones of its respective cloud region. This design offers the possibility to provide low response times by placing replica groups in close geographic distance to clients, while at the same time enabling intra-group communication over short-distance links. To handle the interaction between groups that is necessary for strong consistency, Spider uses a novel message-channel abstraction with first-in-first-out semantics and built-in flow control that greatly simplifies system design.
Submitted 18 May, 2024;
originally announced July 2024.
-
Probabilistic Byzantine Fault Tolerance (Extended Version)
Authors:
Diogo Avelãs,
Hasan Heydari,
Eduardo Alchieri,
Tobias Distler,
Alysson Bessani
Abstract:
Consensus is a fundamental building block for constructing reliable and fault-tolerant distributed services. Many Byzantine fault-tolerant consensus protocols designed for partially synchronous systems adopt a pessimistic approach when dealing with adversaries, ensuring safety in a deterministic way even under the worst-case scenarios that adversaries can create. Following this approach typically results in either an increase in the message complexity (e.g., PBFT) or an increase in the number of communication steps (e.g., HotStuff). In practice, however, adversaries are not as powerful as the ones assumed by these protocols. Furthermore, it might suffice to ensure safety and liveness properties with high probability. In order to accommodate more realistic and optimistic adversaries and improve the scalability of the BFT consensus, we propose ProBFT (Probabilistic Byzantine Fault Tolerance). ProBFT is a leader-based probabilistic consensus protocol with a message complexity of $O(n\sqrt{n})$ and an optimal number of communication steps that tolerates Byzantine faults in permissioned partially synchronous systems. It is built on top of well-known primitives, such as probabilistic Byzantine quorums and verifiable random functions. ProBFT guarantees safety and liveness with high probabilities even with faulty leaders, as long as a supermajority of replicas is correct, and using only a fraction of messages employed in PBFT (e.g., $20\%$). We provide a detailed description of ProBFT's protocol and its analysis.
Submitted 11 June, 2024; v1 submitted 7 May, 2024;
originally announced May 2024.
-
Vivisecting the Dissection: On the Role of Trusted Components in BFT Protocols
Authors:
Alysson Bessani,
Miguel Correia,
Tobias Distler,
Rüdiger Kapitza,
Paulo Esteves-Verissimo,
Jiangshan Yu
Abstract:
A recent paper by Gupta et al. (EuroSys'23) challenged the usefulness of trusted component (TC) based Byzantine fault-tolerant (BFT) protocols to lower the replica group size from $3f+1$ to $2f+1$, identifying three limitations of such protocols and proposing that TCs should be used instead to improve the performance of BFT protocols. Here, we point out flaws in both arguments and advocate that the most worthwhile use of TCs in BFT protocols is indeed to make them as resilient as crash fault-tolerant (CFT) protocols, which can tolerate up to $f$ faulty replicas using $2f+1$ replicas.
Submitted 9 December, 2023;
originally announced December 2023.
-
Egalitarian Byzantine Fault Tolerance
Authors:
Michael Eischer,
Tobias Distler
Abstract:
Minimizing end-to-end latency in geo-replicated systems usually makes it necessary to compromise on resilience, resource efficiency, or throughput performance, because existing approaches either tolerate only crashes, require additional replicas, or rely on a global leader for consensus. In this paper, we eliminate the need for such tradeoffs by presenting Isos, a leaderless replication protocol that tolerates up to $f$ Byzantine faults with a minimum of $3f+1$ replicas. To reduce latency in wide-area environments, Isos relies on an efficient consensus algorithm that allows all participating replicas to propose new requests and thereby enables clients to avoid delays by submitting requests to their nearest replica. In addition, Isos minimizes overhead by limiting message ordering to requests that conflict with each other (e.g., due to accessing the same state parts) and by already committing them after three communication steps if at least $f+1$ replicas report each conflict. Our experimental evaluation with a geo-replicated key-value store shows that these properties allow Isos to provide lower end-to-end latency than existing protocols, especially for use-case scenarios in which the clients of a system are distributed across multiple locations.
Submitted 14 September, 2021;
originally announced September 2021.
-
Stream-based State-Machine Replication
Authors:
Laura Lawniczak,
Tobias Distler
Abstract:
Developing state-machine replication protocols for practical use is a complex and labor-intensive process because of the myriad of essential tasks (e.g., deployment, communication, recovery) that need to be taken into account in an implementation. In this paper, we show how this problem can be addressed with stream-based replication, a novel approach that implements a replication protocol as an application on top of a data-stream processing framework. With such a framework already handling most essential tasks and furthermore providing means for debugging and monitoring, this technique has the key benefit of significantly reducing overhead for both programmers and system operators. Our first stream-based protocol, Tara, tolerates crashes and comprises full-fledged mechanisms for request handling, checkpointing, and view changes. Still, Tara's prototype implementation, which is based on Twitter's Heron framework, consists of fewer than 1,500 lines of application-level code.
Submitted 24 June, 2021;
originally announced June 2021.
-
Resilient Cloud-based Replication with Low Latency
Authors:
Michael Eischer,
Tobias Distler
Abstract:
Existing approaches to tolerate Byzantine faults in geo-replicated environments require systems to execute complex agreement protocols over wide-area links and consequently are often associated with high response times. In this paper we address this problem with Spider, a resilient replication architecture for geo-distributed systems that leverages the availability characteristics of today's public-cloud infrastructures to minimize complexity and reduce latency. Spider models a system as a collection of loosely coupled replica groups whose members are hosted in different cloud-provided fault domains (i.e., availability zones) of the same geographic region. This structural organization makes it possible to achieve low response times by placing replica groups in close proximity to clients while still enabling the replicas of a group to interact over short-distance links. To handle the inter-group communication necessary for strong consistency, Spider uses a reliable group-to-group message channel with first-in-first-out semantics and built-in flow control that significantly simplifies system design.
Submitted 21 September, 2020;
originally announced September 2020.