-
Smart Casual Verification of the Confidential Consortium Framework
Authors:
Heidi Howard,
Markus A. Kuppe,
Edward Ashton,
Amaury Chamayou,
Natacha Crooks
Abstract:
The Confidential Consortium Framework (CCF) is an open-source platform for developing trustworthy and reliable cloud applications. CCF powers Microsoft's Azure Confidential Ledger service and as such it is vital to build confidence in the correctness of CCF's design and implementation. This paper reports our experiences applying smart casual verification to validate the correctness of CCF's novel distributed protocols, focusing on its unique distributed consensus protocol and its custom client consistency model. We use the term smart casual verification to describe our hybrid approach, which combines the rigor of formal specification and model checking with the pragmatism of automated testing, in our case binding the formal specification in TLA+ to the C++ implementation. While traditional formal methods approaches require substantial buy-in and are often one-off efforts by domain experts, we have integrated our smart casual verification approach into CCF's CI pipeline, allowing contributors to continuously validate CCF as it evolves. We describe the challenges we faced in applying smart casual verification to a complex existing codebase and how we overcame them to find six subtle bugs in the design and implementation before they could impact production.
Submitted 16 October, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
Validating Traces of Distributed Programs Against TLA+ Specifications
Authors:
Horatiu Cirstea,
Markus A. Kuppe,
Benjamin Loillier,
Stephan Merz
Abstract:
TLA+ is a formal language for specifying systems, including distributed algorithms, that is supported by powerful verification tools. In this work we present a framework for relating traces of distributed programs to high-level specifications written in TLA+. The problem is reduced to a constrained model checking problem, realized using the TLC model checker. Our framework consists of an API for instrumenting Java programs in order to record traces of executions, of a collection of TLA+ operators that are used for relating those traces to specifications, and of scripts for running the model checker. Crucially, traces only contain updates to specification variables rather than full values, and developers may choose to trace only certain variables. We have applied our approach to several distributed programs, detecting discrepancies between the specifications and the implementations in all cases. We discuss reasons for these discrepancies, best practices for instrumenting programs, and how to interpret the verdict produced by TLC.
Submitted 17 September, 2024; v1 submitted 23 April, 2024;
originally announced April 2024.
-
Confidential Consortium Framework: Secure Multiparty Applications with Confidentiality, Integrity, and High Availability
Authors:
Heidi Howard,
Fritz Alder,
Edward Ashton,
Amaury Chamayou,
Sylvan Clebsch,
Manuel Costa,
Antoine Delignat-Lavaud,
Cedric Fournet,
Andrew Jeffery,
Matthew Kerner,
Fotios Kounelis,
Markus A. Kuppe,
Julien Maffre,
Mark Russinovich,
Christoph M. Wintersteiger
Abstract:
Confidentiality, integrity protection, and high availability, abbreviated to CIA, are essential properties for trustworthy data systems. The rise of cloud computing and the growing demand for multiparty applications, however, mean that building modern CIA systems is more challenging than ever. In response, we present the Confidential Consortium Framework (CCF), a general-purpose foundation for developing secure stateful CIA applications. CCF combines centralized compute with decentralized trust, supporting deployment on untrusted cloud infrastructure and transparent governance by mutually untrusted parties. CCF leverages hardware-based trusted execution environments for remotely verifiable confidentiality and code integrity. This is coupled with state machine replication backed by an auditable immutable ledger for data integrity and high availability. CCF enables each service to bring its own application logic, custom multiparty governance model, and deployment scenario, decoupling the operators of nodes from the consortium that governs them. CCF is open-source and available now at https://github.com/microsoft/CCF.
Submitted 17 October, 2023;
originally announced October 2023.
-
Understanding Inconsistency in Azure Cosmos DB with TLA+
Authors:
A. Finn Hackett,
Joshua Rowe,
Markus Alexander Kuppe
Abstract:
Beyond implementation correctness of a distributed system, it is equally important to understand exactly what users should expect to see from that system. Even if the system itself works as designed, insufficient understanding of its user-visible semantics can cause bugs in its dependencies. By focusing a formal specification effort on precisely defining the expected user-facing behaviors of the Azure Cosmos DB service at Microsoft, we were able to write a formal specification of the database that was significantly smaller and conceptually simpler than any other specification of Cosmos DB, while representing a wider range of valid user-observable behaviors than existing more detailed specifications. Many of the additional behaviors we documented were previously poorly understood outside of the Cosmos DB development team, even informally, leading to data consistency errors in Microsoft products that depend on it. Using this model, we were able to raise two key issues in Cosmos DB's public-facing documentation, which have since been addressed. We were also able to offer a fundamental solution to a previous high-impact outage within another Azure service that depends on Cosmos DB.
Submitted 24 October, 2022;
originally announced October 2022.
-
The TLA+ Toolbox
Authors:
Markus Alexander Kuppe,
Leslie Lamport,
Daniel Ricketts
Abstract:
We discuss the workflows supported by the TLA+ Toolbox to write and verify specifications. We focus on features that are useful in industry because its users are primarily engineers. Two features are novel in the scope of formal IDEs: CloudTLC connects the Toolbox with cloud computing to scale up model checking. A Profiler helps to debug inefficient expressions and to pinpoint the source of state space explosion. For those who wish to contribute to the Toolbox or learn from its flaws, we present its technical architecture.
Submitted 23 December, 2019;
originally announced December 2019.