-
Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs
Authors:
Kwing Hei Li,
Alejandro Aguirre,
Simon Oddershede Gregersen,
Philipp G. Haselwarter,
Joseph Tassarotti,
Lars Birkedal
Abstract:
We present Coneris, the first higher-order concurrent separation logic for reasoning about error probability bounds of higher-order concurrent probabilistic programs with higher-order state. To support modular reasoning about concurrent (non-probabilistic) program modules, state-of-the-art program logics internalize the classic notion of linearizability within the logic through the concept of logi…
▽ More
We present Coneris, the first higher-order concurrent separation logic for reasoning about error probability bounds of higher-order concurrent probabilistic programs with higher-order state. To support modular reasoning about concurrent (non-probabilistic) program modules, state-of-the-art program logics internalize the classic notion of linearizability within the logic through the concept of logical atomicity.
Coneris extends this idea to probabilistic concurrent program modules. Thus Coneris supports modular reasoning about probabilistic concurrent modules by capturing a novel notion of randomized logical atomicity within the logic. To do so, Coneris utilizes presampling tapes and a novel probabilistic update modality to describe how state is changed probabilistically at linearization points. We demonstrate this approach by means of smaller synthetic examples and larger case studies.
All of the presented results, including the meta-theory, have been mechanized in the Rocq proof assistant and the Iris separation logic framework
△ Less
Submitted 6 March, 2025;
originally announced March 2025.
-
Logical Relations for Formally Verified Authenticated Data Structures
Authors:
Simon Oddershede Gregersen,
Chaitanya Agarwal,
Joseph Tassarotti
Abstract:
Authenticated data structures allow untrusted third parties to carry out operations which produce proofs that can be used to verify an operation's output. Such data structures are challenging to develop and implement correctly. This paper gives a formal proof of security and correctness for a library that generates authenticated versions of data structures automatically. The proof is based on a ne…
▽ More
Authenticated data structures allow untrusted third parties to carry out operations which produce proofs that can be used to verify an operation's output. Such data structures are challenging to develop and implement correctly. This paper gives a formal proof of security and correctness for a library that generates authenticated versions of data structures automatically. The proof is based on a new relational separation logic for reasoning about programs that use collision-resistant cryptographic hash functions. This logic provides a basis for constructing two semantic models of a type system, which are used to justify how the library makes use of type abstraction to enforce security and correctness. Using these models, we also prove the correctness of several optimizations to the library and then show how optimized, hand-written implementations of authenticated data structures can be soundly linked with automatically generated code. All of the results in this paper have been mechanized in the Coq proof assistant using the Iris framework.
△ Less
Submitted 18 January, 2025;
originally announced January 2025.
-
Verified Foundations for Differential Privacy
Authors:
Markus de Medeiros,
Muhammad Naveed,
Tancrède Lepoint,
Temesghen Kahsai,
Tristan Ravitch,
Stefan Zetzsche,
Anjali Joshi,
Joseph Tassarotti,
Aws Albarghouthi,
Jean-Baptiste Tristan
Abstract:
Differential privacy (DP) has become the gold standard for privacy-preserving data analysis, but implementing it correctly has proven challenging. Prior work has focused on verifying DP at a high level, assuming the foundations are correct and a perfect source of randomness is available. However, the underlying theory of differential privacy can be very complex and subtle. Flaws in basic mechanism…
▽ More
Differential privacy (DP) has become the gold standard for privacy-preserving data analysis, but implementing it correctly has proven challenging. Prior work has focused on verifying DP at a high level, assuming the foundations are correct and a perfect source of randomness is available. However, the underlying theory of differential privacy can be very complex and subtle. Flaws in basic mechanisms and random number generation have been a critical source of vulnerabilities in real-world DP systems.
In this paper, we present SampCert, the first comprehensive, mechanized foundation for differential privacy. SampCert is written in Lean with over 12,000 lines of proof. It offers a generic and extensible notion of DP, a framework for constructing and composing DP mechanisms, and formally verified implementations of Laplace and Gaussian sampling algorithms. SampCert provides (1) a mechanized foundation for developing the next generation of differentially private algorithms, and (2) mechanically verified primitives that can be deployed in production systems. Indeed, SampCert's verified algorithms power the DP offerings of Amazon Web Services (AWS), demonstrating its real-world impact.
SampCert's key innovations include: (1) A generic DP foundation that can be instantiated for various DP definitions (e.g., pure, concentrated, Rényi DP); (2) formally verified discrete Laplace and Gaussian sampling algorithms that avoid the pitfalls of floating-point implementations; and (3) a simple probability monad and novel proof techniques that streamline the formalization. To enable proving complex correctness properties of DP and random number generation, SampCert makes heavy use of Lean's extensive Mathlib library, leveraging theorems in Fourier analysis, measure and probability theory, number theory, and topology.
△ Less
Submitted 29 April, 2025; v1 submitted 2 December, 2024;
originally announced December 2024.
-
Probabilistic Concurrent Reasoning in Outcome Logic: Independence, Conditioning, and Invariants
Authors:
Noam Zilberstein,
Alexandra Silva,
Joseph Tassarotti
Abstract:
Although randomization has long been used in concurrent programs, formal methods for reasoning about this mixture of effects have lagged behind. In particular, no existing program logics can express specifications about the distributions of outcomes resulting from programs that are both probabilistic and concurrent. To address this, we introduce Probabilistic Concurrent Outcome Logic, which incorp…
▽ More
Although randomization has long been used in concurrent programs, formal methods for reasoning about this mixture of effects have lagged behind. In particular, no existing program logics can express specifications about the distributions of outcomes resulting from programs that are both probabilistic and concurrent. To address this, we introduce Probabilistic Concurrent Outcome Logic, which incorporates ideas from concurrent and probabilistic separation logics into Outcome Logic to introduce new compositional reasoning principles.
△ Less
Submitted 18 November, 2024;
originally announced November 2024.
-
A Demonic Outcome Logic for Randomized Nondeterminism
Authors:
Noam Zilberstein,
Dexter Kozen,
Alexandra Silva,
Joseph Tassarotti
Abstract:
Programs increasingly rely on randomization in applications such as cryptography and machine learning. Analyzing randomized programs has been a fruitful research direction, but there is a gap when programs also exploit nondeterminism (for concurrency, efficiency, or algorithmic design). In this paper, we introduce Demonic Outcome Logic for reasoning about programs that exploit both randomization a…
▽ More
Programs increasingly rely on randomization in applications such as cryptography and machine learning. Analyzing randomized programs has been a fruitful research direction, but there is a gap when programs also exploit nondeterminism (for concurrency, efficiency, or algorithmic design). In this paper, we introduce Demonic Outcome Logic for reasoning about programs that exploit both randomization and nondeterminism. The logic includes several novel features, such as reasoning about multiple executions in tandem and manipulating pre- and postconditions using familiar equational laws -- including the distributive law of probabilistic choices over nondeterministic ones. We also give rules for loops that both establish termination and quantify the distribution of final outcomes from a single premise. We illustrate the reasoning capabilities of Demonic Outcome Logic through several case studies, including the Monty Hall problem, an adversarial protocol for simulating fair coins, and a heuristic based probabilistic SAT solver.
△ Less
Submitted 20 November, 2024; v1 submitted 29 October, 2024;
originally announced October 2024.
-
Approximate Relational Reasoning for Higher-Order Probabilistic Programs
Authors:
Philipp G. Haselwarter,
Kwing Hei Li,
Alejandro Aguirre,
Simon Oddershede Gregersen,
Joseph Tassarotti,
Lars Birkedal
Abstract:
Properties such as provable security and correctness for randomized programs are naturally expressed relationally as approximate equivalences. As a result, a number of relational program logics have been developed to reason about such approximate equivalences of probabilistic programs. However, existing approximate relational logics are mostly restricted to first-order programs without general sta…
▽ More
Properties such as provable security and correctness for randomized programs are naturally expressed relationally as approximate equivalences. As a result, a number of relational program logics have been developed to reason about such approximate equivalences of probabilistic programs. However, existing approximate relational logics are mostly restricted to first-order programs without general state.
In this paper we develop Approxis, a higher-order approximate relational separation logic for reasoning about approximate equivalence of programs written in an expressive ML-like language with discrete probabilistic sampling, higher-order functions, and higher-order state. The Approxis logic recasts the concept of error credits in the relational setting to reason about relational approximation, which allows for expressive notions of modularity and composition, a range of new approximate relational rules, and an internalization of a standard limiting argument for showing exact probabilistic equivalences by approximation. We also use Approxis to develop a logical relation model that quantifies over error credits, which can be used to prove exact contextual equivalence. We demonstrate the flexibility of our approach on a range of examples, including the PRP/PRF switching lemma, IND\$-CPA security of an encryption scheme, and a collection of rejection samplers. All of the results have been mechanized in the Coq proof assistant and the Iris separation logic framework.
△ Less
Submitted 3 December, 2024; v1 submitted 19 July, 2024;
originally announced July 2024.
-
Tachis: Higher-Order Separation Logic with Credits for Expected Costs
Authors:
Philipp G. Haselwarter,
Kwing Hei Li,
Markus de Medeiros,
Simon Oddershede Gregersen,
Alejandro Aguirre,
Joseph Tassarotti,
Lars Birkedal
Abstract:
We present Tachis, a higher-order separation logic to reason about the expected cost of probabilistic programs. Inspired by the uses of time credits for reasoning about the running time of deterministic programs, we introduce a novel notion of probabilistic cost credit. Probabilistic cost credits are a separation logic resource that can be used to pay for the cost of operations in programs, and th…
▽ More
We present Tachis, a higher-order separation logic to reason about the expected cost of probabilistic programs. Inspired by the uses of time credits for reasoning about the running time of deterministic programs, we introduce a novel notion of probabilistic cost credit. Probabilistic cost credits are a separation logic resource that can be used to pay for the cost of operations in programs, and that can be distributed across all possible branches of sampling instructions according to their weight, thus enabling us to reason about expected cost. The representation of cost credits as separation logic resources gives Tachis a great deal of flexibility and expressivity. In particular, it permits reasoning about amortized expected cost by storing excess credits as potential into data structures to pay for future operations. Tachis further supports a range of cost models, including running time and entropy usage. We showcase the versatility of this approach by applying our techniques to prove upper bounds on the expected cost of a variety of probabilistic algorithms and data structures, including randomized quicksort, hash tables, and meldable heaps.
All of our results have been mechanized using Coq, Iris, and the Coquelicot real analysis library.
△ Less
Submitted 12 November, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
Error Credits: Resourceful Reasoning about Error Bounds for Higher-Order Probabilistic Programs
Authors:
Alejandro Aguirre,
Philipp G. Haselwarter,
Markus de Medeiros,
Kwing Hei Li,
Simon Oddershede Gregersen,
Joseph Tassarotti,
Lars Birkedal
Abstract:
Probabilistic programs often trade accuracy for efficiency, and thus may, with a small probability, return an incorrect result. It is important to obtain precise bounds for the probability of these errors, but existing verification approaches have limitations that lead to error probability bounds that are excessively coarse, or only apply to first-order programs. In this paper we present Eris, a h…
▽ More
Probabilistic programs often trade accuracy for efficiency, and thus may, with a small probability, return an incorrect result. It is important to obtain precise bounds for the probability of these errors, but existing verification approaches have limitations that lead to error probability bounds that are excessively coarse, or only apply to first-order programs. In this paper we present Eris, a higher-order separation logic for proving error probability bounds for probabilistic programs written in an expressive higher-order language. Our key novelty is the introduction of error credits, a separation logic resource that tracks an upper bound on the probability that a program returns an erroneous result. By representing error bounds as a resource, we recover the benefits of separation logic, including compositionality, modularity, and dependency between errors and program terms, allowing for more precise specifications. Moreover, we enable novel reasoning principles such as expectation-preserving error composition, amortized error reasoning, and error induction. We illustrate the advantages of our approach by proving amortized error bounds on a range of examples, including collision probabilities in hash functions, which allow us to write more modular specifications for data structures that use them as clients. We also use our logic to prove correctness and almost-sure termination of rejection sampling algorithms. All of our results have been mechanized in the Coq proof assistant using the Iris separation logic framework and the Coquelicot real analysis library.
△ Less
Submitted 29 November, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
Almost-Sure Termination by Guarded Refinement
Authors:
Simon Oddershede Gregersen,
Alejandro Aguirre,
Philipp G. Haselwarter,
Joseph Tassarotti,
Lars Birkedal
Abstract:
Almost-sure termination is an important correctness property for probabilistic programs, and a number of program logics have been developed for establishing it. However, these logics have mostly been developed for first-order programs written in languages with specific syntactic patterns for looping. In this paper, we consider almost-sure termination for higher-order probabilistic programs with ge…
▽ More
Almost-sure termination is an important correctness property for probabilistic programs, and a number of program logics have been developed for establishing it. However, these logics have mostly been developed for first-order programs written in languages with specific syntactic patterns for looping. In this paper, we consider almost-sure termination for higher-order probabilistic programs with general references. This combination of features allows for recursion and looping to be encoded through a variety of patterns. Therefore, rather than developing proof rules for reasoning about particular recursion patterns, we instead propose an approach based on proving refinement between a higher-order program and a simpler probabilistic model, in such a way that the refinement preserves termination behavior. By proving a refinement, almost-sure termination behavior of the program can then be established by analyzing the simpler model. We present this approach in the form of Caliper, a higher-order separation logic for proving termination-preserving refinements. Caliper uses probabilistic couplings to carry out relational reasoning between a program and a model. To handle the range of recursion patterns found in higher-order programs, Caliper uses guarded recursion, in particular the principle of Löb induction. A technical novelty is that Caliper does not require the use of transfinite step indexing or other technical restrictions found in prior work on guarded recursion for termination-preservation refinement. We demonstrate the flexibility of this approach by proving almost-sure termination of several examples, including first-order loop constructs, a random list generator, treaps, and a sampler for Galton-Watson trees that uses higher-order store. All the results have been mechanized in the Coq proof assistant.
△ Less
Submitted 11 November, 2024; v1 submitted 12 April, 2024;
originally announced April 2024.
-
Grove: a Separation-Logic Library for Verifying Distributed Systems (Extended Version)
Authors:
Upamanyu Sharma,
Ralf Jung,
Joseph Tassarotti,
M. Frans Kaashoek,
Nickolai Zeldovich
Abstract:
Grove is a concurrent separation logic library for verifying distributed systems. Grove is the first to handle time-based leases, including their interaction with reconfiguration, crash recovery, thread-level concurrency, and unreliable networks. This paper uses Grove to verify several distributed system components written in Go, including GroveKV, a realistic distributed multi-threaded key-value…
▽ More
Grove is a concurrent separation logic library for verifying distributed systems. Grove is the first to handle time-based leases, including their interaction with reconfiguration, crash recovery, thread-level concurrency, and unreliable networks. This paper uses Grove to verify several distributed system components written in Go, including GroveKV, a realistic distributed multi-threaded key-value store. GroveKV supports reconfiguration, primary/backup replication, and crash recovery, and uses leases to execute read-only requests on any replica. GroveKV achieves high performance (67-73% of Redis on a single core), scales with more cores and more backup replicas (achieving about 2x the throughput when going from 1 to 3 servers), and can safely execute reads while reconfiguring.
△ Less
Submitted 14 September, 2023; v1 submitted 6 September, 2023;
originally announced September 2023.
-
Asynchronous Probabilistic Couplings in Higher-Order Separation Logic
Authors:
Simon Oddershede Gregersen,
Alejandro Aguirre,
Philipp G. Haselwarter,
Joseph Tassarotti,
Lars Birkedal
Abstract:
Probabilistic couplings are the foundation for many probabilistic relational program logics and arise when relating random sampling statements across two programs. In relational program logics, this manifests as dedicated coupling rules that, e.g., say we may reason as if two sampling statements return the same value. However, this approach fundamentally requires aligning or "synchronizing" the sa…
▽ More
Probabilistic couplings are the foundation for many probabilistic relational program logics and arise when relating random sampling statements across two programs. In relational program logics, this manifests as dedicated coupling rules that, e.g., say we may reason as if two sampling statements return the same value. However, this approach fundamentally requires aligning or "synchronizing" the sampling statements of the two programs which is not always possible.
In this paper, we develop Clutch, a higher-order probabilistic relational separation logic that addresses this issue by supporting asynchronous probabilistic couplings. We use Clutch to develop a logical step-indexed logical relational to reason about contextual refinement and equivalence of higher-order programs written in a rich language with higher-order local state and impredicative polymorphism. Finally, we demonstrate the usefulness of our approach on a number of case studies.
All the results that appear in the paper have been formalized in the Coq proof assistant using the Coquelicot library and the Iris separation logic framework.
△ Less
Submitted 14 November, 2023; v1 submitted 24 January, 2023;
originally announced January 2023.
-
A Separation Logic for Negative Dependence
Authors:
Jialu Bao,
Marco Gaboardi,
Justin Hsu,
Joseph Tassarotti
Abstract:
Formal reasoning about hashing-based probabilistic data structures often requires reasoning about random variables where when one variable gets larger (such as the number of elements hashed into one bucket), the others tend to be smaller (like the number of elements hashed into the other buckets). This is an example of negative dependence, a generalization of probabilistic independence that has re…
▽ More
Formal reasoning about hashing-based probabilistic data structures often requires reasoning about random variables where when one variable gets larger (such as the number of elements hashed into one bucket), the others tend to be smaller (like the number of elements hashed into the other buckets). This is an example of negative dependence, a generalization of probabilistic independence that has recently found interesting applications in algorithm design and machine learning. Despite the usefulness of negative dependence for the analyses of probabilistic data structures, existing verification methods cannot establish this property for randomized programs.
To fill this gap, we design LINA, a probabilistic separation logic for reasoning about negative dependence. Following recent works on probabilistic separation logic using separating conjunction to reason about the probabilistic independence of random variables, we use separating conjunction to reason about negative dependence. Our assertion logic features two separating conjunctions, one for independence and one for negative dependence. We generalize the logic of bunched implications (BI) to support multiple separating conjunctions, and provide a sound and complete proof system. Notably, the semantics for separating conjunction relies on a non-deterministic, rather than partial, operation for combining resources. By drawing on closure properties for negative dependence, our program logic supports a Frame-like rule for negative dependence and monotone operations. We demonstrate how LINA can verify probabilistic properties of hash-based data structures and balls-into-bins processes.
△ Less
Submitted 29 November, 2021;
originally announced November 2021.
-
Rabia: Simplifying State-Machine Replication Through Randomization
Authors:
Haochen Pan,
Jesse Tuglu,
Neo Zhou,
Tianshu Wang,
Yicheng Shen,
Xiong Zheng,
Joseph Tassarotti,
Lewis Tseng,
Roberto Palmieri
Abstract:
We introduce Rabia, a simple and high performance framework for implementing state-machine replication (SMR) within a datacenter. The main innovation of Rabia is in using randomization to simplify the design. Rabia provides the following two features: (i) It does not need any fail-over protocol and supports trivial auxiliary protocols like log compaction, snapshotting, and reconfiguration, compone…
▽ More
We introduce Rabia, a simple and high performance framework for implementing state-machine replication (SMR) within a datacenter. The main innovation of Rabia is in using randomization to simplify the design. Rabia provides the following two features: (i) It does not need any fail-over protocol and supports trivial auxiliary protocols like log compaction, snapshotting, and reconfiguration, components that are often considered the most challenging when developing SMR systems; and (ii) It provides high performance, up to 1.5x higher throughput than the closest competitor (i.e., EPaxos) in a favorable setup (same availability zone with three replicas) and is comparable with a larger number of replicas or when deployed in multiple availability zones.
△ Less
Submitted 26 September, 2021;
originally announced September 2021.
-
Verification of ML Systems via Reparameterization
Authors:
Jean-Baptiste Tristan,
Joseph Tassarotti,
Koundinya Vajjha,
Michael L. Wick,
Anindya Banerjee
Abstract:
As machine learning is increasingly used in essential systems, it is important to reduce or eliminate the incidence of serious bugs. A growing body of research has developed machine learning algorithms with formal guarantees about performance, robustness, or fairness. Yet, the analysis of these algorithms is often complex, and implementing such systems in practice introduces room for error. Proof…
▽ More
As machine learning is increasingly used in essential systems, it is important to reduce or eliminate the incidence of serious bugs. A growing body of research has developed machine learning algorithms with formal guarantees about performance, robustness, or fairness. Yet, the analysis of these algorithms is often complex, and implementing such systems in practice introduces room for error. Proof assistants can be used to formally verify machine learning systems by constructing machine checked proofs of correctness that rule out such bugs. However, reasoning about probabilistic claims inside of a proof assistant remains challenging. We show how a probabilistic program can be automatically represented in a theorem prover using the concept of \emph{reparameterization}, and how some of the tedious proofs of measurability can be generated automatically from the probabilistic program. To demonstrate that this approach is broad enough to handle rather different types of machine learning systems, we verify both a classic result from statistical learning theory (PAC-learnability of decision stumps) and prove that the null model used in a Bayesian hypothesis test satisfies a fairness criterion called demographic parity.
△ Less
Submitted 13 July, 2020;
originally announced July 2020.
-
A Formal Proof of PAC Learnability for Decision Stumps
Authors:
Joseph Tassarotti,
Koundinya Vajjha,
Anindya Banerjee,
Jean-Baptiste Tristan
Abstract:
We present a formal proof in Lean of probably approximately correct (PAC) learnability of the concept class of decision stumps. This classic result in machine learning theory derives a bound on error probabilities for a simple type of classifier. Though such a proof appears simple on paper, analytic and measure-theoretic subtleties arise when carrying it out fully formally. Our proof is structured…
▽ More
We present a formal proof in Lean of probably approximately correct (PAC) learnability of the concept class of decision stumps. This classic result in machine learning theory derives a bound on error probabilities for a simple type of classifier. Though such a proof appears simple on paper, analytic and measure-theoretic subtleties arise when carrying it out fully formally. Our proof is structured so as to separate reasoning about deterministic properties of a learning function from proofs of measurability and analysis of probabilities.
△ Less
Submitted 7 January, 2021; v1 submitted 1 November, 2019;
originally announced November 2019.
-
Sketching for Latent Dirichlet-Categorical Models
Authors:
Joseph Tassarotti,
Jean-Baptiste Tristan,
Michael Wick
Abstract:
Recent work has explored transforming data sets into smaller, approximate summaries in order to scale Bayesian inference. We examine a related problem in which the parameters of a Bayesian model are very large and expensive to store in memory, and propose more compact representations of parameter values that can be used during inference. We focus on a class of graphical models that we refer to as…
▽ More
Recent work has explored transforming data sets into smaller, approximate summaries in order to scale Bayesian inference. We examine a related problem in which the parameters of a Bayesian model are very large and expensive to store in memory, and propose more compact representations of parameter values that can be used during inference. We focus on a class of graphical models that we refer to as latent Dirichlet-Categorical models, and show how a combination of two sketching algorithms known as count-min sketch and approximate counters provide an efficient representation for them. We show that this sketch combination -- which, despite having been used before in NLP applications, has not been previously analyzed -- enjoys desirable properties. We prove that for this class of models, when the sketches are used during Markov Chain Monte Carlo inference, the equilibrium of sketched MCMC converges to that of the exact chain as sketch parameters are tuned to reduce the error rate.
△ Less
Submitted 2 October, 2018;
originally announced October 2018.
-
A Separation Logic for Concurrent Randomized Programs
Authors:
Joseph Tassarotti,
Robert Harper
Abstract:
We present Polaris, a concurrent separation logic with support for probabilistic reasoning. As part of our logic, we extend the idea of coupling, which underlies recent work on probabilistic relational logics, to the setting of programs with both probabilistic and non-deterministic choice. To demonstrate Polaris, we verify a variant of a randomized concurrent counter algorithm and a two-level conc…
▽ More
We present Polaris, a concurrent separation logic with support for probabilistic reasoning. As part of our logic, we extend the idea of coupling, which underlies recent work on probabilistic relational logics, to the setting of programs with both probabilistic and non-deterministic choice. To demonstrate Polaris, we verify a variant of a randomized concurrent counter algorithm and a two-level concurrent skip list. All of our results have been mechanized in Coq.
△ Less
Submitted 21 November, 2018; v1 submitted 8 February, 2018;
originally announced February 2018.
-
Probabilistic Recurrence Relations for Work and Span of Parallel Algorithms
Authors:
Joseph Tassarotti
Abstract:
In this paper we present a method for obtaining tail-bounds for random variables satisfying certain probabilistic recurrences that arise in the analysis of randomized parallel divide and conquer algorithms. In such algorithms, some computation is initially done to process an input x, which is then randomly split into subproblems $h_1(x), ..., h_n(x)$, and the algorithm proceeds recursively in para…
▽ More
In this paper we present a method for obtaining tail-bounds for random variables satisfying certain probabilistic recurrences that arise in the analysis of randomized parallel divide and conquer algorithms. In such algorithms, some computation is initially done to process an input x, which is then randomly split into subproblems $h_1(x), ..., h_n(x)$, and the algorithm proceeds recursively in parallel on each subproblem. The total work on input x, W(x), then satisfies a probabilistic recurrence of the form $W(x) = a(x) + \sum_{i=1}^n W (h_i(x))$, and the span (the longest chain of sequential dependencies), satisfies $S(x) = b(x) + \max_{i=1}^n S(h_i(x))$, where a(x) and b(x) are the work and span to split x and combine the results of the recursive calls.
Karp has previously presented methods for obtaining tail-bounds in the case when n = 1, and under certain stronger assumptions for the work-recurrence when n > 1, but left open the question of the span-recurrence. We first show how to extend his technique to handle the span-recurrence. We then show that in some cases, the work-recurrence can be bounded under simpler assumptions than Karp's by transforming it into a related span-recurrence and applying our first result. We demonstrate our results by deriving tail bounds for the work and span of quicksort and the height of a randomly generated binary search tree.
△ Less
Submitted 6 April, 2017;
originally announced April 2017.
-
A Higher-Order Logic for Concurrent Termination-Preserving Refinement
Authors:
Joseph Tassarotti,
Ralf Jung,
Robert Harper
Abstract:
Compiler correctness proofs for higher-order concurrent languages are difficult: they involve establishing a termination-preserving refinement between a concurrent high-level source language and an implementation that uses low-level shared memory primitives. However, existing logics for proving concurrent refinement either neglect properties such as termination, or only handle first-order state. I…
▽ More
Compiler correctness proofs for higher-order concurrent languages are difficult: they involve establishing a termination-preserving refinement between a concurrent high-level source language and an implementation that uses low-level shared memory primitives. However, existing logics for proving concurrent refinement either neglect properties such as termination, or only handle first-order state. In this paper, we address these limitations by extending Iris, a recent higher-order concurrent separation logic, with support for reasoning about termination-preserving refinements. To demonstrate the power of these extensions, we prove the correctness of an efficient implementation of a higher-order, session-typed language. To our knowledge, this is the first program logic capable of giving a compiler correctness proof for such a language. The soundness of our extensions and our compiler correctness proof have been mechanized in Coq.
△ Less
Submitted 20 January, 2017;
originally announced January 2017.
-
Augur: a Modeling Language for Data-Parallel Probabilistic Inference
Authors:
Jean-Baptiste Tristan,
Daniel Huang,
Joseph Tassarotti,
Adam Pocock,
Stephen J. Green,
Guy L. Steele Jr
Abstract:
It is time-consuming and error-prone to implement inference procedures for each new probabilistic model. Probabilistic programming addresses this problem by allowing a user to specify the model and having a compiler automatically generate an inference procedure for it. For this approach to be practical, it is important to generate inference code that has reasonable performance. In this paper, we p…
▽ More
It is time-consuming and error-prone to implement inference procedures for each new probabilistic model. Probabilistic programming addresses this problem by allowing a user to specify the model and having a compiler automatically generate an inference procedure for it. For this approach to be practical, it is important to generate inference code that has reasonable performance. In this paper, we present a probabilistic programming language and compiler for Bayesian networks designed to make effective use of data-parallel architectures such as GPUs. Our language is fully integrated within the Scala programming language and benefits from tools such as IDE support, type-checking, and code completion. We show that the compiler can generate data-parallel inference code scalable to thousands of GPU cores by making use of the conditional independence relationships in the Bayesian network.
△ Less
Submitted 10 June, 2014; v1 submitted 12 December, 2013;
originally announced December 2013.