-
State Merging with Quantifiers in Symbolic Execution
Authors:
David Trabish,
Noam Rinetzky,
Sharon Shoham,
Vaibhav Sharma
Abstract:
We address the problem of constraint encoding explosion which hinders the applicability of state merging in symbolic execution. Specifically, our goal is to reduce the number of disjunctions and if-then-else expressions introduced during state merging. The main idea is to dynamically partition the symbolic states into merging groups according to a similar uniform structure detected in their path c…
▽ More
We address the problem of constraint encoding explosion which hinders the applicability of state merging in symbolic execution. Specifically, our goal is to reduce the number of disjunctions and if-then-else expressions introduced during state merging. The main idea is to dynamically partition the symbolic states into merging groups according to a similar uniform structure detected in their path constraints, which allows to efficiently encode the merged path constraint and memory using quantifiers. To address the added complexity of solving quantified constraints, we propose a specialized solving procedure that reduces the solving time in many cases. Our evaluation shows that our approach can lead to significant performance gains.
△ Less
Submitted 24 August, 2023; v1 submitted 23 August, 2023;
originally announced August 2023.
-
Putting the Squeeze on Array Programs: Loop Verification via Inductive Rank Reduction
Authors:
Oren Ish Shalom,
Shachar Itzhaky,
Noam Rinetzky,
Sharon Shoham
Abstract:
Automatic verification of array manipulating programs is a challenging problem because it often amounts to the inference of in ductive quantified loop invariants which, in some cases, may not even be firstorder expressible. In this paper, we suggest a novel verification tech nique that is based on induction on userdefined rank of program states as an alternative to loopinvariants. Our technique, d…
▽ More
Automatic verification of array manipulating programs is a challenging problem because it often amounts to the inference of in ductive quantified loop invariants which, in some cases, may not even be firstorder expressible. In this paper, we suggest a novel verification tech nique that is based on induction on userdefined rank of program states as an alternative to loopinvariants. Our technique, dubbed inductive rank reduction, works in two steps. Firstly, we simplify the verification problem and prove that the program is correct when the input state con tains an input array of length B or less, using the length of the array as the rank of the state. Secondly, we employ a squeezing function g which converts a program state sigma with an array of length > B to a state g(sigma) containing an array of length minus 1 or less. We prove that when g satisfies certain natural conditions then if the program violates its specification on sigma then it does so also on g(sigma). The correctness of the program on inputs with arrays of arbitrary lengths follows by induction. We make our technique automatic for array programs whose length of execution is proportional to the length of the input arrays by (i) perform ing the first step using symbolic execution, (ii) verifying the conditions required of g using Z3, and (iii) providing a heuristic procedure for syn thesizing g. We implemented our technique and applied it successfully to several interesting arraymanipulating programs, including a bidirec tional summation program whose loop invariant cannot be expressed in firstorder logic while its specification is quantifier free.
△ Less
Submitted 2 June, 2021;
originally announced June 2021.
-
Proving Highly-Concurrent Traversals Correct
Authors:
Yotam M. Y. Feldman,
Artem Khyzha,
Constantin Enea,
Adam Morrison,
Aleksandar Nanevski,
Noam Rinetzky,
Sharon Shoham
Abstract:
Modern highly-concurrent search data structures, such as search trees, obtain multi-core scalability and performance by having operations traverse the data structure without any synchronization. As a result, however, these algorithms are notoriously difficult to prove linearizable, which requires identifying a point in time in which the traversal's result is correct. The problem is that traversing…
▽ More
Modern highly-concurrent search data structures, such as search trees, obtain multi-core scalability and performance by having operations traverse the data structure without any synchronization. As a result, however, these algorithms are notoriously difficult to prove linearizable, which requires identifying a point in time in which the traversal's result is correct. The problem is that traversing the data structure as it undergoes modifications leads to complex behaviors, necessitating intricate reasoning about all interleavings of reads by traversals and writes mutating the data structure.
In this paper, we present a general proof technique for proving unsynchronized traversals correct in a significantly simpler manner, compared to typical concurrent reasoning and prior proof techniques. Our framework relies only on sequential properties} of traversals and on a conceptually simple and widely-applicable condition about the ways an algorithm's writes mutate the data structure. Establishing that a target data structure satisfies our condition requires only simple concurrent reasoning, without considering interactions of writes and reads. This reasoning can be further simplified by using our framework.
To demonstrate our technique, we apply it to prove several interesting and challenging concurrent binary search trees: the logical-ordering AVL tree, the Citrus tree, and the full contention-friendly tree. Both the logical-ordering tree and the full contention-friendly tree are beyond the reach of previous approaches targeted at simplifying linearizability proofs.
△ Less
Submitted 10 January, 2024; v1 submitted 2 October, 2020;
originally announced October 2020.
-
A Thread-Local Semantics and Efficient Static Analyses for Race Free Programs
Authors:
Suvam Mukherjee,
Oded Padon,
Sharon Shoham,
Deepak D'Souza,
Noam Rinetzky
Abstract:
Data race free (DRF) programs constitute an important class of concurrent programs. In this paper we provide a framework for designing and proving the correctness of data flow analyses that target this class of programs. These analyses are in the same spirit as the "sync-CFG" analysis proposed in earlier literature. To achieve this, we first propose a novel concrete semantics for DRF programs, cal…
▽ More
Data race free (DRF) programs constitute an important class of concurrent programs. In this paper we provide a framework for designing and proving the correctness of data flow analyses that target this class of programs. These analyses are in the same spirit as the "sync-CFG" analysis proposed in earlier literature. To achieve this, we first propose a novel concrete semantics for DRF programs, called L-DRF, that is thread-local in nature---each thread operates on its own copy of the data state. We show that abstractions of our semantics allow us to reduce the analysis of DRF programs to a sequential analysis. This aids in rapidly porting existing sequential analyses to sound and scalable analyses for DRF programs. Next, we parameterize L-DRF with a partitioning of the program variables into "regions" which are accessed atomically. Abstractions of the region-parameterized semantics yield more precise analyses for "region-race" free concurrent programs. We instantiate these abstractions to devise efficient relational analyses for race free programs, which we have implemented in a prototype tool called RATCOP. On the benchmarks, RATCOP was able to prove up to 65% of the assertions, in comparison to 25% proved by our baseline. Moreover, in a comparative study with a recent concurrent static analyzer, RATCOP was up to 5 orders of magnitude faster.
△ Less
Submitted 6 September, 2020;
originally announced September 2020.
-
Hypothetical Reasoning via Provenance Abstraction
Authors:
Daniel Deutch,
Yuval Moskovitch,
Noam Rinetzky
Abstract:
Data analytics often involves hypothetical reasoning: repeatedly modifying the data and observing the induced effect on the computation result of a data-centric application. Previous work has shown that fine-grained data provenance can help make such an analysis more efficient: instead of a costly re-execution of the underlying application, hypothetical scenarios are applied to a pre-computed prov…
▽ More
Data analytics often involves hypothetical reasoning: repeatedly modifying the data and observing the induced effect on the computation result of a data-centric application. Previous work has shown that fine-grained data provenance can help make such an analysis more efficient: instead of a costly re-execution of the underlying application, hypothetical scenarios are applied to a pre-computed provenance expression. However, storing provenance for complex queries and large-scale data leads to a significant overhead, which is often a barrier to the incorporation of provenance-based solutions.
To this end, we present a framework that allows to reduce provenance size. Our approach is based on reducing the provenance granularity using user defined abstraction trees over the provenance variables; the granularity is based on the anticipated hypothetical scenarios. We formalize the tradeoff between provenance size and supported granularity of the hypothetical reasoning, and study the complexity of the resulting optimization problem, provide efficient algorithms for tractable cases and heuristics for others. We experimentally study the performance of our solution for various queries and abstraction trees. Our study shows that the algorithms generally lead to substantial speedup of hypothetical reasoning, with a reasonable loss of accuracy.
△ Less
Submitted 10 July, 2020;
originally announced July 2020.
-
COBRA: Compression via Abstraction of Provenance for Hypothetical Reasoning
Authors:
Daniel Deutch,
Yuval Moskovitch,
Noam Rinetzky
Abstract:
Data analytics often involves hypothetical reasoning: repeatedly modifying the data and observing the induced effect on the computation result of a data-centric application. Recent work has proposed to leverage ideas from data provenance tracking towards supporting efficient hypothetical reasoning: instead of a costly re-execution of the underlying application, one may assign values to a pre-compu…
▽ More
Data analytics often involves hypothetical reasoning: repeatedly modifying the data and observing the induced effect on the computation result of a data-centric application. Recent work has proposed to leverage ideas from data provenance tracking towards supporting efficient hypothetical reasoning: instead of a costly re-execution of the underlying application, one may assign values to a pre-computed provenance expression. A prime challenge in leveraging this approach for large-scale data and complex applications lies in the size of the provenance. To this end, we present a framework that allows to reduce provenance size. Our approach is based on reducing the provenance granularity using abstraction. We propose a demonstration of COBRA, a system that allows examine the effect of the provenance compression on the anticipated analysis results. We will demonstrate the usefulness of COBRA in the context of business data analysis.
△ Less
Submitted 10 July, 2020;
originally announced July 2020.
-
Order out of Chaos: Proving Linearizability Using Local Views
Authors:
Yotam M. Y. Feldman,
Constantin Enea,
Adam Morrison,
Noam Rinetzky,
Sharon Shoham
Abstract:
Proving the linearizability of highly concurrent data structures, such as those using optimistic concurrency control, is a challenging task. The main difficulty is in reasoning about the view of the memory obtained by the threads, because as they execute, threads observe different fragments of memory from different points in time. Until today, every linearizability proof has tackled this challenge…
▽ More
Proving the linearizability of highly concurrent data structures, such as those using optimistic concurrency control, is a challenging task. The main difficulty is in reasoning about the view of the memory obtained by the threads, because as they execute, threads observe different fragments of memory from different points in time. Until today, every linearizability proof has tackled this challenge from scratch.
We present a unifying proof argument for the correctness of unsynchronized traversals, and apply it to prove the linearizability of several highly concurrent search data structures, including an optimistic self-balancing binary search tree, the Lazy List and a lock-free skip list. Our framework harnesses {\em sequential reasoning} about the view of a thread, considering the thread as if it traverses the data structure without interference from other operations. Our key contribution is showing that properties of reachability along search paths can be deduced for concurrent traversals from such interference-free traversals, when certain intuitive conditions are met. Basing the correctness of traversals on such \emph{local view arguments} greatly simplifies linearizability proofs.
To apply our framework, the user proves that the data structure satisfies two conditions: (1) acyclicity of the order on memory, even when it is considered across intermediate memory states, and (2) preservation of search paths to locations modified by interfering writes. Establishing the conditions, as well as the full linearizability proof utilizing our proof argument, reduces to simple concurrent reasoning. The result is a clear and comprehensible correctness proof, and elucidates common patterns underlying several existing data structures.
△ Less
Submitted 5 August, 2018; v1 submitted 10 May, 2018;
originally announced May 2018.
-
Safe Privatization in Transactional Memory
Authors:
Artem Khyzha,
Hagit Attiya,
Alexey Gotsman,
Noam Rinetzky
Abstract:
Transactional memory (TM) facilitates the development of concurrent applications by letting the programmer designate certain code blocks as atomic. Programmers using a TM often would like to access the same data both inside and outside transactions, e.g., to improve performance or to support legacy code. In this case, programmers would ideally like the TM to guarantee strong atomicity, where trans…
▽ More
Transactional memory (TM) facilitates the development of concurrent applications by letting the programmer designate certain code blocks as atomic. Programmers using a TM often would like to access the same data both inside and outside transactions, e.g., to improve performance or to support legacy code. In this case, programmers would ideally like the TM to guarantee strong atomicity, where transactions can be viewed as executing atomically also with respect to non-transactional accesses. Since guaranteeing strong atomicity for arbitrary programs is prohibitively expensive, researchers have suggested guaranteeing it only for certain data-race free (DRF) programs, particularly those that follow the privatization idiom: from some point on, threads agree that a given object can be accessed non-transactionally. Supporting privatization safely in a TM is nontrivial, because this often requires correctly inserting transactional fences, which wait until all active transactions complete.
Unfortunately, there is currently no consensus on a single definition of transactional DRF, in particular, because no existing notion of DRF takes into account transactional fences. In this paper we propose such a notion and prove that, if a TM satisfies a certain condition generalizing opacity and a program using it is DRF assuming strong atomicity, then the program indeed has strongly atomic semantics. We show that our DRF notion allows the programmer to use privatization idioms. We also propose a method for proving our generalization of opacity and apply it to the TL2 TM.
△ Less
Submitted 12 January, 2018;
originally announced January 2018.
-
Online Detection of Effectively Callback Free Objects with Applications to Smart Contracts
Authors:
Shelly Grossman,
Ittai Abraham,
Guy Golan-Gueta,
Yan Michalevsky,
Noam Rinetzky,
Mooly Sagiv,
Yoni Zohar
Abstract:
Callbacks are essential in many programming environments, but drastically complicate program understanding and reasoning because they allow to mutate object's local states by external objects in unexpected fashions, thus breaking modularity. The famous DAO bug in the cryptocurrency framework Ethereum, employed callbacks to steal $150M. We define the notion of Effectively Callback Free (ECF) object…
▽ More
Callbacks are essential in many programming environments, but drastically complicate program understanding and reasoning because they allow to mutate object's local states by external objects in unexpected fashions, thus breaking modularity. The famous DAO bug in the cryptocurrency framework Ethereum, employed callbacks to steal $150M. We define the notion of Effectively Callback Free (ECF) objects in order to allow callbacks without preventing modular reasoning.
An object is ECF in a given execution trace if there exists an equivalent execution trace without callbacks to this object. An object is ECF if it is ECF in every possible execution trace. We study the decidability of dynamically checking ECF in a given execution trace and statically checking if an object is ECF. We also show that dynamically checking ECF in Ethereum is feasible and can be done online. By running the history of all execution traces in Ethereum, we were able to verify that virtually all existing contracts, excluding the DAO or contracts with similar known vulnerabilities, are ECF. Finally, we show that ECF, whether it is verified dynamically or statically, enables modular reasoning about objects with encapsulated state.
△ Less
Submitted 11 January, 2018;
originally announced January 2018.
-
On the automated verification of web applications with embedded SQL
Authors:
Shachar Itzhaky,
Tomer Kotek,
Noam Rinetzky,
Mooly Sagiv,
Orr Tamir,
Helmut Veith,
Florian Zuleger
Abstract:
A large number of web applications is based on a relational database together with a program, typically a script, that enables the user to interact with the database through embedded SQL queries and commands. In this paper, we introduce a method for formal automated verification of such systems which connects database theory to mainstream program analysis. We identify a fragment of SQL which captu…
▽ More
A large number of web applications is based on a relational database together with a program, typically a script, that enables the user to interact with the database through embedded SQL queries and commands. In this paper, we introduce a method for formal automated verification of such systems which connects database theory to mainstream program analysis. We identify a fragment of SQL which captures the behavior of the queries in our case studies, is algorithmically decidable, and facilitates the construction of weakest preconditions. Thus, we can integrate the analysis of SQL queries into a program analysis tool chain. To this end, we implement a new decision procedure for the SQL fragment that we introduce. We demonstrate practical applicability of our results with three case studies, a web administrator, a simple firewall, and a conference management system.
△ Less
Submitted 6 October, 2016;
originally announced October 2016.