Search | arXiv e-print repository

Automatic Inference of Relational Object Invariants

Authors: Yusen Su, Jorge A. Navas, Arie Gurfinkel, Isabel Garcia-Contreras

Abstract: Relational object invariants (or representation invariants) are relational properties held by the fields of a (memory) object throughout its lifetime. For example, the length of a buffer never exceeds its capacity. Automatic inference of these invariants is particularly challenging because they are often broken temporarily during field updates. In this paper, we present an Abstract Interpretation-… ▽ More Relational object invariants (or representation invariants) are relational properties held by the fields of a (memory) object throughout its lifetime. For example, the length of a buffer never exceeds its capacity. Automatic inference of these invariants is particularly challenging because they are often broken temporarily during field updates. In this paper, we present an Abstract Interpretation-based solution to infer object invariants. Our key insight is a new object abstraction for memory objects, where memory is divided into multiple memory banks, each containing several objects. Within each bank, the objects are further abstracted by separating the most recently used (MRU) object, represented precisely with strong updates, while the rest are summarized. For an effective implementation of this approach, we introduce a new composite abstract domain, which forms a reduced product of numerical and equality sub-domains. This design efficiently expresses relationships between a small number of variables (e.g., fields of the same abstract object). We implement the new domain in the CRAB abstract interpreter and evaluate it on several benchmarks for memory safety. We show that our approach is significantly more scalable for relational properties than the existing implementation of CRAB. For evaluating precision, we have integrated our analysis as a pre-processing step to SEABMC bounded model checker, and show that it is effective at both discharging assertions during pre-processing, and significantly improving the run-time of SEABMC. △ Less

Submitted 22 November, 2024; originally announced November 2024.

Comments: This is an extended version of the VMCAI 2025 paper, consisting of 26 pages. The artifact is available at https://doi.org/10.5281/zenodo.13849174

arXiv:2309.09100 [pdf, other]

Btor2MLIR: A Format and Toolchain for Hardware Verification

Authors: Joseph Tafese, Isabel Garcia-Contreras, Arie Gurfinkel

Abstract: Formats for representing and manipulating verification problems are extremely important for supporting the ecosystem of tools, developers, and practitioners. A good format allows representing many different types of problems, has a strong toolchain for manipulating and translating problems, and can grow with the community. In the world of hardware verification, and, specifically, the Hardware Mode… ▽ More Formats for representing and manipulating verification problems are extremely important for supporting the ecosystem of tools, developers, and practitioners. A good format allows representing many different types of problems, has a strong toolchain for manipulating and translating problems, and can grow with the community. In the world of hardware verification, and, specifically, the Hardware Model Checking Competition (HWMCC), the Btor2 format has emerged as the dominating format. It is supported by Btor2Tools, verification tools, and Verilog design tools like Yosys. In this paper, we present an alternative format and toolchain, called Btor2MLIR, based on the recent MLIR framework. The advantage of Btor2MLIR is in reusing existing components from a mature compiler infrastructure, including parsers, text and binary formats, converters to a variety of intermediate representations, and executable semantics of LLVM. We hope that the format and our tooling will lead to rapid prototyping of verification and related tools for hardware verification. △ Less

Submitted 16 September, 2023; originally announced September 2023.

Comments: Formal Methods in Computer-Aided Design 2023

arXiv:2306.17765 [pdf, ps, other]

Speculative SAT Modulo SAT

Authors: Hari Govind V K, Isabel Garcia-Contreras, Sharon Shoham, Arie Gurfinkel

Abstract: State-of-the-art model-checking algorithms like IC3/PDR are based on uni-directional modular SAT solving for finding and/or blocking counterexamples. Modular SAT solvers divide a SAT-query into multiple sub-queries, each solved by a separate SAT solver (called a module), and propagate information (lemmas, proof obligations, blocked clauses, etc.) between modules. While modular solving is key to IC… ▽ More State-of-the-art model-checking algorithms like IC3/PDR are based on uni-directional modular SAT solving for finding and/or blocking counterexamples. Modular SAT solvers divide a SAT-query into multiple sub-queries, each solved by a separate SAT solver (called a module), and propagate information (lemmas, proof obligations, blocked clauses, etc.) between modules. While modular solving is key to IC3/PDR, it is obviously not as effective as monolithic solving, especially when individual sub-queries are harder to solve than the combined query. This is partially addressed in SAT modulo SAT (SMS) by propagating unit literals back and forth between the modules and using information from one module to simplify the sub-query in another module as soon as possible (i.e., before the satisfiability of any sub-query is established). However, bi-directionality of SMS is limited because of the strict order between decisions and propagation -- only one module is allowed to make decisions, until its sub-query is SAT. In this paper, we propose a generalization of SMS, called SPEC SMS, that speculates decisions between modules. This makes it bi-directional -- decisions are made in multiple modules, and learned clauses are exchanged in both directions. We further extend DRUP proofs and interpolation, these are useful in model checking, to SPEC SMS. We have implemented SPEC SMS in Z3 and show that it performs exponentially better on a series of benchmarks that are provably hard for SMS. △ Less

Submitted 30 June, 2023; originally announced June 2023.

arXiv:2306.10009 [pdf, ps, other]

Fast Approximations of Quantifier Elimination

Authors: Isabel Garcia-Contreras, Hari Govind V K, Sharon Shoham, Arie Gurfinkel

Abstract: Quantifier elimination (qelim) is used in many automated reasoning tasks including program synthesis, exist-forall solving, quantified SMT, Model Checking, and solving Constrained Horn Clauses (CHCs). Exact qelim is computationally expensive. Hence, it is often approximated. For example, Z3 uses "light" pre-processing to reduce the number of quantified variables. CHC-solver Spacer uses model-based… ▽ More Quantifier elimination (qelim) is used in many automated reasoning tasks including program synthesis, exist-forall solving, quantified SMT, Model Checking, and solving Constrained Horn Clauses (CHCs). Exact qelim is computationally expensive. Hence, it is often approximated. For example, Z3 uses "light" pre-processing to reduce the number of quantified variables. CHC-solver Spacer uses model-based projection (MBP) to under-approximate qelim relative to a given model, and over-approximations of qelim can be used as abstractions. In this paper, we present the QEL framework for fast approximations of qelim. QEL provides a uniform interface for both quantifier reduction and model-based projection. QEL builds on the egraph data structure -- the core of the EUF decision procedure in SMT -- by casting quantifier reduction as a problem of choosing ground (i.e., variable-free) representatives for equivalence classes. We have used QEL to implement MBP for the theories of Arrays and Algebraic Data Types (ADTs). We integrated QEL and our new MBP in Z3 and evaluated it within several tasks that rely on quantifier approximations, outperforming state-of-the-art. △ Less

Submitted 16 June, 2023; originally announced June 2023.

Comments: Published at CAV 2023

arXiv:2106.07045 [pdf, other]

VeriFly: On-the-fly Assertion Checking via Incrementality

Authors: Miguel A. Sanchez-Ordaz, Isabel Garcia-Contreras, Victor Perez-Carrasco, Jose F. Morales, Pedro lopez-Garcia, Manuel V. Hermenegildo

Abstract: Assertion checking is an invaluable programmer's tool for finding many classes of errors or verifying their absence in dynamic languages such as Prolog. For Prolog programmers this means being able to have relevant properties such as modes, types, determinacy, non-failure, sharing, constraints, cost, etc., checked and errors flagged without having to actually run the program. Such global static an… ▽ More Assertion checking is an invaluable programmer's tool for finding many classes of errors or verifying their absence in dynamic languages such as Prolog. For Prolog programmers this means being able to have relevant properties such as modes, types, determinacy, non-failure, sharing, constraints, cost, etc., checked and errors flagged without having to actually run the program. Such global static analysis tools are arguably most useful the earlier they are used in the software development cycle, and fast response times are essential for interactive use. Triggering a full and precise semantic analysis of a software project every time a change is made can be prohibitively expensive. In our static analysis and verification framework this challenge is addressed through a combination of modular and incremental (context- and path-sensitive) analysis that is responsive to program edits, at different levels of granularity. We describe how the combination of this framework within an integrated development environment (IDE) takes advantage of such incrementality to achieve a high level of reactivity when reflecting analysis and verification results back as colorings and tooltips directly on the program text -- the tool's VeriFly mode. The concrete implementation that we describe is Emacs-based and reuses in part off-the-shelf "on-the-fly" syntax checking facilities (flycheck). We believe that similar extensions are also reproducible with low effort in other mature development environments. Our initial experience with the tool shows quite promising results, with low latency times that provide early, continuous, and precise assertion checking and other semantic feedback to programmers during the development process. The tool supports Prolog natively, as well as other languages by semantic transformation into Horn clauses. This paper is under consideration for acceptance in TPLP. △ Less

Submitted 17 August, 2021; v1 submitted 13 June, 2021; originally announced June 2021.

Comments: Paper presented at the 37th International Conference on Logic Programming (ICLP 2021), 16 pages

Report number: CLIP-1/2021.0

arXiv:1808.05197 [pdf, ps, other]

Multivariant Assertion-based Guidance in Abstract Interpretation

Authors: Isabel Garcia-Contreras, Jose F. Morales, Manuel V. Hermenegildo

Abstract: Approximations during program analysis are a necessary evil, as they ensure essential properties, such as soundness and termination of the analysis, but they also imply not always producing useful results. Automatic techniques have been studied to prevent precision loss, typically at the expense of larger resource consumption. In both cases (i.e., when analysis produces inaccurate results and when… ▽ More Approximations during program analysis are a necessary evil, as they ensure essential properties, such as soundness and termination of the analysis, but they also imply not always producing useful results. Automatic techniques have been studied to prevent precision loss, typically at the expense of larger resource consumption. In both cases (i.e., when analysis produces inaccurate results and when resource consumption is too high), it is necessary to have some means for users to provide information to guide analysis and thus improve precision and/or performance. We present techniques for supporting within an abstract interpretation framework a rich set of assertions that can deal with multivariance/context-sensitivity, and can handle different run-time semantics for those assertions that cannot be discharged at compile time. We show how the proposed approach can be applied to both improving precision and accelerating analysis. We also provide some formal results on the effects of such assertions on the analysis results. △ Less

Submitted 17 December, 2018; v1 submitted 15 August, 2018; originally announced August 2018.

Comments: Pre-proceedings paper presented at the 28th International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2018), Frankfurt am Main, Germany, 4-6 September 2018 (arXiv:1808.03326)

Report number: LOPSTR/2018/29

arXiv:1804.01839 [pdf, other]

doi 10.1017/S1471068420000496

Incremental and Modular Context-sensitive Analysis

Authors: Isabel Garcia-Contreras, Jose F. Morales, Manuel V. Hermenegildo

Abstract: Context-sensitive global analysis of large code bases can be expensive, which can make its use impractical during software development. However, there are many situations in which modifications are small and isolated within a few components, and it is desirable to reuse as much as possible previous analysis results. This has been achieved to date through incremental global analysis fixpoint algori… ▽ More Context-sensitive global analysis of large code bases can be expensive, which can make its use impractical during software development. However, there are many situations in which modifications are small and isolated within a few components, and it is desirable to reuse as much as possible previous analysis results. This has been achieved to date through incremental global analysis fixpoint algorithms that achieve cost reductions at fine levels of granularity, such as changes in program lines. However, these fine-grained techniques are not directly applicable to modular programs, nor are they designed to take advantage of modular structures. This paper describes, implements, and evaluates an algorithm that performs efficient context-sensitive analysis incrementally on modular partitions of programs. The experimental results show that the proposed modular algorithm shows significant improvements, in both time and memory consumption, when compared to existing non-modular, fine-grain incremental analysis techniques. Furthermore, thanks to the proposed inter-modular propagation of analysis information, our algorithm also outperforms traditional modular analysis even when analyzing from scratch. △ Less

Submitted 18 December, 2020; v1 submitted 5 April, 2018; originally announced April 2018.

Comments: 56 pages, 27 figures. To be published in Theory and Practice of Logic Programming. v3 corresponds to the extended version of the ICLP2018 Technical Communication. v4 is the revised version submitted to Theory and Practice of Logic Programming. v5 (this one) is the final author version to be published in TPLP

Report number: CLIP-2/2018 version 4 (July 2019) ACM Class: D.2.4; F.3.1; I.2.2; I.2.3

Journal ref: Theory and Practice of Logic Programming 21 (2021) 196-243

arXiv:1608.02565 [pdf, ps, other]

Semantic Code Browsing

Authors: Isabel Garcia-Contreras, Jose F. Morales, Manuel V. Hermenegildo

Abstract: Programmers currently enjoy access to a very high number of code repositories and libraries of ever increasing size. The ensuing potential for reuse is however hampered by the fact that searching within all this code becomes an increasingly difficult task. Most code search engines are based on syntactic techniques such as signature matching or keyword extraction. However, these techniques are inac… ▽ More Programmers currently enjoy access to a very high number of code repositories and libraries of ever increasing size. The ensuing potential for reuse is however hampered by the fact that searching within all this code becomes an increasingly difficult task. Most code search engines are based on syntactic techniques such as signature matching or keyword extraction. However, these techniques are inaccurate (because they basically rely on documentation) and at the same time do not offer very expressive code query languages. We propose a novel approach that focuses on querying for semantic characteristics of code obtained automatically from the code itself. Program units are pre-processed using static analysis techniques, based on abstract interpretation, obtaining safe semantic approximations. A novel, assertion-based code query language is used to express desired semantic characteristics of the code as partial specifications. Relevant code is found by comparing such partial specifications with the inferred semantics for program elements. Our approach is fully automatic and does not rely on user annotations or documentation. It is more powerful and flexible than signature matching because it is parametric on the abstract domain and properties, and does not require type definitions. Also, it reasons with relations between properties, such as implication and abstraction, rather than just equality. It is also more resilient to syntactic code differences. We describe the approach and report on a prototype implementation within the Ciao system. Under consideration for acceptance in TPLP. △ Less

Submitted 8 August, 2016; originally announced August 2016.

Comments: Paper presented at the 32nd International Conference on Logic Programming (ICLP 2016), New York City, USA, 16-21 October 2016, 15 pages, LaTeX, 4 PDF figures, 2 tables

Showing 1–8 of 8 results for author: Garcia-Contreras, I