-
Attacker Control and Bug Prioritization
Authors:
Guilhem Lacombe,
Sébastien Bardin
Abstract:
As bug-finding methods improve, bug-fixing capabilities are exceeded, resulting in an accumulation of potential vulnerabilities. There is thus a need for efficient and precise bug prioritization based on exploitability. In this work, we explore the notion of control of an attacker over a vulnerability's parameters, which is an often overlooked factor of exploitability. We show that taint as well a…
▽ More
As bug-finding methods improve, bug-fixing capabilities are exceeded, resulting in an accumulation of potential vulnerabilities. There is thus a need for efficient and precise bug prioritization based on exploitability. In this work, we explore the notion of control of an attacker over a vulnerability's parameters, which is an often overlooked factor of exploitability. We show that taint as well as straightforward qualitative and quantitative notions of control are not enough to effectively differentiate vulnerabilities. Instead, we propose to focus analysis on feasible value sets, which we call domains of control, in order to better take into account threat models and expert insight. Our new Shrink and Split algorithm efficiently extracts domains of control from path constraints obtained with symbolic execution and renders them in an easily processed, human-readable form. This in turn allows to automatically compute more complex control metrics, such as weighted Quantitative Control, which factors in the varying threat levels of different values. Experiments show that our method is both efficient and precise. In particular, it is the only one able to distinguish between vulnerabilities such as cve-2019-14192 and cve-2022-30552, while revealing a mistake in the human evaluation of cve-2022-30790. The high degree of automation of our tool also brings us closer to a fully-automated evaluation pipeline.
△ Less
Submitted 31 March, 2025; v1 submitted 29 January, 2025;
originally announced January 2025.
-
A Systematic Evaluation of Automated Tools for Side-Channel Vulnerabilities Detection in Cryptographic Libraries
Authors:
Antoine Geimer,
Mathéo Vergnolle,
Frédéric Recoules,
Lesly-Ann Daniel,
Sébastien Bardin,
Clémentine Maurice
Abstract:
To protect cryptographic implementations from side-channel vulnerabilities, developers must adopt constant-time programming practices. As these can be error-prone, many side-channel detection tools have been proposed. Despite this, such vulnerabilities are still manually found in cryptographic libraries. While a recent paper by Jancar et al. shows that developers rarely perform side-channel detect…
▽ More
To protect cryptographic implementations from side-channel vulnerabilities, developers must adopt constant-time programming practices. As these can be error-prone, many side-channel detection tools have been proposed. Despite this, such vulnerabilities are still manually found in cryptographic libraries. While a recent paper by Jancar et al. shows that developers rarely perform side-channel detection, it is unclear if existing detection tools could have found these vulnerabilities in the first place. To answer this question, we surveyed the literature to build a classification of 34 side-channel detection frameworks. The classification we offer compares multiple criteria, including the methods used, the scalability of the analysis or the threat model considered. We then built a unified common benchmark of representative cryptographic operations on a selection of 5 promising detection tools. This benchmark allows us to better compare the capabilities of each tool, and the scalability of their analysis. Additionally, we offer a classification of recently published side-channel vulnerabilities. We then test each of the selected tools on benchmarks reproducing a subset of these vulnerabilities as well as the context in which they appear. We find that existing tools can struggle to find vulnerabilities for a variety of reasons, mainly the lack of support for SIMD instructions, implicit flows, and internal secret generation. Based on our findings, we develop a set of recommendations for the research community and cryptographic library developers, with the goal to improve the effectiveness of side-channel detection tools.
△ Less
Submitted 12 October, 2023;
originally announced October 2023.
-
GeoExplainer: A Visual Analytics Framework for Spatial Modeling Contextualization and Report Generation
Authors:
Fan Lei,
Yuxin Ma,
Stewart Fotheringham,
Elizabeth Mack,
Ziqi Li,
Mehak Sachdeva,
Sarah Bardin,
Ross Maciejewski
Abstract:
Geographic regression models of various descriptions are often applied to identify patterns and anomalies in the determinants of spatially distributed observations. These types of analyses focus on answering why questions about underlying spatial phenomena, e.g., why is crime higher in this locale, why do children in one school district outperform those in another, etc.? Answers to these questions…
▽ More
Geographic regression models of various descriptions are often applied to identify patterns and anomalies in the determinants of spatially distributed observations. These types of analyses focus on answering why questions about underlying spatial phenomena, e.g., why is crime higher in this locale, why do children in one school district outperform those in another, etc.? Answers to these questions require explanations of the model structure, the choice of parameters, and contextualization of the findings with respect to their geographic context. This is particularly true for local forms of regression models which are focused on the role of locational context in determining human behavior. In this paper, we present GeoExplainer, a visual analytics framework designed to support analysts in creating explanative documentation that summarizes and contextualizes their spatial analyses. As analysts create their spatial models, our framework flags potential issues with model parameter selections, utilizes template-based text generation to summarize model outputs, and links with external knowledge repositories to provide annotations that help to explain the model results. As analysts explore the model results, all visualizations and annotations can be captured in an interactive report generation widget. We demonstrate our framework using a case study modeling the determinants of voting in the 2016 US Presidential Election.
△ Less
Submitted 25 August, 2023;
originally announced August 2023.
-
ProSpeCT: Provably Secure Speculation for the Constant-Time Policy (Extended version)
Authors:
Lesly-Ann Daniel,
Marton Bognar,
Job Noorman,
Sébastien Bardin,
Tamara Rezk,
Frank Piessens
Abstract:
We propose ProSpeCT, a generic formal processor model providing provably secure speculation for the constant-time policy. For constant-time programs under a non-speculative semantics, ProSpeCT guarantees that speculative and out-of-order execution cause no microarchitectural leaks. This guarantee is achieved by tracking secrets in the processor pipeline and ensuring that they do not influence the…
▽ More
We propose ProSpeCT, a generic formal processor model providing provably secure speculation for the constant-time policy. For constant-time programs under a non-speculative semantics, ProSpeCT guarantees that speculative and out-of-order execution cause no microarchitectural leaks. This guarantee is achieved by tracking secrets in the processor pipeline and ensuring that they do not influence the microarchitectural state during speculative execution. Our formalization covers a broad class of speculation mechanisms, generalizing prior work. As a result, our security proof covers all known Spectre attacks, including load value injection (LVI) attacks.
In addition to the formal model, we provide a prototype hardware implementation of ProSpeCT on a RISC-V processor and show evidence of its low impact on hardware cost, performance, and required software changes. In particular, the experimental evaluation confirms our expectation that for a compliant constant-time binary, enabling ProSpeCT incurs no performance overhead.
△ Less
Submitted 11 August, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
A Quantitative Flavour of Robust Reachability
Authors:
Sébastien Bardin,
Guillaume Girol
Abstract:
Many software analysis techniques attempt to determine whether bugs are reachable, but for security purpose this is only part of the story as it does not indicate whether the bugs found could be easily triggered by an attacker. The recently introduced notion of robust reachability aims at filling this gap by distinguishing the input controlled by the attacker from those that are not. Yet, this qua…
▽ More
Many software analysis techniques attempt to determine whether bugs are reachable, but for security purpose this is only part of the story as it does not indicate whether the bugs found could be easily triggered by an attacker. The recently introduced notion of robust reachability aims at filling this gap by distinguishing the input controlled by the attacker from those that are not. Yet, this qualitative notion may be too strong in practice, leaving apart bugs which are mostly but not fully replicable. We aim here at proposing a quantitative version of robust reachability, more flexible and still amenable to automation. We propose quantitative robustness, a metric expressing how easily an attacker can trigger a bug while taking into account that he can only influence part of the program input, together with a dedicated quantitative symbolic execution technique (QRSE). Interestingly, QRSE relies on a variant of model counting (namely, functional E-MAJSAT) unseen so far in formal verification, but which has been studied in AI domains such as Bayesian network, knowledge representation and probabilistic planning. Yet, the existing solving methods from these fields turn out to be unsatisfactory for formal verification purpose, leading us to propose a novel parametric method. These results have been implemented and evaluated over two security-relevant case studies, allowing to demonstrate the feasibility and relevance of our ideas.
△ Less
Submitted 10 December, 2022;
originally announced December 2022.
-
Scalable Program Clone Search Through Spectral Analysis
Authors:
Tristan Benoit,
Jean-Yves Marion,
Sébastien Bardin
Abstract:
We consider the problem of program clone search, i.e. given a target program and a repository of known programs (all in executable format), the goal is to find the program in the repository most similar to the target program - with potential applications in terms of reverse engineering, program clustering, malware lineage and software theft detection. Recent years have witnessed a blooming in code…
▽ More
We consider the problem of program clone search, i.e. given a target program and a repository of known programs (all in executable format), the goal is to find the program in the repository most similar to the target program - with potential applications in terms of reverse engineering, program clustering, malware lineage and software theft detection. Recent years have witnessed a blooming in code similarity techniques, yet most of them focus on function-level similarity and function clone search, while we are interested in program-level similarity and program clone search. Actually, our study shows that prior similarity approaches are either too slow to handle large program repositories, or not precise enough, or yet not robust against slight variations introduced by compilers, source code versions or light obfuscations. We propose a novel spectral analysis method for program-level similarity and program clone search called Programs Spectral Similarity (PSS). In a nutshell, PSS one-time spectral feature extraction is tailored for large repositories, making it a perfect fit for program clone search. We have compared the different approaches with extensive benchmarks, showing that PSS reaches a sweet spot in terms of precision, speed and robustness.
△ Less
Submitted 31 August, 2023; v1 submitted 24 October, 2022;
originally announced October 2022.
-
Binsec/Rel: Symbolic Binary Analyzer for Security with Applications to Constant-Time and Secret-Erasure
Authors:
Lesly-Ann Daniel,
Sébastien Bardin,
Tamara Rezk
Abstract:
This paper tackles the problem of designing efficient binary-level verification for a subset of information flow properties encompassing constant-time and secret-erasure. These properties are crucial for cryptographic implementations, but are generally not preserved by compilers. Our proposal builds on relational symbolic execution enhanced with new optimizations dedicated to information flow and…
▽ More
This paper tackles the problem of designing efficient binary-level verification for a subset of information flow properties encompassing constant-time and secret-erasure. These properties are crucial for cryptographic implementations, but are generally not preserved by compilers. Our proposal builds on relational symbolic execution enhanced with new optimizations dedicated to information flow and binary-level analysis, yielding a dramatic improvement over prior work based on symbolic execution. We implement a prototype, Binsec/Rel, for bug-finding and bounded-verification of constant-time and secret-erasure, and perform extensive experiments on a set of 338 cryptographic implementations, demonstrating the benefits of our approach. Using Binsec/Rel, we also automate two prior manual studies on preservation of constant-time and secret-erasure by compilers for a total of 4148 and 1156 binaries respectively. Interestingly, our analysis highlights incorrect usages of volatile data pointer for secret erasure and shows that scrubbing mechanisms based on volatile function pointers can introduce additional register spilling which might break secret-erasure. We also discovered that gcc -O0 and backend passes of clang introduce violations of constant-time in implementations that were previously deemed secure by a state-of-the-art constant-time verification tool operating at LLVM level, showing the importance of reasoning at binary-level.
△ Less
Submitted 2 September, 2022;
originally announced September 2022.
-
Formal Methods for Quantum Programs: A Survey
Authors:
Christophe Chareton,
Sébastien Bardin,
Dongho Lee,
Benoît Valiron,
Renaud Vilmart,
Zhaowei Xu
Abstract:
While recent progress in quantum hardware open the door for significant speedup in certain key areas (cryptography, biology, chemistry, optimization, machine learning, etc), quantum algorithms are still hard to implement right, and the validation of such quantum programs is achallenge. Moreover, importing the testing and debugging practices at use in classical programming is extremely difficult in…
▽ More
While recent progress in quantum hardware open the door for significant speedup in certain key areas (cryptography, biology, chemistry, optimization, machine learning, etc), quantum algorithms are still hard to implement right, and the validation of such quantum programs is achallenge. Moreover, importing the testing and debugging practices at use in classical programming is extremely difficult in the quantum case, due to the destructive aspect of quantum measurement. As an alternative strategy, formal methods are prone to play a decisive role in the emerging field of quantum software. Recent works initiate solutions for problems occurring at every stage of the development process: high-level program design, implementation, compilation, etc. We review the induced challenges for an efficient use of formal methods in quantum computing and the current most promising research directions.
△ Less
Submitted 8 April, 2022; v1 submitted 14 September, 2021;
originally announced September 2021.
-
Interface Compliance of Inline Assembly: Automatically Check, Patch and Refine
Authors:
Frédéric Recoules,
Sébastien Bardin,
Richard Bonichon,
Matthieu Lemerre,
Laurent Mounier,
Marie-Laure Potet
Abstract:
Inline assembly is still a common practice in low-level C programming, typically for efficiency reasons or for accessing specific hardware resources. Such embedded assembly codes in the GNU syntax (supported by major compilers such as GCC, Clang and ICC) have an interface specifying how the assembly codes interact with the C environment. For simplicity reasons, the compiler treats GNU inline assem…
▽ More
Inline assembly is still a common practice in low-level C programming, typically for efficiency reasons or for accessing specific hardware resources. Such embedded assembly codes in the GNU syntax (supported by major compilers such as GCC, Clang and ICC) have an interface specifying how the assembly codes interact with the C environment. For simplicity reasons, the compiler treats GNU inline assembly codes as blackboxes and relies only on their interface to correctly glue them into the compiled C code. Therefore, the adequacy between the assembly chunk and its interface (named compliance) is of primary importance, as such compliance issues can lead to subtle and hard-to-find bugs. We propose RUSTInA, the first automated technique for formally checking inline assembly compliance, with the extra ability to propose (proven) patches and (optimization) refinements in certain cases. RUSTInA is based on an original formalization of the inline assembly compliance problem together with novel dedicated algorithms. Our prototype has been evaluated on 202 Debian packages with inline assembly (2656 chunks), finding 2183 issues in 85 packages -- 986 significant issues in 54 packages (including major projects such as ffmpeg or ALSA), and proposing patches for 92% of them. Currently, 38 patches have already been accepted (solving 156 significant issues), with positive feedback from development teams.
△ Less
Submitted 15 February, 2021;
originally announced February 2021.
-
AI-based Blackbox Code Deobfuscation: Understand, Improve and Mitigate
Authors:
Grégoire Menguy,
Sébastien Bardin,
Richard Bonichon,
Cauim de Souza Lima
Abstract:
Code obfuscation aims at protecting Intellectual Property and other secrets embedded into software from being retrieved. Recent works leverage advances in artificial intelligence with the hope of getting blackbox deobfuscators completely immune to standard (whitebox) protection mechanisms. While promising, this new field of AI-based blackbox deobfuscation is still in its infancy. In this article w…
▽ More
Code obfuscation aims at protecting Intellectual Property and other secrets embedded into software from being retrieved. Recent works leverage advances in artificial intelligence with the hope of getting blackbox deobfuscators completely immune to standard (whitebox) protection mechanisms. While promising, this new field of AI-based blackbox deobfuscation is still in its infancy. In this article we deepen the state of AI-based blackbox deobfuscation in three key directions: understand the current state-of-the-art, improve over it and design dedicated protection mechanisms. In particular, we define a novel generic framework for AI-based blackbox deobfuscation encompassing prior work and highlighting key components; we are the first to point out that the search space underlying code deobfuscation is too unstable for simulation-based methods (e.g., Monte Carlo Tres Search used in prior work) and advocate the use of robust methods such as S-metaheuritics; we propose the new optimized AI-based blackbox deobfuscator Xyntia which significantly outperforms prior work in terms of success rate (especially with small time budget) while being completely immune to the most recent anti-analysis code obfuscation methods; and finally we propose two novel protections against AI-based blackbox deobfuscation, allowing to counter Xyntia's powerful attacks.
△ Less
Submitted 9 February, 2021;
originally announced February 2021.
-
No Crash, No Exploit: Automated Verification of Embedded Kernels
Authors:
Olivier Nicole,
Matthieu Lemerre,
Sébastien Bardin,
Xavier Rival
Abstract:
The kernel is the most safety- and security-critical component of many computer systems, as the most severe bugs lead to complete system crash or exploit. It is thus desirable to guarantee that a kernel is free from these bugs using formal methods, but the high cost and expertise required to do so are deterrent to wide applicability. We propose a method that can verify both absence of runtime erro…
▽ More
The kernel is the most safety- and security-critical component of many computer systems, as the most severe bugs lead to complete system crash or exploit. It is thus desirable to guarantee that a kernel is free from these bugs using formal methods, but the high cost and expertise required to do so are deterrent to wide applicability. We propose a method that can verify both absence of runtime errors (i.e. crashes) and absence of privilege escalation (i.e. exploits) in embedded kernels from their binary executables. The method can verify the kernel runtime independently from the application, at the expense of only a few lines of simple annotations. When given a specific application, the method can verify simple kernels without any human intervention. We demonstrate our method on two different use cases: we use our tool to help the development of a new embedded real-time kernel, and we verify an existing industrial real-time kernel executable with no modification. Results show that the method is fast, simple to use, and can prevent real errors and security vulnerabilities.
△ Less
Submitted 21 May, 2021; v1 submitted 30 November, 2020;
originally announced November 2020.
-
Automatically Proving Microkernels Free from Privilege Escalation from their Executable
Authors:
Olivier Nicole,
Matthieu Lemerre,
Sébastien Bardin,
Xavier Rival
Abstract:
Operating system kernels are the security keystone of most computer systems, as they provide the core protection mechanisms. Kernels are in particular responsible for their own security, i.e. they must prevent untrusted user tasks from reaching their level of privilege. We demonstrate that proving such absence of privilege escalation is a pre-requisite for any definitive security proof of the kern…
▽ More
Operating system kernels are the security keystone of most computer systems, as they provide the core protection mechanisms. Kernels are in particular responsible for their own security, i.e. they must prevent untrusted user tasks from reaching their level of privilege. We demonstrate that proving such absence of privilege escalation is a pre-requisite for any definitive security proof of the kernel. While prior OS kernel formal verifications were performed either on source code or crafted kernels, with manual or semi-automated methods requiring significant human efforts in annotations or proofs, we show that it is possible to compute such kernel security proofs using fully-automated methods and starting from the executable code of an existing microkernel with no modification, thus formally verifying absence of privilege escalation with high confidence for a low cost. We applied our method on two embedded microkernels, including the industrial kernel AnonymOS: with only 58 lines of annotation and less than 10 minutes of computation, our method finds a vulnerability in a first (buggy) version of AnonymOS and verifies absence of privilege escalation in a second (secure) version.
△ Less
Submitted 19 March, 2020;
originally announced March 2020.
-
A Deductive Verification Framework for Circuit-building Quantum Programs
Authors:
Christophe Chareton,
Sébastien Bardin,
François Bobot,
Valentin Perrelle,
Benoit Valiron
Abstract:
While recent progress in quantum hardware open the door for significant speedup in certain key areas, quantum algorithms are still hard to implement right, and the validation of such quantum programs is a challenge. Early attempts either suffer from the lack of automation or parametrized reasoning, or target high-level abstract algorithm description languages far from the current de facto consensu…
▽ More
While recent progress in quantum hardware open the door for significant speedup in certain key areas, quantum algorithms are still hard to implement right, and the validation of such quantum programs is a challenge. Early attempts either suffer from the lack of automation or parametrized reasoning, or target high-level abstract algorithm description languages far from the current de facto consensus of circuit-building quantum programming languages. As a consequence, no significant quantum algorithm implementation has been currently verified in a scale-invariant manner. We propose Qbricks, the first formal verification environment for circuit-building quantum programs, featuring clear separation between code and proof, parametric specifications and proofs, high degree of proof automation and allowing to encode quantum programs in a natural way, i.e. close to textbook style. Qbricks builds on best practice of formal verification for the classical case and tailor them to the quantum case: we bring a new domain-specific circuit-building language for quantum programs, namely Qbricks-DSL, together with a new logical specification language Qbricks-Spec and a dedicated Hoare-style deductive verification rule named Hybrid Quantum Hoare Logic. Especially, we introduce and intensively build upon HOPS, a higher-order extension of the recent path-sum symbolic representation, used for both specification and automation. To illustrate the opportunity of Qbricks, we implement the first verified parametric implementations of several famous and non-trivial quantum algorithms, including the quantum part of Shor integer factoring (Order Finding - Shor-OF), quantum phase estimation (QPE) - a basic building block of many quantum algorithms, and Grover search. These breakthroughs were amply facilitated by the specification and automated deduction principles introduced within Qbricks.
△ Less
Submitted 26 July, 2020; v1 submitted 12 March, 2020;
originally announced March 2020.
-
Binary-level Directed Fuzzing for Use-After-Free Vulnerabilities
Authors:
Manh-Dung Nguyen,
Sébastien Bardin,
Richard Bonichon,
Roland Groz,
Matthieu Lemerre
Abstract:
Directed fuzzing focuses on automatically testing specific parts of the code by taking advantage of additional information such as (partial) bug stack trace, patches or risky operations. Key applications include bug reproduction, patch testing and static analysis report verification. Although directed fuzzing has received a lot of attention recently, hard-to-detect vulnerabilities such as Use-Afte…
▽ More
Directed fuzzing focuses on automatically testing specific parts of the code by taking advantage of additional information such as (partial) bug stack trace, patches or risky operations. Key applications include bug reproduction, patch testing and static analysis report verification. Although directed fuzzing has received a lot of attention recently, hard-to-detect vulnerabilities such as Use-After-Free (UAF) are still not well addressed, especially at the binary level. We propose UAFuzz, the first (binary-level) directed greybox fuzzer dedicated to UAF bugs. The technique features a fuzzing engine tailored to UAF specifics, a lightweight code instrumentation and an efficient bug triage step. Experimental evaluation for bug reproduction on real cases demonstrates that UAFuzz significantly outperforms state-of-the-art directed fuzzers in terms of fault detection rate, time to exposure and bug triaging. UAFuzz has also been proven effective in patch testing, leading to the discovery of 30 new bugs (7 CVEs) in programs such as Perl, GPAC and GNU Patch. Finally, we provide to the community a large fuzzing benchmark dedicated to UAF, built on both real codes and real bugs.
△ Less
Submitted 17 August, 2020; v1 submitted 25 February, 2020;
originally announced February 2020.
-
Binsec/Rel: Efficient Relational Symbolic Execution for Constant-Time at Binary-Level
Authors:
Lesly-Ann Daniel,
Sébastien Bardin,
Tamara Rezk
Abstract:
The constant-time programming discipline (CT) is an efficient countermeasure against timing side-channel attacks, requiring the control flow and the memory accesses to be independent from the secrets. Yet, writing CT code is challenging as it demands to reason about pairs of execution traces (2- hypersafety property) and it is generally not preserved by the compiler, requiring binary-level analysi…
▽ More
The constant-time programming discipline (CT) is an efficient countermeasure against timing side-channel attacks, requiring the control flow and the memory accesses to be independent from the secrets. Yet, writing CT code is challenging as it demands to reason about pairs of execution traces (2- hypersafety property) and it is generally not preserved by the compiler, requiring binary-level analysis. Unfortunately, current verification tools for CT either reason at higher level (C or LLVM), or sacrifice bug-finding or bounded-verification, or do not scale. We tackle the problem of designing an efficient binary-level verification tool for CT providing both bug-finding and bounded-verification. The technique builds on relational symbolic execution enhanced with new optimizations dedicated to information flow and binary-level analysis, yielding a dramatic improvement over prior work based on symbolic execution. We implement a prototype, Binsec/Rel, and perform extensive experiments on a set of 338 cryptographic implementations, demonstrating the benefits of our approach in both bug-finding and bounded-verification. Using Binsec/Rel, we also automate a previous manual study of CT preservation by compilers. Interestingly, we discovered that gcc -O0 and backend passes of clang introduce violations of CT in implementations that were previously deemed secure by a state-of-the-art CT verification tool operating at LLVM level, showing the importance of reasoning at binary-level.
△ Less
Submitted 11 July, 2020; v1 submitted 18 December, 2019;
originally announced December 2019.
-
How to Kill Symbolic Deobfuscation for Free; or Unleashing the Potential of Path-Oriented Protections
Authors:
Mathilde Ollivier,
Sébastien Bardin,
Richard Bonichon,
Jean-Yves Marion
Abstract:
Code obfuscation is a major tool for protecting software intellectual property from attacks such as reverse engineering or code tampering. Yet, recently proposed (automated) attacks based on Dynamic Symbolic Execution (DSE) shows very promising results, hence threatening software integrity. Current defenses are not fully satisfactory, being either not efficient against symbolic reasoning, or affec…
▽ More
Code obfuscation is a major tool for protecting software intellectual property from attacks such as reverse engineering or code tampering. Yet, recently proposed (automated) attacks based on Dynamic Symbolic Execution (DSE) shows very promising results, hence threatening software integrity. Current defenses are not fully satisfactory, being either not efficient against symbolic reasoning, or affecting runtime performance too much, or being too easy to spot. We present and study a new class of anti-DSE protections coined as path-oriented protections targeting the weakest spot of DSE, namely path exploration. We propose a lightweight, efficient, resistant and analytically proved class of obfuscation algorithms designed to hinder DSE-based attacks. Extensive evaluation demonstrates that these approaches critically counter symbolic deobfuscation while yielding only a very slight overhead.
△ Less
Submitted 7 August, 2019; v1 submitted 5 August, 2019;
originally announced August 2019.
-
Get rid of inline assembly through verification-oriented lifting
Authors:
Frédéric Recoules,
Sébastien Bardin,
Richard Bonichon,
Laurent Mounier,
Marie-Laure Potet
Abstract:
Formal methods for software development have made great strides in the last two decades, to the point that their application in safety-critical embedded software is an undeniable success. Their extension to non-critical software is one of the notable forthcoming challenges. For example, C programmers regularly use inline assembly for low-level optimizations and system primitives. This usually resu…
▽ More
Formal methods for software development have made great strides in the last two decades, to the point that their application in safety-critical embedded software is an undeniable success. Their extension to non-critical software is one of the notable forthcoming challenges. For example, C programmers regularly use inline assembly for low-level optimizations and system primitives. This usually results in driving state-of-the-art formal analyzers developed for C ineffective. We thus propose TInA, an automated, generic, trustable and verification-oriented lifting technique turning inline assembly into semantically equivalent C code, in order to take advantage of existing C analyzers. Extensive experiments on real-world C code with inline assembly (including GMP and ffmpeg) show the feasibility and benefits of TInA.
△ Less
Submitted 1 October, 2019; v1 submitted 15 March, 2019;
originally announced March 2019.
-
Model Generation for Quantified Formulas: A Taint-Based Approach
Authors:
Benjamin Farinier,
Sébastien Bardin,
Richard Bonichon,
Marie-Laure Potet
Abstract:
We focus in this paper on generating models of quantified first-order formulas over built-in theories, which is paramount in software verification and bug finding. While standard methods are either geared toward proving the absence of solution or targeted to specific theories, we propose a generic approach based on a reduction to the quantifier-free case. Our technique allows thus to reuse all the…
▽ More
We focus in this paper on generating models of quantified first-order formulas over built-in theories, which is paramount in software verification and bug finding. While standard methods are either geared toward proving the absence of solution or targeted to specific theories, we propose a generic approach based on a reduction to the quantifier-free case. Our technique allows thus to reuse all the efficient machinery developed for that context. Experiments show a substantial improvement over state-of-the-art methods.
△ Less
Submitted 15 February, 2018;
originally announced February 2018.
-
Abstract Interpretation using a Language of Symbolic Approximation
Authors:
Matthieu Lemerre,
Sébastien Bardin
Abstract:
The traditional abstract domain framework for imperative programs suffers from several shortcomings; in particular it does not allow precise symbolic abstractions. To solve these problems, we propose a new abstract interpretation framework, based on symbolic expressions used both as an abstraction of the program, and as the input analyzed by abstract domains. We demonstrate new applications of the…
▽ More
The traditional abstract domain framework for imperative programs suffers from several shortcomings; in particular it does not allow precise symbolic abstractions. To solve these problems, we propose a new abstract interpretation framework, based on symbolic expressions used both as an abstraction of the program, and as the input analyzed by abstract domains. We demonstrate new applications of the frame- work: an abstract domain that efficiently propagates constraints across the whole program; a new formalization of functor domains as approximate translation, which allows the production of approximate programs, on which we can perform classical symbolic techniques. We used these to build a complete analyzer for embedded C programs, that demonstrates the practical applicability of the framework.
△ Less
Submitted 28 December, 2017;
originally announced December 2017.
-
Freeing Testers from Polluting Test Objectives
Authors:
Michaël Marcozzi,
Sébastien Bardin,
Nikolai Kosmatov,
Mike Papadakis,
Virgile Prevosto,
Loïc Correnson
Abstract:
Testing is the primary approach for detecting software defects. A major challenge faced by testers lies in crafting efficient test suites, able to detect a maximum number of bugs with manageable effort. To do so, they rely on coverage criteria, which define some precise test objectives to be covered. However, many common criteria specify a significant number of objectives that occur to be infeasib…
▽ More
Testing is the primary approach for detecting software defects. A major challenge faced by testers lies in crafting efficient test suites, able to detect a maximum number of bugs with manageable effort. To do so, they rely on coverage criteria, which define some precise test objectives to be covered. However, many common criteria specify a significant number of objectives that occur to be infeasible or redundant in practice, like covering dead code or semantically equal mutants. Such objectives are well-known to be harmful to the design of test suites, impacting both the efficiency and precision of testers' effort. This work introduces a sound and scalable formal technique able to prune out a significant part of the infeasible and redundant objectives produced by a large panel of white-box criteria. In a nutshell, we reduce this challenging problem to proving the validity of logical assertions in the code under test. This technique is implemented in a tool that relies on weakest-precondition calculus and SMT solving for proving the assertions. The tool is built on top of the Frama-C verification platform, which we carefully tune for our specific scalability needs. The experiments reveal that the tool can prune out up to 27% of test objectives in a program and scale to applications of 200K lines of code.
△ Less
Submitted 29 August, 2017;
originally announced August 2017.
-
CDCL-inspired Word-level Learning for Bit-vector Constraint Solving
Authors:
Zakaria Chihani,
François Bobot,
Sébastien Bardin
Abstract:
The theory of quantifier-free bit-vectors (QF_BV) is of paramount importance in software verification. The standard approach for satisfiability checking reduces the bit-vector problem to a Boolean problem, leveraging the powerful SAT solving techniques and their conflict-driven clause learning (CDCL) mechanisms. Yet, this bit-level approach loses the structure of the initial bit-vector problem. We…
▽ More
The theory of quantifier-free bit-vectors (QF_BV) is of paramount importance in software verification. The standard approach for satisfiability checking reduces the bit-vector problem to a Boolean problem, leveraging the powerful SAT solving techniques and their conflict-driven clause learning (CDCL) mechanisms. Yet, this bit-level approach loses the structure of the initial bit-vector problem. We propose a conflict-driven, word-level, combinable constraints learning for the theory of quantifier-free bit-vectors. This work paves the way to truly word-level decision procedures for bit-vectors, taking full advantage of word-level propagations recently designed in CP and SMT communities.
△ Less
Submitted 28 June, 2017;
originally announced June 2017.
-
Targeting Infeasibility Questions on Obfuscated Codes
Authors:
Robin David,
Sébastien Bardin,
Jean-Yves Marion
Abstract:
Software deobfuscation is a crucial activity in security analysis and especially, in malware analysis. While standard static and dynamic approaches suffer from well-known shortcomings, Dynamic Symbolic Execution (DSE) has recently been proposed has an interesting alternative, more robust than static analysis and more complete than dynamic analysis. Yet, DSE addresses certain kinds of questions enc…
▽ More
Software deobfuscation is a crucial activity in security analysis and especially, in malware analysis. While standard static and dynamic approaches suffer from well-known shortcomings, Dynamic Symbolic Execution (DSE) has recently been proposed has an interesting alternative, more robust than static analysis and more complete than dynamic analysis. Yet, DSE addresses certain kinds of questions encountered by a reverser namely feasibility questions. Many issues arising during reverse, e.g. detecting protection schemes such as opaque predicates fall into the category of infeasibility questions. In this article, we present the Backward-Bounded DSE, a generic, precise, efficient and robust method for solving infeasibility questions. We demonstrate the benefit of the method for opaque predicates and call stack tampering, and give some insight for its usage for some other protection schemes. Especially, the technique has successfully been used on state-of-the-art packers as well as on the government-grade X-Tunnel malware -- allowing its entire deobfuscation. Backward-Bounded DSE does not supersede existing DSE approaches, but rather complements them by addressing infeasibility questions in a scalable and precise manner. Following this line, we propose sparse disassembly, a combination of Backward-Bounded DSE and static disassembly able to enlarge dynamic disassembly in a guaranteed way, hence getting the best of dynamic and static disassembly. This work paves the way for robust, efficient and precise disassembly tools for heavily-obfuscated binaries.
△ Less
Submitted 16 December, 2016;
originally announced December 2016.
-
Generic and Effective Specification of Structural Test Objectives
Authors:
Sébastien Bardin,
Mickaël Delahaye,
Nikolai Kosmatov,
Michaël Marcozzi,
Virgile Prevosto
Abstract:
While a wide range of different, sometimes heterogeneous test coverage criteria have been proposed, there exists no generic formalism to describe them, and available test automation tools usually support only a small subset of them. We introduce a unified specification language, called HTOL, providing a powerful generic mechanism to define test objectives, which permits encoding numerous existing…
▽ More
While a wide range of different, sometimes heterogeneous test coverage criteria have been proposed, there exists no generic formalism to describe them, and available test automation tools usually support only a small subset of them. We introduce a unified specification language, called HTOL, providing a powerful generic mechanism to define test objectives, which permits encoding numerous existing criteria and supporting them in a unified way. HTOL comes with a formal semantics and can express complex requirements over several executions (using a novel notion of hyperlabels), as well as alternative requirements or requirements over a whole program execution. A novel classification of a large class of existing criteria is proposed. Finally, a coverage measurement tool for HTOL objectives has been implemented. Initial experiments suggest that the proposed approach is both efficient and practical.
△ Less
Submitted 19 January, 2017; v1 submitted 5 September, 2016;
originally announced September 2016.
-
A Combined Approach for Constraints over Finite Domains and Arrays
Authors:
Sébastien Bardin,
Arnaud Gotlieb
Abstract:
Arrays are ubiquitous in the context of software verification. However, effective reasoning over arrays is still rare in CP, as local reasoning is dramatically ill-conditioned for constraints over arrays. In this paper, we propose an approach combining both global symbolic reasoning and local consistency filtering in order to solve constraint systems involving arrays (with accesses, updates and si…
▽ More
Arrays are ubiquitous in the context of software verification. However, effective reasoning over arrays is still rare in CP, as local reasoning is dramatically ill-conditioned for constraints over arrays. In this paper, we propose an approach combining both global symbolic reasoning and local consistency filtering in order to solve constraint systems involving arrays (with accesses, updates and size constraints) and finite-domain constraints over their elements and indexes. Our approach, named FDCC, is based on a combination of a congruence closure algorithm for the standard theory of arrays and a CP solver over finite domains. The tricky part of the work lies in the bi-directional communication mechanism between both solvers. We identify the significant information to share, and design ways to master the communication overhead. Experiments on random instances show that FDCC solves more formulas than any portfolio combination of the two solvers taken in isolation, while overhead is kept reasonable.
△ Less
Submitted 1 December, 2013;
originally announced December 2013.
-
Efficient Leverage of Symbolic ATG Tools to Advanced Coverage Criteria
Authors:
Sébastien Bardin,
Nikolai Kosmatov,
François Cheynier
Abstract:
Automatic test data generation (ATG) is a major topic in software engineering. In this paper, we seek to bridge the gap between the coverage criteria supported by symbolic ATG tools and the most advanced coverage criteria found in the literature. We define a new testing criterion, label coverage, and prove it to be both expressive and amenable to efficient automation. We propose several innovative…
▽ More
Automatic test data generation (ATG) is a major topic in software engineering. In this paper, we seek to bridge the gap between the coverage criteria supported by symbolic ATG tools and the most advanced coverage criteria found in the literature. We define a new testing criterion, label coverage, and prove it to be both expressive and amenable to efficient automation. We propose several innovative techniques resulting in an effective black-box support for label coverage, while a direct approach induces an exponential blow-up of the search space. Initial experiments show that ATG for label coverage can be achieved at a reasonable cost and that our optimisations yield very significant savings.
△ Less
Submitted 19 August, 2013;
originally announced August 2013.