-
A Highly Scalable, Hybrid, Cross-Platform Timing Analysis Framework Providing Accurate Differential Throughput Estimation via Instruction-Level Tracing
Authors:
Min-Yih Hsu,
Felicitas Hetzelt,
David Gens,
Michael Maitland,
Michael Franz
Abstract:
Estimating instruction-level throughput is critical for many applications: multimedia, low-latency networking, medical, automotive, avionic, and industrial control systems all rely on tightly calculable and accurate timing bounds of their software. Unfortunately, how long a program may run - or if it may indeed stop at all - cannot be answered in the general case. This is why state-of-the-art thro…
▽ More
Estimating instruction-level throughput is critical for many applications: multimedia, low-latency networking, medical, automotive, avionic, and industrial control systems all rely on tightly calculable and accurate timing bounds of their software. Unfortunately, how long a program may run - or if it may indeed stop at all - cannot be answered in the general case. This is why state-of-the-art throughput estimation tools usually focus on a subset of operations and make several simplifying assumptions. Correctly identifying these sets of constraints and regions of interest in the program typically requires source code, specialized tools, and dedicated expert knowledge. Whenever a single instruction is modified, this process must be repeated, incurring high costs when iteratively developing timing sensitive code in practice.
In this paper, we present MCAD, a novel and lightweight timing analysis framework that can identify the effects of code changes on the microarchitectural level for binary programs. MCAD provides accurate differential throughput estimates by emulating whole program execution using QEMU and forwarding traces to LLVM for instruction-level analysis. This allows developers to iterate quickly, with low overhead, using common tools: identifying execution paths that are less sensitive to changes over timing-critical paths only takes minutes within MCAD. To the best of our knowledge this represents an entirely new capability that reduces turnaround times for differential throughput estimation by several orders of magnitude compared to state-of-the-art tools. Our detailed evaluation shows that MCAD scales to real-world applications like FFmpeg and Clang with millions of instructions, achieving < 3% geo mean error compared to ground truth timings from hardware-performance counters on x86 and ARM machines.
△ Less
Submitted 16 May, 2023; v1 submitted 13 January, 2022;
originally announced January 2022.
-
V0LTpwn: Attacking x86 Processor Integrity from Software
Authors:
Zijo Kenjar,
Tommaso Frassetto,
David Gens,
Michael Franz,
Ahmad-Reza Sadeghi
Abstract:
Fault-injection attacks have been proven in the past to be a reliable way of bypassing hardware-based security measures, such as cryptographic hashes, privilege and access permission enforcement, and trusted execution environments. However, traditional fault-injection attacks require physical presence, and hence, were often considered out of scope in many real-world adversary settings.
In this p…
▽ More
Fault-injection attacks have been proven in the past to be a reliable way of bypassing hardware-based security measures, such as cryptographic hashes, privilege and access permission enforcement, and trusted execution environments. However, traditional fault-injection attacks require physical presence, and hence, were often considered out of scope in many real-world adversary settings.
In this paper we show this assumption may no longer be justified. We present V0LTpwn, a novel hardware-oriented but software-controlled attack that affects the integrity of computation in virtually any execution mode on modern x86 processors. To the best of our knowledge, this represents the first attack on x86 integrity from software. The key idea behind our attack is to undervolt a physical core to force non-recoverable hardware faults. Under a V0LTpwn attack, CPU instructions will continue to execute with erroneous results and without crashes, allowing for exploitation. In contrast to recently presented side-channel attacks that leverage vulnerable speculative execution, V0LTpwn is not limited to information disclosure, but allows adversaries to affect execution, and hence, effectively breaks the integrity goals of modern x86 platforms. In our detailed evaluation we successfully launch software-based attacks against Intel SGX enclaves from a privileged process to demonstrate that a V0LTpwn attack can successfully change the results of computations within enclave execution across multiple CPU revisions.
△ Less
Submitted 10 December, 2019;
originally announced December 2019.
-
When a Patch is Not Enough - HardFails: Software-Exploitable Hardware Bugs
Authors:
Ghada Dessouky,
David Gens,
Patrick Haney,
Garrett Persyn,
Arun Kanuparthi,
Hareesh Khattri,
Jason M. Fung,
Ahmad-Reza Sadeghi,
Jeyavijayan Rajendran
Abstract:
In this paper, we take a deep dive into microarchitectural security from a hardware designer's perspective by reviewing the existing approaches to detect hardware vulnerabilities during the design phase. We show that a protection gap currently exists in practice that leaves chip designs vulnerable to software-based attacks. In particular, existing verification approaches fail to detect specific cl…
▽ More
In this paper, we take a deep dive into microarchitectural security from a hardware designer's perspective by reviewing the existing approaches to detect hardware vulnerabilities during the design phase. We show that a protection gap currently exists in practice that leaves chip designs vulnerable to software-based attacks. In particular, existing verification approaches fail to detect specific classes of vulnerabilities, which we call HardFails: these bugs evade detection by current verification techniques while being exploitable from software. We demonstrate such vulnerabilities in real-world SoCs using RISC-V to showcase and analyze concrete instantiations of HardFails. Patching these hardware bugs may not always be possible and can potentially result in a product recall. We base our findings on two extensive case studies: the recent Hack@DAC 2018 hardware security competition, where 54 independent teams of researchers competed world-wide over a period of 12 weeks to catch inserted security bugs in SoC RTL designs, and an in-depth systematic evaluation of state-of-the-art verification approaches. Our findings indicate that even combinations of techniques will miss high-impact bugs due to the large number of modules with complex interdependencies and fundamental limitations of current detection approaches. We also craft a real-world software attack that exploits one of the RTL bugs from Hack@DAC that evaded detection and discuss novel approaches to mitigate the growing problem of cross-layer bugs at design time.
△ Less
Submitted 1 December, 2018;
originally announced December 2018.
-
Execution Integrity with In-Place Encryption
Authors:
Dean Sullivan,
Orlando Arias,
David Gens,
Lucas Davi,
Ahmad-Reza Sadeghi,
Yier Jin
Abstract:
Instruction set randomization (ISR) was initially proposed with the main goal of countering code-injection attacks. However, ISR seems to have lost its appeal since code-injection attacks became less attractive because protection mechanisms such as data execution prevention (DEP) as well as code-reuse attacks became more prevalent.
In this paper, we show that ISR can be extended to also protect…
▽ More
Instruction set randomization (ISR) was initially proposed with the main goal of countering code-injection attacks. However, ISR seems to have lost its appeal since code-injection attacks became less attractive because protection mechanisms such as data execution prevention (DEP) as well as code-reuse attacks became more prevalent.
In this paper, we show that ISR can be extended to also protect against code-reuse attacks while at the same time offering security guarantees similar to those of software diversity, control-flow integrity, and information hiding. We present Scylla, a scheme that deploys a new technique for in-place code encryption to hide the code layout of a randomized binary, and restricts the control flow to a benign execution path. This allows us to i) implicitly restrict control-flow targets to basic block entries without requiring the extraction of a control-flow graph, ii) achieve execution integrity within legitimate basic blocks, and iii) hide the underlying code layout under malicious read access to the program. Our analysis demonstrates that Scylla is capable of preventing state-of-the-art attacks such as just-in-time return-oriented programming (JIT-ROP) and crash-resistant oriented programming (CROP). We extensively evaluate our prototype implementation of Scylla and show feasible performance overhead. We also provide details on how this overhead can be significantly reduced with dedicated hardware support.
△ Less
Submitted 7 March, 2017;
originally announced March 2017.
-
CAn't Touch This: Practical and Generic Software-only Defenses Against Rowhammer Attacks
Authors:
Ferdinand Brasser,
Lucas Davi,
David Gens,
Christopher Liebchen,
Ahmad-Reza Sadeghi
Abstract:
Rowhammer is a hardware bug that can be exploited to implement privilege escalation and remote code execution attacks. Previous proposals on rowhammer mitigation either require hardware changes or follow heuristic-based approaches (based on CPU performance counters). To date, there exists no instant protection against rowhammer attacks on legacy systems.
In this paper, we present the design and…
▽ More
Rowhammer is a hardware bug that can be exploited to implement privilege escalation and remote code execution attacks. Previous proposals on rowhammer mitigation either require hardware changes or follow heuristic-based approaches (based on CPU performance counters). To date, there exists no instant protection against rowhammer attacks on legacy systems.
In this paper, we present the design and implementation of two practical and efficient software-only defenses against rowhammer attacks. Our defenses prevent the attacker from leveraging rowhammer to corrupt physically co-located data in memory that is owned by a different system entity. Our first defense, B-CATT, extends the system bootloader to disable vulnerable physical memory. B-CATT is highly practical, does not require changes to the operating system, and can be deployed on virtually all x86-based systems. While B-CATT is able to stop all known rowhammer attacks, it does not yet tackle the fundamental problem of missing memory isolation in physical memory. To address this problem, we introduce our second defense G-CATT, a generic solution that extends the physical memory allocator of the OS to physically isolate the memory of different system entities (e.g., kernel and user space).
As proof of concept, we implemented B-CATT on x86, and our generic defense, G-CATT, on x86 and ARM to mitigate rowhammer-based kernel exploits. Our extensive evaluation shows that both mitigation schemes (i) can stop available real- world rowhammer attacks, (ii) impose virtually no run-time overhead for common user and kernel benchmarks as well as commonly used applications, and (iii) do not affect the stability of the overall system.
△ Less
Submitted 7 December, 2016; v1 submitted 25 November, 2016;
originally announced November 2016.