Search | arXiv e-print repository

DFI: An Interprocedural Value-Flow Analysis Framework that Scales to Large Codebases

Authors: Min-Yih Hsu, Felicitas Hetzelt, Michael Franz

Abstract: Context- and flow-sensitive value-flow information is an important building block for many static analysis tools. Unfortunately, current approaches to compute value-flows do not scale to large codebases, due to high memory and runtime requirements. This paper proposes a new scalable approach to compute value-flows via graph reachability. To this end, we develop a new graph structure as an extensio… ▽ More Context- and flow-sensitive value-flow information is an important building block for many static analysis tools. Unfortunately, current approaches to compute value-flows do not scale to large codebases, due to high memory and runtime requirements. This paper proposes a new scalable approach to compute value-flows via graph reachability. To this end, we develop a new graph structure as an extension of LLVM IR that contains two additional operations which significantly simplify the modeling of pointer aliasing. Further, by processing nodes in the opposite direction of SSA def-use chains, we are able to minimize the tree width of the resulting graph. This allows us to employ efficient tree traversal algorithms in order to resolve graph reachability. We present a value-flow analysis framework,DFI, implementing our approach. We compare DFI against two state-of-the-art value-flow analysis frameworks, Phasar and SVF, to extract value-flows from 4 real-world software projects. Given 32GB of memory, Phasar and SVF are unable to complete analysis of larger projects such as OpenSSL or FFmpeg, while DFI is able to complete all evaluations. For the subset of benchmarks that Phasar and SVF do handle, DFI requires significantly less memory (1.5% of Phasar's, 6.4% of SVF's memory footprint on average) and runs significantly faster (23x speedup over Phasar, 57x compared to SVF). Our analysis shows that, in contrast to previous approaches, DFI's memory and runtime requirements scale almost linearly with the number of analyzed instructions. △ Less

Submitted 6 September, 2022; originally announced September 2022.

arXiv:2201.04804 [pdf, other]

A Highly Scalable, Hybrid, Cross-Platform Timing Analysis Framework Providing Accurate Differential Throughput Estimation via Instruction-Level Tracing

Authors: Min-Yih Hsu, Felicitas Hetzelt, David Gens, Michael Maitland, Michael Franz

Abstract: Estimating instruction-level throughput is critical for many applications: multimedia, low-latency networking, medical, automotive, avionic, and industrial control systems all rely on tightly calculable and accurate timing bounds of their software. Unfortunately, how long a program may run - or if it may indeed stop at all - cannot be answered in the general case. This is why state-of-the-art thro… ▽ More Estimating instruction-level throughput is critical for many applications: multimedia, low-latency networking, medical, automotive, avionic, and industrial control systems all rely on tightly calculable and accurate timing bounds of their software. Unfortunately, how long a program may run - or if it may indeed stop at all - cannot be answered in the general case. This is why state-of-the-art throughput estimation tools usually focus on a subset of operations and make several simplifying assumptions. Correctly identifying these sets of constraints and regions of interest in the program typically requires source code, specialized tools, and dedicated expert knowledge. Whenever a single instruction is modified, this process must be repeated, incurring high costs when iteratively developing timing sensitive code in practice. In this paper, we present MCAD, a novel and lightweight timing analysis framework that can identify the effects of code changes on the microarchitectural level for binary programs. MCAD provides accurate differential throughput estimates by emulating whole program execution using QEMU and forwarding traces to LLVM for instruction-level analysis. This allows developers to iterate quickly, with low overhead, using common tools: identifying execution paths that are less sensitive to changes over timing-critical paths only takes minutes within MCAD. To the best of our knowledge this represents an entirely new capability that reduces turnaround times for differential throughput estimation by several orders of magnitude compared to state-of-the-art tools. Our detailed evaluation shows that MCAD scales to real-world applications like FFmpeg and Clang with millions of instructions, achieving < 3% geo mean error compared to ground truth timings from hardware-performance counters on x86 and ARM machines. △ Less

Submitted 16 May, 2023; v1 submitted 13 January, 2022; originally announced January 2022.

arXiv:2109.10660 [pdf, other]

VIA: Analyzing Device Interfaces of Protected Virtual Machines

Authors: Felicitas Hetzelt, Martin Radev, Robert Buhren, Mathias Morbitzer, Jean-Pierre Seifert

Abstract: Both AMD and Intel have presented technologies for confidential computing in cloud environments. The proposed solutions - AMD SEV (-ES, -SNP) and Intel TDX - protect Virtual Machines (VMs) against attacks from higher privileged layers through memory encryption and integrity protection. This model of computation draws a new trust boundary between virtual devices and the VM, which in so far lacks th… ▽ More Both AMD and Intel have presented technologies for confidential computing in cloud environments. The proposed solutions - AMD SEV (-ES, -SNP) and Intel TDX - protect Virtual Machines (VMs) against attacks from higher privileged layers through memory encryption and integrity protection. This model of computation draws a new trust boundary between virtual devices and the VM, which in so far lacks thorough examination. In this paper, we therefore present an analysis of the virtual device interface and discuss several attack vectors against a protected VM. Further, we develop and evaluate VIA, an automated analysis tool to detect cases of improper sanitization of input recieved via the virtual device interface. VIA improves upon existing approaches for the automated analysis of device interfaces in the following aspects: (i) support for virtualization relevant buses, (ii) efficient Direct Memory Access (DMA) support and (iii) performance. VIA builds upon the Linux Kernel Library and clang's libfuzzer to fuzz the communication between the driver and the device via MMIO, PIO, and DMA. An evaluation of VIA shows that it performs 570 executions per second on average and improves performance compared to existing approaches by an average factor of 2706. Using VIA, we analyzed 22 drivers in Linux 5.10.0-rc6, thereby uncovering 50 bugs and initiating multiple patches to the virtual device driver interface of Linux. To prove our findings criticality under the threat model of AMD SEV and Intel TDX, we showcase three exemplary attacks based on the bugs found. The attacks enable a malicious hypervisor to corrupt the memory and gain code execution in protected VMs with SEV-ES and are theoretically applicable to SEV-SNP and TDX. △ Less

Submitted 22 September, 2021; originally announced September 2021.

arXiv:1612.01119 [pdf, other]

Security Analysis of Encrypted Virtual Machines

Authors: Felicitas Hetzelt, Robert Buhren

Abstract: Cloud computing has become indispensable in today's computer landscape. The flexibility it offers for customers as well as for providers has become a crucial factor for large parts of the computer industry. Virtualization is the key technology that allows for sharing of hardware resources among different customers. The controlling software component, called hypervisor, provides a virtualized view… ▽ More Cloud computing has become indispensable in today's computer landscape. The flexibility it offers for customers as well as for providers has become a crucial factor for large parts of the computer industry. Virtualization is the key technology that allows for sharing of hardware resources among different customers. The controlling software component, called hypervisor, provides a virtualized view of the computer resources and ensures separation of different guest virtual machines. However, this important cornerstone of cloud computing is not necessarily trustworthy. To mitigate this threat AMD introduced Secure Encrypted Virtualization, short SEV. SEV is a processor extension that encrypts guest memory in order to prevent a potentially malicious hypervisor from accessing guest data. In this paper we analyse whether the proposed features can resist a malicious hypervisor and discuss the trade-offs imposed by additional protection mechanisms. To do so, we developed a model of SEV's security capabilities based on the available documentation as actual silicon implementations are not yet on the market. We found that the currently proposed version of SEV is not up to the task owing to three design shortcomings. First, as with standard AMD-V, under SEV, the virtual machine control block is not encrypted and handled directly by the hypervisor, allowing him to bypass VM memory encryption by executing conveniently chosen gadgets. Secondly, the general purpose registers are not encrypted upon vmexit, leaking potentially sensitive data. Finally, the control of the nested pagetables allows a malicious hypervisor to closely control the execution of a VM and attack it with memory replay attacks. △ Less

Submitted 25 July, 2017; v1 submitted 4 December, 2016; originally announced December 2016.

arXiv:1610.08717 [pdf, other]

Reins to the Cloud: Compromising Cloud Systems via the Data Plane

Authors: Kashyap Thimmaraju, Bhargava Shastry, Tobias Fiebig, Felicitas Hetzelt, Jean-Pierre Seifert, Anja Feldmann, Stefan Schmid

Abstract: Virtual switches have become popular among cloud operating systems to interconnect virtual machines in a more flexible manner. However, this paper demonstrates that virtual switches introduce new attack surfaces in cloud setups, whose effects can be disastrous. Our analysis shows that these vulnerabilities are caused by: (1) inappropriate security assumptions (privileged virtual switch execution i… ▽ More Virtual switches have become popular among cloud operating systems to interconnect virtual machines in a more flexible manner. However, this paper demonstrates that virtual switches introduce new attack surfaces in cloud setups, whose effects can be disastrous. Our analysis shows that these vulnerabilities are caused by: (1) inappropriate security assumptions (privileged virtual switch execution in kernel and user space), (2) the logical centralization of such networks (e.g., OpenStack or SDN), (3) the presence of bi-directional communication channels between data plane systems and the centralized controller, and (4) non-standard protocol parsers. Our work highlights the need to accommodate the data plane(s) in our threat models. In particular, it forces us to revisit today's assumption that the data plane can only be compromised by a sophisticated attacker: we show that compromising the data plane of modern computer networks can actually be performed by a very simple attacker with limited resources only and at low cost (i.e., at the cost of renting a virtual machine in the Cloud). As a case study, we fuzzed only 2\% of the code-base of a production quality virtual switch's packet processor (namely OvS), identifying serious vulnerabilities leading to unauthenticated remote code execution. In particular, we present the "rein worm" which allows us to fully compromise test-setups in less than 100 seconds. We also evaluate the performance overhead of existing mitigations such as ASLR, PIEs, and unconditional stack canaries on OvS. We find that while applying these countermeasures in kernel-space incurs a significant overhead, in user-space the performance overhead is negligible. △ Less

Submitted 10 February, 2017; v1 submitted 27 October, 2016; originally announced October 2016.

Showing 1–5 of 5 results for author: Hetzelt, F