-
Automating Abstract Interpretation of Abstract Machines
Authors:
James Ian Johnson
Abstract:
Static program analysis is a valuable tool for any programming language that people write programs in. The prevalence of scripting languages in the world suggests programming language interpreters are relatively easy to write. Users of these languages lament their inability to analyze their code, which suggests that programming language analyzers are not easy to write. This thesis investigates a systematic method of creating abstract interpreters from traditional interpreters, called Abstracting Abstract Machines.
Abstract interpreters are difficult to develop due to technical, theoretical, and pragmatic problems. Technical problems include engineering data structures and algorithms. I show that modest and simple changes to the mathematical presentation of abstract machines result in 1000 times better running time: just seconds for moderately sized programs.
In the theoretical realm, abstraction can make correctness difficult to ascertain. I provide proof techniques for proving the correctness of regular, pushdown, and stack-inspecting pushdown models of abstract computation by leaving computational power to an external factor: allocation. Even if we don't trust the proof, we can run models concretely against test suites to better trust them.
In the pragmatic realm, I show that the systematic process of abstracting abstract machines is automatable. I develop a meta-language for expressing abstract machines similar to other semantics engineering languages. The language's special feature is that it provides an interface to abstract allocation. The semantics guarantees that if allocation is finite, then the semantics is a sound and computable approximation of the concrete semantics.
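The role of allocation described above can be illustrated with a minimal sketch. It is not the thesis's meta-language; every name here (`make_concrete_alloc`, `abstract_alloc`, `bind`, `run`) is hypothetical and chosen only for illustration. The sketch assumes a store that maps addresses to sets of values, so that reusing an address joins values instead of overwriting them. With a fresh-address allocator the store grows without bound; with an allocator drawn from a finite set (here, the variable name) the store stays finite and over-approximates every concrete binding.

```python
from itertools import count
from typing import Callable, Dict, FrozenSet, Hashable, Iterable

# Assumption: a store maps each address to the SET of values bound there.
Store = Dict[Hashable, FrozenSet[int]]


def make_concrete_alloc() -> Callable[[str], Hashable]:
    """Return an allocator that hands out a fresh address on every call."""
    fresh = count()

    def alloc(var: str) -> Hashable:
        return (var, next(fresh))

    return alloc


def abstract_alloc(var: str) -> Hashable:
    """Finite allocator: every binding of a variable shares one address."""
    return var


def bind(store: Store, addr: Hashable, value: int) -> Store:
    """Join the value into the store instead of overwriting the address."""
    updated = dict(store)
    updated[addr] = store.get(addr, frozenset()) | {value}
    return updated


def run(alloc: Callable[[str], Hashable], values: Iterable[int]) -> Store:
    """Bind the variable "x" once per value, using the given allocator."""
    store: Store = {}
    for v in values:
        store = bind(store, alloc("x"), v)
    return store
```

Running `run(make_concrete_alloc(), [1, 2, 3])` yields three distinct addresses with one value each, while `run(abstract_alloc, [1, 2, 3])` yields a single address holding all three values. The machine code (`bind`, `run`) is identical in both cases; only the allocator changes.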
Submitted 29 April, 2015;
originally announced April 2015.
-
Pushdown flow analysis with abstract garbage collection
Authors:
J. Ian Johnson,
Ilya Sergey,
Christopher Earl,
Matthew Might,
David Van Horn
Abstract:
In the static analysis of functional programs, pushdown flow analysis and abstract garbage collection push the boundaries of what we can learn about programs statically. This work illuminates and poses solutions to theoretical and practical challenges that stand in the way of combining the power of these techniques. Pushdown flow analysis grants unbounded yet computable polyvariance to the analysis of return-flow in higher-order programs. Abstract garbage collection grants unbounded polyvariance to abstract addresses which become unreachable between invocations of the abstract contexts in which they were created. Pushdown analysis solves the problem of precisely analyzing recursion in higher-order languages; abstract garbage collection is essential in solving the "stickiness" problem. Our benchmarks demonstrate that each method alone can reduce analysis times and boost precision by orders of magnitude. We combine these methods. The challenge in marrying these techniques is not subtle: computing the reachable control states of a pushdown system relies on limiting access during transition to the top of the stack; abstract garbage collection, on the other hand, needs full access to the entire stack to compute a root set, just as concrete collection does. Conditional pushdown systems were developed for just such a conundrum, but existing methods are ill-suited for the dynamic nature of garbage collection. We show fully precise and approximate solutions to the feasible paths problem for pushdown garbage-collecting control-flow analysis. Experiments reveal synergistic interplay between garbage collection and pushdown techniques, and the fusion demonstrates "better-than-both-worlds" precision.
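The collection step itself can be sketched in a few lines. This is a simplified illustration, not the paper's algorithm: it assumes the root set has already been computed (the hard part the abstract describes, since it requires looking at the whole stack), and the representation of values as a label paired with the addresses it touches is hypothetical, as are the names `reachable` and `collect`.

```python
from typing import Dict, FrozenSet, Set, Tuple

Addr = str
# Assumption: an abstract value is a label plus the addresses its environment touches.
Value = Tuple[str, FrozenSet[Addr]]
Store = Dict[Addr, FrozenSet[Value]]


def reachable(roots: Set[Addr], store: Store) -> Set[Addr]:
    """Return every address reachable from the roots through stored values."""
    seen: Set[Addr] = set()
    todo = list(roots)
    while todo:
        addr = todo.pop()
        if addr in seen:
            continue
        seen.add(addr)
        for _label, touched in store.get(addr, frozenset()):
            todo.extend(touched - seen)
    return seen


def collect(roots: Set[Addr], store: Store) -> Store:
    """Restrict the abstract store to the addresses reachable from the roots."""
    live = reachable(roots, store)
    return {addr: values for addr, values in store.items() if addr in live}
```

Dropping unreachable addresses means a later binding to the same abstract address starts from an empty set rather than joining with stale values, which is how collection recovers precision.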
Submitted 19 June, 2014;
originally announced June 2014.
-
Abstracting Abstract Control (Extended)
Authors:
J. Ian Johnson,
David Van Horn
Abstract:
The strength of a dynamic language is also its weakness: run-time flexibility comes at the cost of compile-time predictability. Many of the hallmarks of dynamic languages such as closures, continuations, various forms of reflection, and a lack of static types make many programmers rejoice, while compiler writers, tool developers, and verification engineers lament. The dynamism of these features simply confounds statically reasoning about programs that use them. Consequently, static analyses for dynamic languages are few, far between, and seldom sound.
The "abstracting abstract machines" (AAM) approach to constructing static analyses has recently been proposed as a method to ameliorate the difficulty of designing analyses for such language features. The approach, so called because it derives a function for the sound and computable approximation of program behavior starting from the abstract machine semantics of a language, provides a viable approach to dynamic language analysis since all that is required is a machine description of the interpreter.
The original AAM recipe produces finite state abstractions, which cannot faithfully represent an interpreter's control stack. Recent advances have shown that higher-order programs can be approximated with pushdown systems. However, these automata-theoretic models break down on features that inspect or modify the control stack.
In this paper, we tackle the problem of bringing pushdown flow analysis to the domain of dynamic language features. We revise the abstracting abstract machines technique to target the stronger computational model of pushdown systems. In place of automata theory, we use only abstract machines and memoization. As case studies, we show the technique applies to a language with closures, garbage collection, stack-inspection, and first-class composable continuations.
Submitted 14 August, 2014; v1 submitted 14 May, 2013;
originally announced May 2013.
-
Optimizing Abstract Abstract Machines
Authors:
J. Ian Johnson,
Nicholas Labich,
Matthew Might,
David Van Horn
Abstract:
The technique of abstracting abstract machines (AAM) provides a systematic approach for deriving computable approximations of evaluators that are easily proved sound. This article contributes a complementary step-by-step process for subsequently going from a naive analyzer derived under the AAM approach, to an efficient and correct implementation. The end result of the process is a two to three order-of-magnitude improvement over the systematically derived analyzer, making it competitive with hand-optimized implementations that compute fundamentally less precise results.
Submitted 24 July, 2013; v1 submitted 15 November, 2012;
originally announced November 2012.