-
Command injection attacks, continuations, and the Lambek calculus
Authors:
Hayo Thielecke
Abstract:
This paper shows connections between command injection attacks, continuations, and the Lambek calculus: certain command injections, such as the tautology attack on SQL, are shown to be a form of control effect that can be typed using the Lambek calculus, generalizing the double-negation typing of continuations. Lambek's syntactic calculus is a logic with two implicational connectives taking their…
▽ More
This paper shows connections between command injection attacks, continuations, and the Lambek calculus: certain command injections, such as the tautology attack on SQL, are shown to be a form of control effect that can be typed using the Lambek calculus, generalizing the double-negation typing of continuations. Lambek's syntactic calculus is a logic with two implicational connectives taking their arguments from the left and right, respectively. These connectives describe how strings interact with their left and right contexts when building up syntactic structures. The calculus is a form of propositional logic without structural rules, and so a forerunner of substructural logics like Linear Logic and Separation Logic.
△ Less
Submitted 20 June, 2016;
originally announced June 2016.
-
Static Analysis for Regular Expression Exponential Runtime via Substructural Logics (Extended)
Authors:
Asiri Rathnayake,
Hayo Thielecke
Abstract:
Regular expression matching using backtracking can have exponential runtime, leading to an algorithmic complexity attack known as REDoS in the systems security literature. In this paper, we build on a recently published static analysis that detects whether a given regular expression can have exponential runtime for some inputs. We systematically construct a more accurate analysis by forming powers…
▽ More
Regular expression matching using backtracking can have exponential runtime, leading to an algorithmic complexity attack known as REDoS in the systems security literature. In this paper, we build on a recently published static analysis that detects whether a given regular expression can have exponential runtime for some inputs. We systematically construct a more accurate analysis by forming powers and products of transition relations and thereby reducing the REDoS problem to reachability. The correctness of the analysis is proved using a substructural calculus of search trees, where the branching of the tree causing exponential blowup is characterized as a form of non-linearity.
△ Less
Submitted 13 August, 2017; v1 submitted 27 May, 2014;
originally announced May 2014.
-
Static Analysis for Regular Expression Denial-of-Service Attacks
Authors:
James Kirrage,
Asiri Rathnayake,
Hayo Thielecke
Abstract:
Regular expressions are a concise yet expressive language for expressing patterns. For instance, in networked software, they are used for input validation and intrusion detection. Yet some widely deployed regular expression matchers based on backtracking are themselves vulnerable to denial-of-service attacks, since their runtime can be exponential for certain input strings. This paper presents a s…
▽ More
Regular expressions are a concise yet expressive language for expressing patterns. For instance, in networked software, they are used for input validation and intrusion detection. Yet some widely deployed regular expression matchers based on backtracking are themselves vulnerable to denial-of-service attacks, since their runtime can be exponential for certain input strings. This paper presents a static analysis for detecting such vulnerable regular expressions. The running time of the analysis compares favourably with tools based on fuzzing, that is, randomly generating inputs and measuring how long matching them takes. Unlike fuzzers, the analysis pinpoints the source of the vulnerability and generates possible malicious inputs for programmers to use in security testing. Moreover, the analysis has a firm theoretical foundation in abstract machines. Testing the analysis on two large repositories of regular expressions shows that the analysis is able to find significant numbers of vulnerable regular expressions in a matter of seconds.
△ Less
Submitted 4 January, 2013;
originally announced January 2013.
-
Operational semantics for signal handling
Authors:
Maxim Strygin,
Hayo Thielecke
Abstract:
Signals are a lightweight form of interprocess communication in Unix. When a process receives a signal, the control flow is interrupted and a previously installed signal handler is run. Signal handling is reminiscent both of exception handling and concurrent interleaving of processes. In this paper, we investigate different approaches to formalizing signal handling in operational semantics, and co…
▽ More
Signals are a lightweight form of interprocess communication in Unix. When a process receives a signal, the control flow is interrupted and a previously installed signal handler is run. Signal handling is reminiscent both of exception handling and concurrent interleaving of processes. In this paper, we investigate different approaches to formalizing signal handling in operational semantics, and compare them in a series of examples. We find the big-step style of operational semantics to be well suited to modelling signal handling. We integrate exception handling with our big-step semantics of signal handling, by adopting the exception convention as defined in the Definition of Standard ML. The semantics needs to capture the complex interactions between signal handling and exception handling.
△ Less
Submitted 13 August, 2012;
originally announced August 2012.
-
Regular Expression Matching and Operational Semantics
Authors:
Asiri Rathnayake,
Hayo Thielecke
Abstract:
Many programming languages and tools, ranging from grep to the Java String library, contain regular expression matchers. Rather than first translating a regular expression into a deterministic finite automaton, such implementations typically match the regular expression on the fly. Thus they can be seen as virtual machines interpreting the regular expression much as if it were a program with some…
▽ More
Many programming languages and tools, ranging from grep to the Java String library, contain regular expression matchers. Rather than first translating a regular expression into a deterministic finite automaton, such implementations typically match the regular expression on the fly. Thus they can be seen as virtual machines interpreting the regular expression much as if it were a program with some non-deterministic constructs such as the Kleene star. We formalize this implementation technique for regular expression matching using operational semantics. Specifically, we derive a series of abstract machines, moving from the abstract definition of matching to increasingly realistic machines. First a continuation is added to the operational semantics to describe what remains to be matched after the current expression. Next, we represent the expression as a data structure using pointers, which enables redundant searches to be eliminated via testing for pointer equality. From there, we arrive both at Thompson's lockstep construction and a machine that performs some operations in parallel, suitable for implementation on a large number of cores, such as a GPU. We formalize the parallel machine using process algebra and report some preliminary experiments with an implementation on a graphics processor using CUDA.
△ Less
Submitted 15 August, 2011;
originally announced August 2011.