Analysis of Machine Learning Approaches to Packing Detection

Authors: Charles-Henry Bertrand Van Ouytsel, Thomas Given-Wilson, Jeremy Minet, Julian Roussieau, Axel Legay

Abstract: Packing is an obfuscation technique widely used by malware to hide the content and behavior of a program. Much prior research has explored how to detect whether a program is packed. This research includes a broad variety of approaches such as entropy analysis, syntactic signatures and more recently machine learning classifiers using various features. However, no robust results have indicated which algorithms perform best, or which features are most significant. This is complicated by considering how to evaluate the results since accuracy, cost, generalization capabilities, and other measures are all reasonable. This work explores eleven different machine learning approaches using 119 features to understand: which features are most significant for packing detection; which algorithms offer the best performance; and which algorithms are most economical.

Submitted 2 May, 2021; originally announced May 2021.

arXiv:1509.08562

doi 10.4204/EPTCS.194.4

Quantitative Information Flow for Scheduler-Dependent Systems

Authors: Yusuke Kawamoto, Thomas Given-Wilson

Abstract: Quantitative information flow analyses measure how much information on secrets is leaked by publicly observable outputs. One area of interest is to quantify and estimate the information leakage of composed systems. Prior work has focused on running disjoint component systems in parallel and reasoning about the leakage compositionally, but has not explored how the component systems are run in parallel or how the leakage of composed systems can be minimised. In this paper we consider the manner in which parallel systems can be combined or scheduled. This considers the effects of scheduling channels where resources may be shared, or whether the outputs may be incrementally observed. We also generalise the attacker's capability, of observing outputs of the system, to consider attackers who may be imperfect in their observations, e.g. when outputs may be confused with one another, or when assessing the time taken for an output to appear. Our main contribution is to present how scheduling and observation affect information leakage properties. In particular, scheduling can hide some leaked information from perfect observers, while some scheduling may reveal secret information that is hidden to imperfect observers. In addition we present an algorithm to construct a scheduler that minimises the min-entropy leakage and min-capacity in the presence of any observer.

Submitted 28 September, 2015; originally announced September 2015.

Comments: In Proceedings QAPL 2015, arXiv:1509.08169

ACM Class: D.4.6; H.1.1

Journal ref: EPTCS 194, 2015, pp. 48-62

arXiv:1508.04854

doi 10.4204/EPTCS.189.9

On the Expressiveness of Joining

Authors: Thomas Given-Wilson, Axel Legay

Abstract: The expressiveness of communication primitives has been explored in a common framework based on the pi-calculus by considering four features: synchronism (asynchronous vs synchronous), arity (monadic vs polyadic data), communication medium (shared dataspaces vs channel-based), and pattern-matching (binding to a name vs testing name equality vs intensionality). Here another dimension, coordination, is considered that accounts for the number of processes required for an interaction to occur. Coordination generalises binary languages such as pi-calculus to joining languages that combine inputs such as the Join Calculus and general rendezvous calculus. By means of possibility/impossibility of encodings, this paper shows coordination is unrelated to the other features. That is, joining languages are more expressive than binary languages, and no combination of the other features can encode a joining language into a binary language. Further, joining is not able to encode any of the other features unless they could be encoded otherwise.

Submitted 19 August, 2015; originally announced August 2015.

Comments: In Proceedings ICE 2015, arXiv:1508.04595. arXiv admin note: substantial text overlap with arXiv:1408.1455

ACM Class: F.1.2; D.3.3

Journal ref: EPTCS 189, 2015, pp. 99-113

arXiv:1408.1455

doi 10.4204/EPTCS.160.4

On the Expressiveness of Intensional Communication

Authors: Thomas Given-Wilson

Abstract: The expressiveness of communication primitives has been explored in a common framework based on the pi-calculus by considering four features: synchronism (asynchronous vs synchronous), arity (monadic vs polyadic data), communication medium (shared dataspaces vs channel-based), and pattern-matching (binding to a name vs testing name equality). Here pattern-matching is generalised to account for terms with internal structure such as in recent calculi like Spi calculi, Concurrent Pattern Calculus and Psi calculi. This paper explores intensionality upon terms, in particular communication primitives that can match upon both names and structures. By means of possibility/impossibility of encodings, this paper shows that intensionality alone can encode synchronism, arity, communication-medium, and pattern-matching, yet no combination of these without intensionality can encode any intensional language.

Submitted 6 August, 2014; originally announced August 2014.

Comments: In Proceedings EXPRESS/SOS 2014, arXiv:1408.1271

Journal ref: EPTCS 160, 2014, pp. 30-46

arXiv:1405.1546

doi 10.2168/LMCS-10(3:10)2014

A Concurrent Pattern Calculus

Authors: Thomas Given-Wilson, Daniele Gorla, Barry Jay

Abstract: Concurrent pattern calculus (CPC) drives interaction between processes by comparing data structures, just as sequential pattern calculus drives computation. By generalising from pattern matching to pattern unification, interaction becomes symmetrical, with information flowing in both directions. CPC provides a natural language to express trade where information exchange is pivotal to interaction. The unification allows some patterns to be more discriminating than others; hence, the behavioural theory must take this aspect into account, so that bisimulation becomes subject to compatibility of patterns. Many popular process calculi can be encoded in CPC; this allows for a gain in expressiveness, formalised through encodings.

Submitted 20 August, 2014; v1 submitted 7 May, 2014; originally announced May 2014.

Comments: Logical Methods in Computer Science (2014)

Journal ref: Logical Methods in Computer Science, Volume 10, Issue 3 (August 23, 2014) lmcs:774

arXiv:1404.0956

Expressiveness via Intensionality and Concurrency

Authors: Thomas Given-Wilson

Abstract: Computation can be considered by taking into account two dimensions: extensional versus intensional, and sequential versus concurrent. Traditionally sequential extensional computation can be captured by the lambda-calculus. However, recent work shows that there are more expressive intensional calculi such as SF-calculus. Traditionally process calculi capture computation by encoding the lambda-calculus, such as in the pi-calculus. Following this increased expressiveness via intensionality, other recent work has shown that concurrent pattern calculus is more expressive than pi-calculus. This paper formalises the relative expressiveness of all four of these calculi by placing them on a square whose edges are irreversible encodings. This square is representative of a more general result: that expressiveness increases with both intensionality and concurrency.

Submitted 21 June, 2014; v1 submitted 2 April, 2014; originally announced April 2014.

Comments: 18 pages, to appear in ICTAC 2014

arXiv:1404.0545

doi 10.4204/EPTCS.166.4

An Intensional Concurrent Faithful Encoding of Turing Machines

Authors: Thomas Given-Wilson

Abstract: The benchmark for computation is typically given as Turing computability: the ability for a computation to be performed by a Turing Machine. Many languages exploit (indirect) encodings of Turing Machines to demonstrate their ability to support arbitrary computation. However, these encodings are usually by simulating the entire Turing Machine within the language, or by encoding a language that does an encoding or simulation itself. This second category is typical for process calculi that show an encoding of lambda-calculus (often with restrictions) that in turn simulates a Turing Machine. Such approaches lead to indirect encodings of Turing Machines that are complex, unclear, and only weakly equivalent after computation. This paper presents an approach to encoding Turing Machines into intensional process calculi that is faithful, reduction preserving, and structurally equivalent. The encoding is demonstrated in a simple asymmetric concurrent pattern calculus before being generalised to simplify infinite terms, and to show encodings into Concurrent Pattern Calculus and Psi Calculi.

Submitted 28 October, 2014; v1 submitted 2 April, 2014; originally announced April 2014.

Comments: In Proceedings ICE 2014, arXiv:1410.7013

Journal ref: EPTCS 166, 2014, pp. 21-37

Showing 1–7 of 7 results for author: Given-Wilson, T