-
Causal Learning and Explanation of Deep Neural Networks via Autoencoded Activations
Authors:
Michael Harradon,
Jeff Druce,
Brian Ruttenberg
Abstract:
Deep neural networks are complex and opaque. As they enter application in a variety of important and safety-critical domains, users seek methods to explain their output predictions. We develop an approach to explaining deep neural networks by constructing causal models on salient concepts contained in a CNN. We develop methods to extract salient concepts throughout a target network by using autoencoders trained to extract human-understandable representations of network activations. We then build a Bayesian causal model using these extracted concepts as variables in order to explain image classification. Finally, we use this causal model to identify and visualize features with significant causal influence on final classification.
Submitted 1 February, 2018;
originally announced February 2018.
-
Artificial Intelligence Based Malware Analysis
Authors:
Avi Pfeffer,
Brian Ruttenberg,
Lee Kellogg,
Michael Howard,
Catherine Call,
Alison O'Connor,
Glenn Takata,
Scott Neal Reilly,
Terry Patten,
Jason Taylor,
Robert Hall,
Arun Lakhotia,
Craig Miles,
Dan Scofield,
Jared Frank
Abstract:
Artificial intelligence methods have often been applied to perform specific functions or tasks in the cyber-defense realm. However, as adversary methods become more complex and difficult to divine, piecemeal efforts to understand cyber-attacks, and malware-based attacks in particular, are not providing sufficient means for malware analysts to understand the past, present and future characteristics of malware.
In this paper, we present the Malware Analysis and Attribution using Genetic Information (MAAGI) system. The underlying idea behind the MAAGI system is that there are strong similarities between malware behavior and biological organism behavior, and applying biologically inspired methods to corpora of malware can help analysts better understand the ecosystem of malware attacks. Due to the sophistication of the malware and the analysis, the MAAGI system relies heavily on artificial intelligence techniques to provide this capability. It has already yielded promising results over its development life, and will hopefully inspire more integration between the artificial intelligence and cyber-defense communities.
Submitted 27 April, 2017;
originally announced April 2017.
-
Structured Factored Inference: A Framework for Automated Reasoning in Probabilistic Programming Languages
Authors:
Avi Pfeffer,
Brian Ruttenberg,
William Kretschmer
Abstract:
Reasoning on large and complex real-world models is a computationally difficult task, yet one that is required for effective use of many AI applications. A plethora of inference algorithms have been developed that work well on specific models or only on parts of general models. Consequently, a system that can intelligently apply these inference algorithms to different parts of a model for fast reasoning is highly desirable. We introduce a new framework called structured factored inference (SFI) that provides the foundation for such a system. Using models encoded in a probabilistic programming language, SFI provides a sound means to decompose a model into sub-models, apply an inference algorithm to each sub-model, and combine the resulting information to answer a query. Our results show that SFI is nearly as accurate as exact inference yet retains the benefits of approximate inference methods.
Submitted 10 June, 2016;
originally announced June 2016.
-
Probabilistic Programming for Malware Analysis
Authors:
Brian Ruttenberg,
Lee Kellogg,
Avi Pfeffer
Abstract:
Constructing lineages of malware is an important cyber-defense task. Performing this task is difficult, however, due to the volume of malware data and the obfuscation techniques used by malware authors. In this work, we formulate the lineage task as a probabilistic model, and use a novel probabilistic programming solution to jointly infer the lineage and creation times of families of malware.
Submitted 28 March, 2016;
originally announced March 2016.
-
Lazy Factored Inference for Functional Probabilistic Programming
Authors:
Avi Pfeffer,
Brian Ruttenberg,
Amy Sliva,
Michael Howard,
Glenn Takata
Abstract:
Probabilistic programming provides the means to represent and reason about complex probabilistic models using programming language constructs. Even simple probabilistic programs can produce models with infinitely many variables. Factored inference algorithms are widely used for probabilistic graphical models, but cannot be applied to these programs because all the variables and factors have to be enumerated. In this paper, we present a new inference framework, lazy factored inference (LFI), that enables factored algorithms to be used for models with infinitely many variables. LFI expands the model to a bounded depth and uses the structure of the program to precisely quantify the effect of the unexpanded part of the model, producing lower and upper bounds to the probability of the query.
Submitted 11 September, 2015;
originally announced September 2015.
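The bounding idea in the abstract above can be illustrated with a toy example. The sketch below is not the LFI algorithm itself; it assumes a hypothetical recursive program that stops with probability `p_stop` at each level and otherwise recurses, and it bounds the probability that the number of recursions before stopping is even. Expanding to a fixed depth gives a lower bound from the expanded paths, and adding all of the unexpanded probability mass gives an upper bound. The function name and parameters are assumptions made for illustration.

```python
def bounded_depth_bounds(p_stop, depth):
    """Bounds on P(the number of recursions before stopping is even).

    Hypothetical toy program: at each level, stop with probability p_stop,
    otherwise recurse. Only the first `depth` levels are expanded.
    """
    lower = 0.0
    p_reach = 1.0  # probability of reaching the current level
    for level in range(depth):
        if level % 2 == 0:
            # Stopping here satisfies the query (even number of recursions).
            lower += p_reach * p_stop
        p_reach *= 1.0 - p_stop
    # The unexpanded part of the model has total mass p_reach; in the best
    # case all of it satisfies the query, in the worst case none of it does.
    upper = lower + p_reach
    return lower, upper


def exact_probability(p_stop):
    """Closed form of the same query, from summing the geometric series."""
    q = 1.0 - p_stop
    return p_stop / (1.0 - q * q)
```

Calling `bounded_depth_bounds(0.5, depth)` with increasing `depth` yields an interval that shrinks around `exact_probability(0.5)`, since the unexpanded mass decays geometrically in this toy model.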
-
Decision-Making with Complex Data Structures using Probabilistic Programming
Authors:
Brian E. Ruttenberg,
Avi Pfeffer
Abstract:
Existing decision-theoretic reasoning frameworks such as decision networks use simple data structures and processes. However, decisions are often made based on complex data structures, such as social networks and protein sequences, and rich processes involving those structures. We present a framework for representing decision problems with complex data structures using probabilistic programming, allowing probabilistic models to be created with programming language constructs such as data structures and control flow. We provide a way to use arbitrary data types with minimal effort from the user, and an approximate decision-making algorithm that is effective even when the information space is very large or infinite. Experimental results show our algorithm working on problems with very large information spaces.
Submitted 11 July, 2014;
originally announced July 2014.
-
A Latent Parameter Node-Centric Model for Spatial Networks
Authors:
Nicholas D. Larusso,
Brian E. Ruttenberg,
Ambuj Singh
Abstract:
Spatial networks, in which nodes and edges are embedded in space, play a vital role in the study of complex systems. For example, many social networks attach geo-location information to each user, allowing the study of not only topological interactions between users, but spatial interactions as well. The defining property of spatial networks is that edge distances are associated with a cost, which may subtly influence the topology of the network. However, the cost function over distance is rarely known, thus developing a model of connections in spatial networks is a difficult task.
In this paper, we introduce a novel model for capturing the interaction between spatial effects and network structure. Our approach represents a unique combination of ideas from latent variable statistical models and spatial network modeling. In contrast to previous work, we view the ability to form long/short-distance connections to be dependent on the individual nodes involved. For example, a node's specific surroundings (e.g. network structure and node density) may make it more likely to form a long distance link than other nodes with the same degree. To capture this information, we attach a latent variable to each node which represents a node's spatial reach. These variables are inferred from the network structure using a Markov Chain Monte Carlo algorithm.
We experimentally evaluate our proposed model on four types of real-world spatial networks (transportation, biological, infrastructure, and social). We apply our model to the task of link prediction and achieve up to a 35% improvement over previous approaches in terms of the area under the ROC curve. Additionally, we show that our model is particularly helpful for predicting links between nodes with low degrees. In these cases, we see much larger improvements over previous models.
Submitted 16 October, 2012;
originally announced October 2012.
-
Indexing the Earth Mover's Distance Using Normal Distributions
Authors:
Brian E. Ruttenberg,
Ambuj K. Singh
Abstract:
Querying uncertain data sets (represented as probability distributions) presents many challenges due to the large amount of data involved and the difficulty of comparing uncertainty between distributions. The Earth Mover's Distance (EMD) has increasingly been employed to compare uncertain data due to its ability to effectively capture the differences between two distributions. Computing the EMD entails finding a solution to the transportation problem, which is computationally intensive. In this paper, we propose a new lower bound to the EMD and an index structure to significantly improve the performance of EMD-based K-nearest neighbor (K-NN) queries on uncertain databases. The lower bound approximates the EMD on a projection vector. Each distribution is projected onto a vector and approximated by a normal distribution, as well as an accompanying error term. We then represent each normal as a point in a Hough transformed space and use the concept of stochastic dominance to implement an efficient index structure in the transformed space. We show that our method significantly decreases K-NN query time on uncertain databases. The index structure also scales well with database cardinality. It is well suited for heterogeneous data sets, helping to keep EMD-based queries tractable as uncertain data sets become larger and more complex.
Submitted 30 November, 2011;
originally announced November 2011.
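The projection step mentioned in the abstract above rests on a general fact: with a Euclidean ground distance, the EMD between two distributions projected onto a unit vector never exceeds the EMD between the original distributions. The sketch below illustrates only that fact, assuming equal-size point sets with uniform weights. It does not reproduce the paper's normal approximation, error term, Hough transform, or index structure, and all function names are hypothetical.

```python
import itertools

import numpy as np


def emd_1d_uniform(x, y):
    """EMD between two equal-size, uniformly weighted 1-D samples.

    In one dimension the optimal plan matches sorted values in order.
    """
    return float(np.mean(np.abs(np.sort(x) - np.sort(y))))


def projected_lower_bound(a, b, direction):
    """Lower bound on the EMD of point sets a and b (rows are points).

    Projecting onto a unit vector cannot increase any pairwise Euclidean
    distance, so the 1-D EMD of the projections bounds the true EMD below.
    """
    u = np.asarray(direction, dtype=float)
    u = u / np.linalg.norm(u)
    return emd_1d_uniform(a @ u, b @ u)


def exact_emd_uniform(a, b):
    """Exact EMD for tiny, equal-size, uniformly weighted point sets.

    Brute force over all matchings; used here only to check the bound.
    """
    best = float("inf")
    for perm in itertools.permutations(range(len(a))):
        cost = np.mean(np.linalg.norm(a - b[list(perm)], axis=1))
        best = min(best, float(cost))
    return best
```

In a K-NN query, a cheap bound of this kind can be used to discard candidates whose lower bound already exceeds the current K-th best distance, so that the expensive exact EMD is computed only for the remaining candidates.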