-
A Method for Fast Autonomy Transfer in Reinforcement Learning
Authors:
Dinuka Sahabandu,
Bhaskar Ramasubramanian,
Michail Alexiou,
J. Sukarno Mertoguno,
Linda Bushnell,
Radha Poovendran
Abstract:
This paper introduces a novel reinforcement learning (RL) strategy designed to facilitate rapid autonomy transfer by utilizing pre-trained critic value functions from multiple environments. Unlike traditional methods that require extensive retraining or fine-tuning, our approach integrates existing knowledge, enabling an RL agent to adapt swiftly to new settings without requiring extensive computa…
▽ More
This paper introduces a novel reinforcement learning (RL) strategy designed to facilitate rapid autonomy transfer by utilizing pre-trained critic value functions from multiple environments. Unlike traditional methods that require extensive retraining or fine-tuning, our approach integrates existing knowledge, enabling an RL agent to adapt swiftly to new settings without requiring extensive computational resources. Our contributions include development of the Multi-Critic Actor-Critic (MCAC) algorithm, establishing its convergence, and empirical evidence demonstrating its efficacy. Our experimental results show that MCAC significantly outperforms the baseline actor-critic algorithm, achieving up to 22.76x faster autonomy transfer and higher reward accumulation. This advancement underscores the potential of leveraging accumulated knowledge for efficient adaptation in RL applications.
△ Less
Submitted 29 July, 2024;
originally announced July 2024.
-
Safety-Critical Online Control with Adversarial Disturbances
Authors:
Bhaskar Ramasubramanian,
Baicen Xiao,
Linda Bushnell,
Radha Poovendran
Abstract:
This paper studies the control of safety-critical dynamical systems in the presence of adversarial disturbances. We seek to synthesize state-feedback controllers to minimize a cost incurred due to the disturbance, while respecting a safety constraint. The safety constraint is given by a bound on an H-inf norm, while the cost is specified as an upper bound on the H-2 norm of the system. We consider…
▽ More
This paper studies the control of safety-critical dynamical systems in the presence of adversarial disturbances. We seek to synthesize state-feedback controllers to minimize a cost incurred due to the disturbance, while respecting a safety constraint. The safety constraint is given by a bound on an H-inf norm, while the cost is specified as an upper bound on the H-2 norm of the system. We consider an online setting where costs at each time are revealed only after the controller at that time is chosen. We propose an iterative approach to the synthesis of the controller by solving a modified discrete-time Riccati equation. Solutions of this equation enforce the safety constraint. We compare the cost of this controller with that of the optimal controller when one has complete knowledge of disturbances and costs in hindsight. We show that the regret function, which is defined as the difference between these costs, varies logarithmically with the time horizon. We validate our approach on a process control setup that is subject to two kinds of adversarial attacks.
△ Less
Submitted 20 September, 2020;
originally announced September 2020.
-
Privacy-Preserving Resilience of Cyber-Physical Systems to Adversaries
Authors:
Bhaskar Ramasubramanian,
Luyao Niu,
Andrew Clark,
Linda Bushnell,
Radha Poovendran
Abstract:
A cyber-physical system (CPS) is expected to be resilient to more than one type of adversary. In this paper, we consider a CPS that has to satisfy a linear temporal logic (LTL) objective in the presence of two kinds of adversaries. The first adversary has the ability to tamper with inputs to the CPS to influence satisfaction of the LTL objective. The interaction of the CPS with this adversary is m…
▽ More
A cyber-physical system (CPS) is expected to be resilient to more than one type of adversary. In this paper, we consider a CPS that has to satisfy a linear temporal logic (LTL) objective in the presence of two kinds of adversaries. The first adversary has the ability to tamper with inputs to the CPS to influence satisfaction of the LTL objective. The interaction of the CPS with this adversary is modeled as a stochastic game. We synthesize a controller for the CPS to maximize the probability of satisfying the LTL objective under any policy of this adversary. The second adversary is an eavesdropper who can observe labeled trajectories of the CPS generated from the previous step. It could then use this information to launch other kinds of attacks. A labeled trajectory is a sequence of labels, where a label is associated to a state and is linked to the satisfaction of the LTL objective at that state. We use differential privacy to quantify the indistinguishability between states that are related to each other when the eavesdropper sees a labeled trajectory. Two trajectories of equal length will be differentially private if they are differentially private at each state along the respective trajectories. We use a skewed Kantorovich metric to compute distances between probability distributions over states resulting from actions chosen according to policies from related states in order to quantify differential privacy. Moreover, we do this in a manner that does not affect the satisfaction probability of the LTL objective. We validate our approach on a simulation of a UAV that has to satisfy an LTL objective in an adversarial environment.
△ Less
Submitted 26 July, 2020;
originally announced July 2020.
-
Secure Control in Partially Observable Environments to Satisfy LTL Specifications
Authors:
Bhaskar Ramasubramanian,
Luyao Niu,
Andrew Clark,
Linda Bushnell,
Radha Poovendran
Abstract:
This paper studies the synthesis of control policies for an agent that has to satisfy a temporal logic specification in a partially observable environment, in the presence of an adversary. The interaction of the agent (defender) with the adversary is modeled as a partially observable stochastic game. The goal is to generate a defender policy to maximize satisfaction of a given temporal logic speci…
▽ More
This paper studies the synthesis of control policies for an agent that has to satisfy a temporal logic specification in a partially observable environment, in the presence of an adversary. The interaction of the agent (defender) with the adversary is modeled as a partially observable stochastic game. The goal is to generate a defender policy to maximize satisfaction of a given temporal logic specification under any adversary policy. The search for policies is limited to the space of finite state controllers, which leads to a tractable approach to determine policies. We relate the satisfaction of the specification to reaching (a subset of) recurrent states of a Markov chain. We present an algorithm to determine a set of defender and adversary finite state controllers of fixed sizes that will satisfy the temporal logic specification, and prove that it is sound. We then propose a value-iteration algorithm to maximize the probability of satisfying the temporal logic specification under finite state controllers of fixed sizes. Lastly, we extend this setting to the scenario where the size of the finite state controller of the defender can be increased to improve the satisfaction probability. We illustrate our approach with an example.
△ Less
Submitted 4 November, 2020; v1 submitted 22 July, 2020;
originally announced July 2020.
-
Stochastic Dynamic Information Flow Tracking Game using Supervised Learning for Detecting Advanced Persistent Threats
Authors:
Shana Moothedath,
Dinuka Sahabandu,
Joey Allen,
Linda Bushnell,
Wenke Lee,
Radha Poovendran
Abstract:
Advanced persistent threats (APTs) are organized prolonged cyberattacks by sophisticated attackers. Although APT activities are stealthy, they interact with the system components and these interactions lead to information flows. Dynamic Information Flow Tracking (DIFT) has been proposed as one of the effective ways to detect APTs using the information flows. However, wide range security analysis u…
▽ More
Advanced persistent threats (APTs) are organized prolonged cyberattacks by sophisticated attackers. Although APT activities are stealthy, they interact with the system components and these interactions lead to information flows. Dynamic Information Flow Tracking (DIFT) has been proposed as one of the effective ways to detect APTs using the information flows. However, wide range security analysis using DIFT results in a significant increase in performance overhead and high rates of false-positives and false-negatives generated by DIFT. In this paper, we model the strategic interaction between APT and DIFT as a non-cooperative stochastic game. The game unfolds on a state space constructed from an information flow graph (IFG) that is extracted from the system log. The objective of the APT in the game is to choose transitions in the IFG to find an optimal path in the IFG from an entry point of the attack to an attack target. On the other hand, the objective of DIFT is to dynamically select nodes in the IFG to perform security analysis for detecting APT. Our game model has imperfect information as the players do not have information about the actions of the opponent. We consider two scenarios of the game (i) when the false-positive and false-negative rates are known to both players and (ii) when the false-positive and false-negative rates are unknown to both players. Case (i) translates to a game model with complete information and we propose a value iteration-based algorithm and prove the convergence. Case (ii) translates to a game with unknown transition probabilities. In this case, we propose Hierarchical Supervised Learning (HSL) algorithm that integrates a neural network, to predict the value vector of the game, with a policy iteration algorithm to compute an approximate equilibrium. We implemented our algorithms on real attack datasets and validated the performance of our approach.
△ Less
Submitted 25 June, 2021; v1 submitted 23 July, 2020;
originally announced July 2020.
-
A Reinforcement Learning Approach for Dynamic Information Flow Tracking Games for Detecting Advanced Persistent Threats
Authors:
Dinuka Sahabandu,
Shana Moothedath,
Joey Allen,
Linda Bushnell,
Wenke Lee,
Radha Poovendran
Abstract:
Advanced Persistent Threats (APTs) are stealthy attacks that threaten the security and privacy of sensitive information. Interactions of APTs with victim system introduce information flows that are recorded in the system logs. Dynamic Information Flow Tracking (DIFT) is a promising detection mechanism for detecting APTs. DIFT taints information flows originating at system entities that are suscept…
▽ More
Advanced Persistent Threats (APTs) are stealthy attacks that threaten the security and privacy of sensitive information. Interactions of APTs with victim system introduce information flows that are recorded in the system logs. Dynamic Information Flow Tracking (DIFT) is a promising detection mechanism for detecting APTs. DIFT taints information flows originating at system entities that are susceptible to an attack, tracks the propagation of the tainted flows, and authenticates the tainted flows at certain system components according to a pre-defined security policy. Deployment of DIFT to defend against APTs in cyber systems is limited by the heavy resource and performance overhead associated with DIFT. In this paper, we propose a resource-efficient model for DIFT by incorporating the security costs, false-positives, and false-negatives associated with DIFT. Specifically, we develop a game-theoretic framework and provide an analytical model of DIFT that enables the study of trade-off between resource efficiency and the effectiveness of detection. Our game model is a nonzero-sum, infinite-horizon, average reward stochastic game. Our model incorporates the information asymmetry between players that arises from DIFT's inability to distinguish malicious flows from benign flows and APT's inability to know the locations where DIFT performs a security analysis. Additionally, the game has incomplete information as the transition probabilities (false-positive and false-negative rates) are unknown. We propose a multiple-time scale stochastic approximation algorithm to learn an equilibrium solution of the game. We prove that our algorithm converges to an average reward Nash equilibrium. We evaluate our proposed model and algorithm on a real-world ransomware dataset and validate the effectiveness of the proposed approach.
△ Less
Submitted 28 June, 2021; v1 submitted 30 June, 2020;
originally announced July 2020.
-
Dynamic Information Flow Tracking for Detection of Advanced Persistent Threats: A Stochastic Game Approach
Authors:
Shana Moothedath,
Dinuka Sahabandu,
Joey Allen,
Andrew Clark,
Linda Bushnell,
Wenke Lee,
Radha Poovendran
Abstract:
Advanced Persistent Threats (APTs) are stealthy customized attacks by intelligent adversaries. This paper deals with the detection of APTs that infiltrate cyber systems and compromise specifically targeted data and/or infrastructures. Dynamic information flow tracking is an information trace-based detection mechanism against APTs that taints suspicious information flows in the system and generates…
▽ More
Advanced Persistent Threats (APTs) are stealthy customized attacks by intelligent adversaries. This paper deals with the detection of APTs that infiltrate cyber systems and compromise specifically targeted data and/or infrastructures. Dynamic information flow tracking is an information trace-based detection mechanism against APTs that taints suspicious information flows in the system and generates security analysis for unauthorized use of tainted data. In this paper, we develop an analytical model for resource-efficient detection of APTs using an information flow tracking game. The game is a nonzero-sum, turn-based, stochastic game with asymmetric information as the defender cannot distinguish whether an incoming flow is malicious or benign and hence has only partial state observation. We analyze equilibrium of the game and prove that a Nash equilibrium is given by a solution to the minimum capacity cut set problem on a flow-network derived from the system, where the edge capacities are obtained from the cost of performing security analysis. Finally, we implement our algorithm on the real-world dataset for a data exfiltration attack augmented with false-negative and false-positive rates and compute an optimal defender strategy.
△ Less
Submitted 25 June, 2021; v1 submitted 22 June, 2020;
originally announced June 2020.
-
FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback
Authors:
Baicen Xiao,
Qifan Lu,
Bhaskar Ramasubramanian,
Andrew Clark,
Linda Bushnell,
Radha Poovendran
Abstract:
Reinforcement learning has been successful in training autonomous agents to accomplish goals in complex environments. Although this has been adapted to multiple settings, including robotics and computer games, human players often find it easier to obtain higher rewards in some environments than reinforcement learning algorithms. This is especially true of high-dimensional state spaces where the re…
▽ More
Reinforcement learning has been successful in training autonomous agents to accomplish goals in complex environments. Although this has been adapted to multiple settings, including robotics and computer games, human players often find it easier to obtain higher rewards in some environments than reinforcement learning algorithms. This is especially true of high-dimensional state spaces where the reward obtained by the agent is sparse or extremely delayed. In this paper, we seek to effectively integrate feedback signals supplied by a human operator with deep reinforcement learning algorithms in high-dimensional state spaces. We call this FRESH (Feedback-based REward SHaping). During training, a human operator is presented with trajectories from a replay buffer and then provides feedback on states and actions in the trajectory. In order to generalize feedback signals provided by the human operator to previously unseen states and actions at test-time, we use a feedback neural network. We use an ensemble of neural networks with a shared network architecture to represent model uncertainty and the confidence of the neural network in its output. The output of the feedback neural network is converted to a shaping reward that is augmented to the reward provided by the environment. We evaluate our approach on the Bowling and Skiing Atari games in the arcade learning environment. Although human experts have been able to achieve high scores in these environments, state-of-the-art deep learning algorithms perform poorly. We observe that FRESH is able to achieve much higher scores than state-of-the-art deep learning algorithms in both environments. FRESH also achieves a 21.4% higher score than a human expert in Bowling and does as well as a human expert in Skiing.
△ Less
Submitted 19 January, 2020;
originally announced January 2020.
-
Covert Channel-Based Transmitter Authentication in Controller Area Networks
Authors:
Xuhang Ying,
Giuseppe Bernieri,
Mauro Conti,
Linda Bushnell,
Radha Poovendran
Abstract:
In recent years, the security of automotive Cyber-Physical Systems (CPSs) is facing urgent threats due to the widespread use of legacy in-vehicle communication systems. As a representative legacy bus system, the Controller Area Network (CAN) hosts Electronic Control Units (ECUs) that are crucial vehicle functioning. In this scenario, malicious actors can exploit CAN vulnerabilities, such as the la…
▽ More
In recent years, the security of automotive Cyber-Physical Systems (CPSs) is facing urgent threats due to the widespread use of legacy in-vehicle communication systems. As a representative legacy bus system, the Controller Area Network (CAN) hosts Electronic Control Units (ECUs) that are crucial vehicle functioning. In this scenario, malicious actors can exploit CAN vulnerabilities, such as the lack of built-in authentication and encryption schemes, to launch CAN bus attacks with life-threatening consequences (e.g., disabling brakes). In this paper, we present TACAN (Transmitter Authentication in CAN), which provides secure authentication of ECUs on the legacy CAN bus by exploiting the covert channels, without introducing CAN protocol modifications or traffic overheads. TACAN turns upside-down the originally malicious concept of covert channels and exploits it to build an effective defensive technique that facilitates transmitter authentication via a centralized, trusted Monitor Node. TACAN consists of three different covert channels for ECU authentication: 1) the Inter-Arrival Time (IAT)-based; 2) the Least Significant Bit (LSB)-based; and 3) a hybrid covert channel, exploiting the combination of the first two. In order to validate TACAN, we implement the covert channels on the University of Washington (UW) EcoCAR (Chevrolet Camaro 2016) testbed. We further evaluate the bit error, throughput, and detection performance of TACAN through extensive experiments using the EcoCAR testbed and a publicly available dataset collected from Toyota Camry 2010. We demonstrate the feasibility of TACAN and the effectiveness of detecting CAN bus attacks, highlighting no traffic overheads and attesting the regular functionality of ECUs.
△ Less
Submitted 7 December, 2019;
originally announced December 2019.
-
Linear Temporal Logic Satisfaction in Adversarial Environments using Secure Control Barrier Certificates
Authors:
Bhaskar Ramasubramanian,
Luyao Niu,
Andrew Clark,
Linda Bushnell,
Radha Poovendran
Abstract:
This paper studies the satisfaction of a class of temporal properties for cyber-physical systems (CPSs) over a finite-time horizon in the presence of an adversary, in an environment described by discrete-time dynamics. The temporal logic specification is given in safe-LTL_F, a fragment of linear temporal logic over traces of finite length. The interaction of the CPS with the adversary is modeled a…
▽ More
This paper studies the satisfaction of a class of temporal properties for cyber-physical systems (CPSs) over a finite-time horizon in the presence of an adversary, in an environment described by discrete-time dynamics. The temporal logic specification is given in safe-LTL_F, a fragment of linear temporal logic over traces of finite length. The interaction of the CPS with the adversary is modeled as a two-player zero-sum discrete-time dynamic stochastic game with the CPS as defender. We formulate a dynamic programming based approach to determine a stationary defender policy that maximized the probability of satisfaction of a safe-LTL_F formula over a finite time-horizon under any stationary adversary policy. We introduce secure control barrier certificates (S-CBCs), a generalization of barrier certificates and control barrier certificates that accounts for the presence of an adversary, and use S-CBCs to provide a lower bound on the above satisfaction probability. When the dynamics of the evolution of the system state has a specific underlying structure, we present a way to determine an S-CBC as a polynomial in the state variables using sum-of-squares optimization. An illustrative example demonstrates our approach.
△ Less
Submitted 27 October, 2019;
originally announced October 2019.
-
Mitigating Vulnerabilities of Voltage-based Intrusion Detection Systems in Controller Area Networks
Authors:
Sang Uk Sagong,
Radha Poovendran,
Linda Bushnell
Abstract:
Data for controlling a vehicle is exchanged among Electronic Control Units (ECUs) via in-vehicle network protocols such as the Controller Area Network (CAN) protocol. Since these protocols are designed for an isolated network, the protocols do not encrypt data nor authenticate messages. Intrusion Detection Systems (IDSs) are developed to secure the CAN protocol by detecting abnormal deviations in…
▽ More
Data for controlling a vehicle is exchanged among Electronic Control Units (ECUs) via in-vehicle network protocols such as the Controller Area Network (CAN) protocol. Since these protocols are designed for an isolated network, the protocols do not encrypt data nor authenticate messages. Intrusion Detection Systems (IDSs) are developed to secure the CAN protocol by detecting abnormal deviations in physical properties. For instance, a voltage-based IDS (VIDS) exploits voltage characteristics of each ECU to detect an intrusion. An ECU with VIDS must be connected to the CAN bus using extra wires to measure voltages of the CAN bus lines. These extra wires, however, may introduce new attack surfaces to the CAN bus if the ECU with VIDS is compromised. We investigate new vulnerabilities of VIDS and demonstrate that an adversary may damage an ECU with VIDS, block message transmission, and force an ECU to retransmit messages. In order to defend the CAN bus against these attacks, we propose two hardware-based Intrusion Response Systems (IRSs) that disconnect the compromised ECU from the CAN bus once these attacks are detected. We develop four voltage-based attacks by exploiting vulnerabilities of VIDS and evaluate the effectiveness of the proposed IRSs using a CAN bus testbed.
△ Less
Submitted 24 July, 2019;
originally announced July 2019.
-
Potential-Based Advice for Stochastic Policy Learning
Authors:
Baicen Xiao,
Bhaskar Ramasubramanian,
Andrew Clark,
Hannaneh Hajishirzi,
Linda Bushnell,
Radha Poovendran
Abstract:
This paper augments the reward received by a reinforcement learning agent with potential functions in order to help the agent learn (possibly stochastic) optimal policies. We show that a potential-based reward shaping scheme is able to preserve optimality of stochastic policies, and demonstrate that the ability of an agent to learn an optimal policy is not affected when this scheme is augmented to…
▽ More
This paper augments the reward received by a reinforcement learning agent with potential functions in order to help the agent learn (possibly stochastic) optimal policies. We show that a potential-based reward shaping scheme is able to preserve optimality of stochastic policies, and demonstrate that the ability of an agent to learn an optimal policy is not affected when this scheme is augmented to soft Q-learning. We propose a method to impart potential based advice schemes to policy gradient algorithms. An algorithm that considers an advantage actor-critic architecture augmented with this scheme is proposed, and we give guarantees on its convergence. Finally, we evaluate our approach on a puddle-jump grid world with indistinguishable states, and the continuous state and action mountain car environment from classical control. Our results indicate that these schemes allow the agent to learn a stochastic optimal policy faster and obtain a higher average reward.
△ Less
Submitted 20 July, 2019;
originally announced July 2019.
-
Detecting ADS-B Spoofing Attacks using Deep Neural Networks
Authors:
Xuhang Ying,
Joanna Mazer,
Giuseppe Bernieri,
Mauro Conti,
Linda Bushnell,
Radha Poovendran
Abstract:
The Automatic Dependent Surveillance-Broadcast (ADS-B) system is a key component of the Next Generation Air Transportation System (NextGen) that manages the increasingly congested airspace. It provides accurate aircraft localization and efficient air traffic management and also improves the safety of billions of current and future passengers. While the benefits of ADS-B are well known, the lack of…
▽ More
The Automatic Dependent Surveillance-Broadcast (ADS-B) system is a key component of the Next Generation Air Transportation System (NextGen) that manages the increasingly congested airspace. It provides accurate aircraft localization and efficient air traffic management and also improves the safety of billions of current and future passengers. While the benefits of ADS-B are well known, the lack of basic security measures like encryption and authentication introduces various exploitable security vulnerabilities. One practical threat is the ADS-B spoofing attack that targets the ADS-B ground station, in which the ground-based or aircraft-based attacker manipulates the International Civil Aviation Organization (ICAO) address (a unique identifier for each aircraft) in the ADS-B messages to fake the appearance of non-existent aircraft or masquerade as a trusted aircraft. As a result, this attack can confuse the pilots or the air traffic control personnel and cause dangerous maneuvers. In this paper, we introduce SODA - a two-stage Deep Neural Network (DNN)-based spoofing detector for ADS-B that consists of a message classifier and an aircraft classifier. It allows a ground station to examine each incoming message based on the PHY-layer features (e.g., IQ samples and phases) and flag suspicious messages. Our experimental results show that SODA detects ground-based spoofing attacks with a probability of 99.34%, while having a very small false alarm rate (i.e., 0.43%). It outperforms other machine learning techniques such as XGBoost, Logistic Regression, and Support Vector Machine. It further identifies individual aircraft with an average F-score of 96.68% and an accuracy of 96.66%, with a significant improvement over the state-of-the-art detector.
△ Less
Submitted 22 April, 2019;
originally announced April 2019.
-
Secure Control under Partial Observability with Temporal Logic Constraints
Authors:
Bhaskar Ramasubramanian,
Andrew Clark,
Linda Bushnell,
Radha Poovendran
Abstract:
This paper studies the synthesis of control policies for an agent that has to satisfy a temporal logic specification in a partially observable environment, in the presence of an adversary. The interaction of the agent (defender) with the adversary is modeled as a partially observable stochastic game. The search for policies is limited to over the space of finite state controllers, which leads to a…
▽ More
This paper studies the synthesis of control policies for an agent that has to satisfy a temporal logic specification in a partially observable environment, in the presence of an adversary. The interaction of the agent (defender) with the adversary is modeled as a partially observable stochastic game. The search for policies is limited to over the space of finite state controllers, which leads to a tractable approach to determine policies. The goal is to generate a defender policy to maximize satisfaction of a given temporal logic specification under any adversary policy. We relate the satisfaction of the specification in terms of reaching (a subset of) recurrent states of a Markov chain. We then present a procedure to determine a set of defender and adversary finite state controllers of given sizes that will satisfy the temporal logic specification. We illustrate our approach with an example.
△ Less
Submitted 15 March, 2019;
originally announced March 2019.
-
A Game Theoretic Approach for Dynamic Information Flow Tracking to Detect Multi-Stage Advanced Persistent Threats
Authors:
Shana Moothedath,
Dinuka Sahabandu,
Joey Allen,
Andrew Clark,
Linda Bushnell,
Wenke Lee,
Radha Poovendran
Abstract:
Advanced Persistent Threats (APTs) infiltrate cyber systems and compromise specifically targeted data and/or resources through a sequence of stealthy attacks consisting of multiple stages. Dynamic information flow tracking has been proposed to detect APTs. In this paper, we develop a dynamic information flow tracking game for resource-efficient detection of APTs via multi-stage dynamic games. The…
▽ More
Advanced Persistent Threats (APTs) infiltrate cyber systems and compromise specifically targeted data and/or resources through a sequence of stealthy attacks consisting of multiple stages. Dynamic information flow tracking has been proposed to detect APTs. In this paper, we develop a dynamic information flow tracking game for resource-efficient detection of APTs via multi-stage dynamic games. The game evolves on an information flow graph, whose nodes are processes and objects (e.g. file, network endpoints) in the system and the edges capture the interaction between different processes and objects. Each stage of the game has pre-specified targets which are characterized by a set of nodes of the graph and the goal of the APT is to evade detection and reach a target node of that stage. The goal of the defender is to maximize the detection probability while minimizing performance overhead on the system. The resource costs of the players are different and the information structure is asymmetric resulting in a nonzero-sum imperfect information game. We first calculate the best responses of the players and characterize the set of Nash equilibria for single stage attacks. Subsequently, we provide a polynomial-time algorithm to compute a correlated equilibrium for the multi-stage attack case. Finally, we experiment our model and algorithms on real-world nation state attack data obtained from Refinable Attack Investigation system.
△ Less
Submitted 13 November, 2018;
originally announced November 2018.
-
Shape of the Cloak: Formal Analysis of Clock Skew-Based Intrusion Detection System in Controller Area Networks
Authors:
Xuhang Ying,
Sang Uk Sagong,
Andrew Clark,
Linda Bushnell,
Radha Poovendran
Abstract:
This paper presents a new masquerade attack called the cloaking attack and provides formal analyses for clock skew-based Intrusion Detection Systems (IDSs) that detect masquerade attacks in the Controller Area Network (CAN) in automobiles. In the cloaking attack, the adversary manipulates the message inter-transmission times of spoofed messages by adding delays so as to emulate a desired clock ske…
▽ More
This paper presents a new masquerade attack called the cloaking attack and provides formal analyses for clock skew-based Intrusion Detection Systems (IDSs) that detect masquerade attacks in the Controller Area Network (CAN) in automobiles. In the cloaking attack, the adversary manipulates the message inter-transmission times of spoofed messages by adding delays so as to emulate a desired clock skew and avoid detection. In order to predict and characterize the impact of the cloaking attack in terms of the attack success probability on a given CAN bus and IDS, we develop formal models for two clock skew-based IDSs, i.e., the state-of-the-art (SOTA) IDS and its adaptation to the widely used Network Time Protocol (NTP), using parameters of the attacker, the detector, and the hardware platform. To the best of our knowledge, this is the first paper that provides formal analyses of clock skew-based IDSs in automotive CAN. We implement the cloaking attack on two hardware testbeds, a prototype and a real vehicle (the University of Washington (UW) EcoCAR), and demonstrate its effectiveness against both the SOTA and NTP-based IDSs. We validate our formal analyses through extensive experiments for different messages, IDS settings, and vehicles. By comparing each predicted attack success probability curve against its experimental curve, we find that the average prediction error is within 3.0% for the SOTA IDS and 5.7% for the NTP-based IDS.
△ Less
Submitted 23 January, 2019; v1 submitted 25 July, 2018;
originally announced July 2018.
-
Cloaking the Clock: Emulating Clock Skew in Controller Area Networks
Authors:
Sang Uk Sagong,
Xuhang Ying,
Andrew Clark,
Linda Bushnell,
Radha Poovendran
Abstract:
Automobiles are equipped with Electronic Control Units (ECU) that communicate via in-vehicle network protocol standards such as Controller Area Network (CAN). These protocols are designed under the assumption that separating in-vehicle communications from external networks is sufficient for protection against cyber attacks. This assumption, however, has been shown to be invalid by recent attacks i…
▽ More
Automobiles are equipped with Electronic Control Units (ECU) that communicate via in-vehicle network protocol standards such as Controller Area Network (CAN). These protocols are designed under the assumption that separating in-vehicle communications from external networks is sufficient for protection against cyber attacks. This assumption, however, has been shown to be invalid by recent attacks in which adversaries were able to infiltrate the in-vehicle network. Motivated by these attacks, intrusion detection systems (IDSs) have been proposed for in-vehicle networks that attempt to detect attacks by making use of device fingerprinting using properties such as clock skew of an ECU. In this paper, we propose the cloaking attack, an intelligent masquerade attack in which an adversary modifies the timing of transmitted messages in order to match the clock skew of a targeted ECU. The attack leverages the fact that, while the clock skew is a physical property of each ECU that cannot be changed by the adversary, the estimation of the clock skew by other ECUs is based on network traffic, which, being a cyber component only, can be modified by an adversary. We implement the proposed cloaking attack and test it on two IDSs, namely, the current state-of-the-art IDS and a new IDS that we develop based on the widely-used Network Time Protocol (NTP). We implement the cloaking attack on two hardware testbeds, a prototype and a real connected vehicle, and show that it can always deceive both IDSs. We also introduce a new metric called the Maximum Slackness Index to quantify the effectiveness of the cloaking attack even when the adversary is unable to precisely match the clock skew of the targeted ECU.
△ Less
Submitted 21 March, 2018; v1 submitted 7 October, 2017;
originally announced October 2017.
-
Adaptive Mitigation of Multi-Virus Propagation: A Passivity-Based Approach
Authors:
Phillip Lee,
Andrew Clark,
Basel Alomair,
Linda Bushnell,
Radha Poovendran
Abstract:
Malware propagation poses a growing threat to networked systems such as computer networks and cyber-physical systems. Current approaches to defending against malware propagation are based on patching or filtering susceptible nodes at a fixed rate. When the propagation dynamics are unknown or uncertain, however, the static rate that is chosen may be either insufficient to remove all viruses or too…
▽ More
Malware propagation poses a growing threat to networked systems such as computer networks and cyber-physical systems. Current approaches to defending against malware propagation are based on patching or filtering susceptible nodes at a fixed rate. When the propagation dynamics are unknown or uncertain, however, the static rate that is chosen may be either insufficient to remove all viruses or too high, incurring additional performance cost. In this paper, we formulate adaptive strategies for mitigating multiple malware epidemics when the propagation rate is unknown, using patching and filtering-based defense mechanisms. In order to identify conditions for ensuring that all viruses are asymptotically removed, we show that the malware propagation, patching, and filtering processes can be modeled as coupled passive dynamical systems. We prove that the patching rate required to remove all viruses is bounded above by the passivity index of the coupled system, and formulate the problem of selecting the minimum-cost mitigation strategy. Our results are evaluated through numerical study.
△ Less
Submitted 20 September, 2016; v1 submitted 14 March, 2016;
originally announced March 2016.
-
A Passivity Framework for Modeling and Mitigating Wormhole Attacks on Networked Control Systems
Authors:
Phillip Lee,
Andrew Clark,
Linda Bushnell,
Radha Poovendran
Abstract:
Networked control systems consist of distributed sensors and actuators that communicate via a wireless network. The use of an open wireless medium and unattended deployment leaves these systems vulnerable to intelligent adversaries whose goal is to disrupt the system performance. In this paper, we study the wormhole attack on a networked control system, in which an adversary establishes a link bet…
▽ More
Networked control systems consist of distributed sensors and actuators that communicate via a wireless network. The use of an open wireless medium and unattended deployment leaves these systems vulnerable to intelligent adversaries whose goal is to disrupt the system performance. In this paper, we study the wormhole attack on a networked control system, in which an adversary establishes a link between two distant regions of the network by using either high-gain antennas, as in the out-of-band wormhole, or colluding network nodes as in the in-band wormhole. Wormholes allow the adversary to violate the timing constraints of real-time control systems by delaying or dropping packets, and cannot be detected using cryptographic mechanisms alone. We study the impact of the wormhole attack on the network flows and delays and introduce a passivity-based control-theoretic framework for modeling the wormhole attack. We develop this framework for both the in-band and out-of-band wormhole attacks as well as complex, hereto-unreported wormhole attacks consisting of arbitrary combinations of in-and out-of band wormholes. We integrate existing mitigation strategies into our framework, and analyze the throughput, delay, and stability properties of the overall system. Through simulation study, we show that, by selectively dropping control packets, the wormhole attack can cause disturbances in the physical plant of a networked control system, and demonstrate that appropriate selection of detection parameters mitigates the disturbances due to the wormhole while satisfying the delay constraints of the physical system.
△ Less
Submitted 4 December, 2013;
originally announced December 2013.