-
A Sea of Cyber Threats: Maritime Cybersecurity from the Perspective of Mariners
Authors:
Anna Raymaker,
Akshaya Kumar,
Miuyin Yong Wong,
Ryan Pickren,
Animesh Chhotaray,
Frank Li,
Saman Zonouz,
Raheem Beyah
Abstract:
Maritime systems, including ships and ports, are critical components of global infrastructure, essential for transporting over 80% of the world's goods and supporting internet connectivity. However, these systems face growing cybersecurity threats, as shown by recent attacks disrupting Maersk, one of the world's largest shipping companies, causing widespread impacts on international trade. The unique challenges of the maritime environment--such as diverse operational conditions, extensive physical access points, fragmented regulatory frameworks, and its deeply interconnected structure--require maritime-specific cybersecurity research. Despite the sector's importance, maritime cybersecurity remains underexplored, leaving significant gaps in understanding its challenges and risks.
To address these gaps, we investigate how maritime system operators perceive and navigate cybersecurity challenges within this complex landscape. We conducted a user study comprising surveys and semi-structured interviews with 21 officer-level mariners. Participants reported direct experiences with shipboard cyber-attacks, including GPS spoofing and logistics-disrupting ransomware, demonstrating the real-world impact of these threats. Our findings reveal systemic and human-centric issues, such as training poorly aligned with maritime needs, insufficient detection and response tools, and serious gaps in mariners' cybersecurity understanding. Our contributions include a categorization of threats identified by mariners and recommendations for improving maritime security, including better training, response protocols, and regulation. These insights aim to guide future research and policy to strengthen the resilience of maritime systems.
Submitted 18 June, 2025;
originally announced June 2025.
-
Detecting Functional Bugs in Smart Contracts through LLM-Powered and Bug-Oriented Composite Analysis
Authors:
Binbin Zhao,
Xingshuang Lin,
Yuan Tian,
Saman Zonouz,
Na Ruan,
Jiliang Li,
Raheem Beyah,
Shouling Ji
Abstract:
Smart contracts are fundamental pillars of the blockchain, playing a crucial role in facilitating various business transactions. However, these smart contracts are vulnerable to exploitable bugs that can lead to substantial monetary losses. A recent study reveals that over 80% of these exploitable bugs, which are primarily functional bugs, can evade the detection of current tools. The primary issue is the significant gap between understanding the high-level logic of the business model and checking the low-level implementations in smart contracts. Furthermore, identifying deeply rooted functional bugs in smart contracts requires the automated generation of effective detection oracles based on various bug features. To address these challenges, we design and implement PROMFUZZ, an automated and scalable system to detect functional bugs in smart contracts. In PROMFUZZ, we first propose a novel Large Language Model (LLM)-driven analysis framework, which leverages a dual-agent prompt engineering strategy to pinpoint potentially vulnerable functions for further scrutiny. We then implement a dual-stage coupling approach, which focuses on generating invariant checkers that leverage logic information extracted from potentially vulnerable functions. Finally, we design a bug-oriented fuzzing engine, which maps the logical information from the high-level business model to the low-level smart contract implementations and performs bug-oriented fuzzing on targeted functions. We compare PROMFUZZ with multiple state-of-the-art methods. The results show that PROMFUZZ achieves 86.96% recall and 93.02% F1-score in detecting functional bugs, marking at least a 50% improvement in both metrics over state-of-the-art methods. Moreover, we perform an in-depth analysis on real-world DeFi projects and detect 30 zero-day bugs. To date, 24 zero-day bugs have been assigned CVE IDs.
Submitted 31 March, 2025;
originally announced March 2025.
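The PROMFUZZ abstract above reports recall and F1-score but not precision. As a rough consistency check, and assuming the standard definition F1 = 2PR / (P + R), the implied precision can be recovered from the two quoted figures. The function name is illustrative only, and because the inputs are rounded the result is approximate.

```python
def implied_precision(recall: float, f1: float) -> float:
    """Solve F1 = 2*P*R / (P + R) for P, given recall R and F1."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("no valid precision exists for this recall/F1 pair")
    return f1 * recall / denominator


# Figures quoted in the abstract (rounded to two decimal places).
precision = implied_precision(recall=0.8696, f1=0.9302)
print(f"implied precision: {precision:.4f}")
```

With the quoted values the result comes out very close to 1, which would be consistent with few or no false positives on the evaluation set, although the abstract does not state precision directly.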
-
NeuroSep-CP-LCB: A Deep Learning-based Contextual Multi-armed Bandit Algorithm with Uncertainty Quantification for Early Sepsis Prediction
Authors:
Anni Zhou,
Raheem Beyah,
Rishikesan Kamaleswaran
Abstract:
In critical care settings, timely and accurate predictions can significantly impact patient outcomes, especially for conditions like sepsis, where early intervention is crucial. We aim to model patient-specific reward functions in a contextual multi-armed bandit setting. The goal is to leverage patient-specific clinical features to optimize decision-making under uncertainty. This paper proposes NeuroSep-CP-LCB, a novel integration of neural networks with contextual bandits and conformal prediction tailored for early sepsis detection. Unlike the algorithm pool selection problem in the previous paper, where the primary focus was identifying the most suitable pre-trained model for prediction tasks, this work directly models the reward function using a neural network, allowing for personalized and adaptive decision-making. Combining the representational power of neural networks with the robustness of conformal prediction intervals, this framework explicitly accounts for uncertainty in offline data distributions and provides actionable confidence bounds on predictions.
Submitted 20 March, 2025;
originally announced March 2025.
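The NeuroSep-CP-LCB abstract above combines a learned reward model with conformal prediction to obtain confidence bounds. The sketch below is not the paper's algorithm; it only illustrates the general idea under several assumptions: that "LCB" refers to a lower confidence bound (the abstract does not expand the acronym), that split conformal prediction on absolute residuals is used, and that the predicted rewards come from some already trained model. All names and numbers are hypothetical.

```python
import math


def conformal_radius(residuals: list[float], alpha: float) -> float:
    """Split-conformal quantile of absolute calibration residuals.

    Returns the k-th smallest residual with k = ceil((n + 1) * (1 - alpha)),
    or infinity when the calibration set is too small for that coverage.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(residuals)[k - 1]


def pick_arm_by_lcb(predicted, calibration_residuals, alpha):
    """Return (best_arm, lcb_by_arm), choosing the arm with the highest lower bound."""
    lcb = {
        arm: predicted[arm] - conformal_radius(calibration_residuals[arm], alpha)
        for arm in predicted
    }
    return max(lcb, key=lcb.get), lcb


# Hypothetical numbers: arm_a has the higher point estimate but noisier residuals.
predicted = {"arm_a": 0.80, "arm_b": 0.70}
calibration_residuals = {
    "arm_a": [0.05, 0.10, 0.10, 0.15, 0.20, 0.30, 0.40],
    "arm_b": [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07],
}
best_arm, lcb = pick_arm_by_lcb(predicted, calibration_residuals, alpha=0.25)
print(best_arm, lcb)
```

In this toy setup the pessimistic rule prefers `arm_b`: its point estimate is lower, but its conformal interval is much tighter, so its lower bound is higher. That is the kind of uncertainty-aware choice a lower-confidence-bound criterion is meant to produce when only offline data is available.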
-
FirmRCA: Towards Post-Fuzzing Analysis on ARM Embedded Firmware with Efficient Event-based Fault Localization
Authors:
Boyu Chang,
Binbin Zhao,
Qiao Zhang,
Peiyu Liu,
Yuan Tian,
Raheem Beyah,
Shouling Ji
Abstract:
While fuzzing has demonstrated its effectiveness in exposing vulnerabilities within embedded firmware, the discovery of crashing test cases is only the first step in improving the security of these critical systems. The subsequent fault localization process, which aims to precisely identify the root causes of observed crashes, is a crucial yet time-consuming post-fuzzing work. Unfortunately, the automated root cause analysis on embedded firmware crashes remains an underexplored area, which is challenging from several perspectives: (1) the fuzzing campaign towards the embedded firmware lacks adequate debugging mechanisms, making it hard to automatically extract essential runtime information for analysis; (2) the inherent raw binary nature of embedded firmware often leads to over-tainted and noisy suspicious instructions, which provides limited guidance for analysts in manually investigating the root cause and remediating the underlying vulnerability. To address these challenges, we design and implement FirmRCA, a practical fault localization framework tailored specifically for embedded firmware. FirmRCA introduces an event-based footprint collection approach to aid and significantly expedite reverse execution. Next, to solve the complicated memory alias problem, FirmRCA proposes a history-driven method by tracking data propagation through the execution trace, enabling precise identification of deep crash origins. Finally, FirmRCA proposes a novel strategy to highlight key instructions related to the root cause, providing practical guidance in the final investigation. We evaluate FirmRCA with both synthetic and real-world targets, including 41 crashing test cases across 17 firmware images. The results show that FirmRCA can effectively (92.7% success rate) identify the root cause of crashing test cases within the top 10 instructions.
Submitted 24 October, 2024;
originally announced October 2024.
-
SyzTrust: State-aware Fuzzing on Trusted OS Designed for IoT Devices
Authors:
Qinying Wang,
Boyu Chang,
Shouling Ji,
Yuan Tian,
Xuhong Zhang,
Binbin Zhao,
Gaoning Pan,
Chenyang Lyu,
Mathias Payer,
Wenhai Wang,
Raheem Beyah
Abstract:
Trusted Execution Environments (TEEs) embedded in IoT devices provide a deployable solution to secure IoT applications at the hardware level. By design, in TEEs, the Trusted Operating System (Trusted OS) is the primary component. It enables the TEE to use security-based design techniques, such as data encryption and identity authentication. Once a Trusted OS has been exploited, the TEE can no longer ensure security. However, Trusted OSes for IoT devices have received little security analysis, which is challenging from several perspectives: (1) Trusted OSes are closed-source and have an unfavorable environment for sending test cases and collecting feedback. (2) Trusted OSes have complex data structures and require a stateful workflow, which limits existing vulnerability detection tools. To address the challenges, we present SyzTrust, the first state-aware fuzzing framework for vetting the security of resource-limited Trusted OSes. SyzTrust adopts a hardware-assisted framework to enable fuzzing Trusted OSes directly on IoT devices as well as tracking state and code coverage non-invasively. SyzTrust utilizes composite feedback to guide the fuzzer to effectively explore more states as well as to increase the code coverage. We evaluate SyzTrust on Trusted OSes from three major vendors: Samsung, Tsinglink Cloud, and Ali Cloud. These systems run on Cortex M23/33 MCUs, which provide the necessary abstraction for embedded TEEs. We discovered 70 previously unknown vulnerabilities in their Trusted OSes, receiving 10 new CVEs so far. Furthermore, compared to the baseline, SyzTrust has demonstrated significant improvements, including 66% higher code coverage, 651% higher state coverage, and 31% improved vulnerability-finding capability. We report all discovered new vulnerabilities to vendors and open source SyzTrust.
Submitted 26 September, 2023;
originally announced September 2023.
-
UVSCAN: Detecting Third-Party Component Usage Violations in IoT Firmware
Authors:
Binbin Zhao,
Shouling Ji,
Xuhong Zhang,
Yuan Tian,
Qinying Wang,
Yuwen Pu,
Chenyang Lyu,
Raheem Beyah
Abstract:
Nowadays, IoT devices integrate a wealth of third-party components (TPCs) in firmware to shorten the development cycle. TPCs usually have strict usage specifications, e.g., checking the return value of the function. Violating the usage specifications of TPCs can cause serious consequences, e.g., NULL pointer dereference. Therefore, this massive amount of TPC integrations, if not properly implemented, will lead to pervasive vulnerabilities in IoT devices. Detecting vulnerabilities automatically in TPC integration is challenging from several perspectives: (1) There is a gap between the high-level specifications from TPC documents, and the low-level implementations in the IoT firmware. (2) IoT firmware is mostly the closed-source binary, which loses a lot of information when compiling from the source code and has diverse architectures.
To address these challenges, we design and implement UVScan, an automated and scalable system to detect TPC usage violations in IoT firmware. In UVScan, we first propose a novel natural language processing (NLP)-based rule extraction framework, which extracts API specifications from inconsistently formatted TPC documents. We then design a rule-driven NLP-guided binary analysis engine, which maps the logical information from the high-level TPC document to the low-level binary, and detects TPC usage violations in IoT firmware across different architectures. We evaluate UVScan from four perspectives on four popular TPCs and six ground-truth datasets. The results show that UVScan achieves more than 70% precision and recall, and has a significant performance improvement compared with even the source-level API misuse detectors.
Submitted 19 June, 2023;
originally announced June 2023.
-
MINER: A Hybrid Data-Driven Approach for REST API Fuzzing
Authors:
Chenyang Lyu,
Jiacheng Xu,
Shouling Ji,
Xuhong Zhang,
Qinying Wang,
Binbin Zhao,
Gaoning Pan,
Wei Cao,
Raheem Beyah
Abstract:
In recent years, REST API fuzzing has emerged to explore errors on a cloud service. Its performance highly depends on the sequence construction and request generation. However, existing REST API fuzzers have trouble generating long sequences with well-constructed requests to trigger hard-to-reach states in a cloud service, which limits their performance of finding deep errors and security bugs. Further, they cannot find the specific errors caused by using undefined parameters during request generation. Therefore, in this paper, we propose a novel hybrid data-driven solution, named MINER, with three new designs working together to address the above limitations. First, MINER collects the valid sequences whose requests pass the cloud service's checking as the templates, and assigns more executions to long sequence templates. Second, to improve the generation quality of requests in a sequence template, MINER creatively leverages the state-of-the-art neural network model to predict key request parameters and provide them with appropriate parameter values. Third, MINER implements a new data-driven security rule checker to capture the new kind of errors caused by undefined parameters. We evaluate MINER against the state-of-the-art fuzzer RESTler on GitLab, Bugzilla, and WordPress via 11 REST APIs. The results demonstrate that the average pass rate of MINER is 23.42% higher than RESTler. MINER finds 97.54% more unique errors than RESTler on average and 142.86% more reproducible errors after manual analysis. We have reported all the newly found errors, and 7 of them have been confirmed as logic bugs by the corresponding vendors.
Submitted 4 March, 2023;
originally announced March 2023.
-
One Bad Apple Spoils the Barrel: Understanding the Security Risks Introduced by Third-Party Components in IoT Firmware
Authors:
Binbin Zhao,
Shouling Ji,
Jiacheng Xu,
Yuan Tian,
Qiuyang Wei,
Qinying Wang,
Chenyang Lyu,
Xuhong Zhang,
Changting Lin,
Jingzheng Wu,
Raheem Beyah
Abstract:
Currently, the development of IoT firmware heavily depends on third-party components (TPCs) to improve development efficiency. Nevertheless, TPCs are not secure, and the vulnerabilities in TPCs will influence the security of IoT firmware. Existing work pays little attention to the vulnerabilities caused by TPCs, and we still lack a comprehensive understanding of the security impact of TPC vulnerabilities on firmware. To fill this knowledge gap, we design and implement FirmSec, which leverages syntactical features and control-flow graph features to detect the TPCs in firmware, and then recognizes the corresponding vulnerabilities. Based on FirmSec, we present the first large-scale analysis of the security risks raised by TPCs on 34,136 firmware images. We successfully detect 584 TPCs and identify 128,757 vulnerabilities caused by 429 CVEs. Our in-depth analysis reveals the diversity of security risks in firmware and discovers that some well-known vulnerabilities are still rooted in firmware. Besides, we explore the geographical distribution of vulnerable devices and confirm that the security situation of devices in different regions varies. Our analysis also indicates that vulnerabilities caused by TPCs in firmware keep growing with the boom of the IoT ecosystem. Further analysis shows 2,478 commercial firmware images have potentially violated GPL/AGPL licensing terms.
Submitted 28 December, 2022; v1 submitted 28 December, 2022;
originally announced December 2022.
-
MPInspector: A Systematic and Automatic Approach for Evaluating the Security of IoT Messaging Protocols
Authors:
Qinying Wang,
Shouling Ji,
Yuan Tian,
Xuhong Zhang,
Binbin Zhao,
Yuhong Kan,
Zhaowei Lin,
Changting Lin,
Shuiguang Deng,
Alex X. Liu,
Raheem Beyah
Abstract:
Facilitated by messaging protocols (MP), many home devices are connected to the Internet, bringing convenience and accessibility to customers. However, most deployed MPs on IoT platforms are fragmented and are not implemented carefully to support secure communication. To the best of our knowledge, there is no systematic solution to perform automatic security checks on MP implementations yet.
To bridge the gap, we present MPInspector, the first automatic and systematic solution for vetting the security of MP implementations. MPInspector combines model learning with formal analysis and operates in three stages: (a) using parameter semantics extraction and interaction logic extraction to automatically infer the state machine of an MP implementation, (b) generating security properties based on meta properties and the state machine, and (c) applying automatic property based formal verification to identify property violations. We evaluate MPInspector on three popular MPs, including MQTT, CoAP and AMQP, implemented on nine leading IoT platforms. It identifies 252 property violations, leveraging which we further identify eleven types of attacks under two realistic attack scenarios. In addition, we demonstrate that MPInspector is lightweight (the average overhead of end-to-end analysis is ~4.5 hours) and effective with a precision of 100% in identifying property violations.
Submitted 18 August, 2022;
originally announced August 2022.
-
Transfer Attacks Revisited: A Large-Scale Empirical Study in Real Computer Vision Settings
Authors:
Yuhao Mao,
Chong Fu,
Saizhuo Wang,
Shouling Ji,
Xuhong Zhang,
Zhenguang Liu,
Jun Zhou,
Alex X. Liu,
Raheem Beyah,
Ting Wang
Abstract:
One intriguing property of adversarial attacks is their "transferability" -- an adversarial example crafted with respect to one deep neural network (DNN) model is often found effective against other DNNs as well. Intensive research has been conducted on this phenomenon under simplistic controlled conditions. Yet, thus far, there is still a lack of comprehensive understanding about transferability-based attacks ("transfer attacks") in real-world environments.
To bridge this critical gap, we conduct the first large-scale systematic empirical study of transfer attacks against major cloud-based MLaaS platforms, taking the components of a real transfer attack into account. The study leads to a number of interesting findings that are inconsistent with existing ones, including: (1) Simple surrogates do not necessarily improve real transfer attacks. (2) No dominant surrogate architecture is found in real transfer attacks. (3) It is the gap between posteriors (outputs of the softmax layer) rather than the gap between logits (the so-called $κ$ value) that increases transferability. Moreover, by comparing with prior works, we demonstrate that transfer attacks possess many previously unknown properties in real-world environments, such as (1) Model similarity is not a well-defined concept. (2) The $L_2$ norm of the perturbation can generate high transferability without using gradients and is a more powerful source than the $L_\infty$ norm. We believe this work sheds light on the vulnerabilities of popular MLaaS platforms and points to a few promising research directions.
Submitted 7 April, 2022;
originally announced April 2022.
-
UNIFUZZ: A Holistic and Pragmatic Metrics-Driven Platform for Evaluating Fuzzers
Authors:
Yuwei Li,
Shouling Ji,
Yuan Chen,
Sizhuang Liang,
Wei-Han Lee,
Yueyao Chen,
Chenyang Lyu,
Chunming Wu,
Raheem Beyah,
Peng Cheng,
Kangjie Lu,
Ting Wang
Abstract:
A flurry of fuzzing tools (fuzzers) have been proposed in the literature, aiming at detecting software vulnerabilities effectively and efficiently. To date, it is however still challenging to compare fuzzers due to the inconsistency of the benchmarks, performance metrics, and/or environments for evaluation, which buries the useful insights and thus impedes the discovery of promising fuzzing primitives. In this paper, we design and develop UNIFUZZ, an open-source and metrics-driven platform for assessing fuzzers in a comprehensive and quantitative manner. Specifically, UNIFUZZ to date has incorporated 35 usable fuzzers, a benchmark of 20 real-world programs, and six categories of performance metrics. We first systematically study the usability of existing fuzzers, find and fix a number of flaws, and integrate them into UNIFUZZ. Based on the study, we propose a collection of pragmatic performance metrics to evaluate fuzzers from six complementary perspectives. Using UNIFUZZ, we conduct in-depth evaluations of several prominent fuzzers including AFL [1], AFLFast [2], Angora [3], Honggfuzz [4], MOPT [5], QSYM [6], T-Fuzz [7] and VUzzer64 [8]. We find that none of them outperforms the others across all the target programs, and that using a single metric to assess the performance of a fuzzer may lead to unilateral conclusions, which demonstrates the significance of comprehensive metrics. Moreover, we identify and investigate previously overlooked factors that may significantly affect a fuzzer's performance, including instrumentation methods and crash analysis tools. Our empirical results show that they are critical to the evaluation of a fuzzer. We hope that our findings can shed light on reliable fuzzing evaluation, so that we can discover promising fuzzing primitives to effectively facilitate fuzzer designs in the future.
Submitted 5 October, 2020;
originally announced October 2020.
-
On Evaluating the Effectiveness of the HoneyBot: A Case Study
Authors:
Celine Irvene,
David Formby,
Raheem Beyah
Abstract:
In recent years, cyber-physical system (CPS) security as applied to robotic systems has become a popular research area, mainly because robotic systems have traditionally emphasized the completion of a specific objective and lack security-oriented design. Our previous work, HoneyBot, presented the concept and prototype of the first software hybrid interaction honeypot specifically designed for networked robotic systems. The intuition behind HoneyBot was that it would be a remotely accessible robotic system that could simulate unsafe actions and physically perform safe actions to fool attackers. Unsuspecting attackers would think they were connected to an ordinary robotic system, believing their exploits were being successfully executed. All the while, the HoneyBot logs all communications and exploits sent, to be used for attacker attribution and threat model creation. In this paper, we present findings from a user study performed to evaluate the effectiveness of the HoneyBot framework and architecture as it applies to real robotic systems. The user study consisted of 40 participants, was conducted over the course of several weeks, and drew from a wide range of participants aged 18 to 60 with varying levels of technical expertise. From the study we found that research subjects could not tell the difference between the simulated sensor values and the real sensor values coming from the HoneyBot, meaning the HoneyBot convincingly spoofed communications.
Submitted 28 May, 2019;
originally announced May 2019.
-
De-Health: All Your Online Health Information Are Belong to Us
Authors:
Shouling Ji,
Qinchen Gu,
Haiqin Weng,
Qianjun Liu,
Qinming He,
Raheem Beyah,
Ting Wang
Abstract:
In this paper, we study the privacy of online health data. We present a novel online health data De-Anonymization (DA) framework, named De-Health. De-Health consists of two phases: Top-K DA, which identifies a candidate set for each anonymized user, and refined DA, which de-anonymizes an anonymized user to a user in its candidate set. By employing both candidate selection and DA verification schemes, De-Health significantly reduces the DA space by several orders of magnitude while achieving promising DA accuracy. Leveraging two real world online health datasets WebMD (89,393 users, 506K posts) and HealthBoards (388,398 users, 4.7M posts), we validate the efficacy of De-Health. Further, when the training data are insufficient, De-Health can still successfully de-anonymize a large portion of anonymized users.
We develop the first analytical framework on the soundness and effectiveness of online health data DA. By analyzing the impact of various data features on anonymity, we derive the conditions and probabilities for successfully de-anonymizing one user or a group of users in exact DA and Top-K DA. Our analysis is meaningful to both researchers and policy makers in facilitating the development of more effective anonymization techniques and proper privacy policies.
We present a linkage attack framework which can link online health/medical information to real world people. Through a proof-of-concept attack, we link 347 out of 2805 WebMD users to real world people, and find the full names, medical/health information, birthdates, phone numbers, and other sensitive information for most of the re-identified users. This clearly illustrates the fragility of the notion of privacy of those who use online health forums.
Submitted 3 June, 2019; v1 submitted 2 February, 2019;
originally announced February 2019.
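The De-Health abstract above describes a two-phase pipeline: Top-K DA selects a candidate set for each anonymized user, and refined DA maps the user to one candidate or rejects the match. The sketch below is only a toy illustration of that shape, not the paper's method. It assumes users are represented as sets of feature tokens, uses Jaccard similarity as a stand-in scoring function, and uses a fixed threshold as a stand-in verification scheme; every name and value is hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_candidates(anon_features, known_users, k):
    """Phase 1: rank known users by similarity and keep the k best."""
    ranked = sorted(
        known_users,
        key=lambda user: jaccard(anon_features, known_users[user]),
        reverse=True,
    )
    return ranked[:k]


def refined_da(anon_features, known_users, candidates, threshold):
    """Phase 2: accept the best candidate only if it clears the threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda user: jaccard(anon_features, known_users[user]))
    score = jaccard(anon_features, known_users[best])
    return best if score >= threshold else None


# Hypothetical feature sets for three known users and one anonymized user.
known_users = {
    "u1": {"a", "b", "c", "d"},
    "u2": {"a", "b", "x"},
    "u3": {"y", "z"},
}
anon = {"a", "b", "c"}

candidates = top_k_candidates(anon, known_users, k=2)
match = refined_da(anon, known_users, candidates, threshold=0.6)
print(candidates, match)
```

Restricting phase 2 to the K candidates is what shrinks the search space, and the threshold lets the second phase decline to match when no candidate is convincing, instead of always returning the nearest user.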
-
FDI: Quantifying Feature-based Data Inferability
Authors:
Shouling Ji,
Haiqin Weng,
Yiming Wu,
Qinming He,
Raheem Beyah,
Ting Wang
Abstract:
Motivated by many existing security and privacy applications, e.g., network traffic attribution, linkage attacks, private web search, and feature-based data de-anonymization, in this paper, we study the Feature-based Data Inferability (FDI) quantification problem. First, we conduct the FDI quantification under both naive and general data models from both a feature distance perspective and a feature distribution perspective. Our quantification explicitly shows the conditions to have a desired fraction of the target users to be Top-K inferable (K is an integer parameter). Then, based on our quantification, we evaluate the user inferability in two cases: network traffic attribution in network forensics and feature-based data de-anonymization. Finally, based on the quantification and evaluation, we discuss the implications of this research for existing feature-based inference systems.
Submitted 3 June, 2019; v1 submitted 2 February, 2019;
originally announced February 2019.
-
SirenAttack: Generating Adversarial Audio for End-to-End Acoustic Systems
Authors:
Tianyu Du,
Shouling Ji,
Jinfeng Li,
Qinchen Gu,
Ting Wang,
Raheem Beyah
Abstract:
Despite their immense popularity, deep learning-based acoustic systems are inherently vulnerable to adversarial attacks, wherein maliciously crafted audio triggers target systems to misbehave. In this paper, we present SirenAttack, a new class of attacks to generate adversarial audio. Compared with existing attacks, SirenAttack stands out for a set of significant features: (i) versatile -- it is able to deceive a range of end-to-end acoustic systems under both white-box and black-box settings; (ii) effective -- it is able to generate adversarial audio that is recognized as specific phrases by target acoustic systems; and (iii) stealthy -- it is able to generate adversarial audio that human listeners cannot distinguish from its benign counterpart. We empirically evaluate SirenAttack on a set of state-of-the-art deep learning-based acoustic systems (including speech command recognition, speaker recognition, and sound event classification), with results showing the versatility, effectiveness, and stealthiness of SirenAttack. For instance, it achieves a 99.45% attack success rate on the IEMOCAP dataset against the ResNet18 model, while the generated adversarial audio is also misinterpreted by multiple popular ASR platforms, including Google Cloud Speech, Microsoft Bing Voice, and IBM Speech-to-Text. We further evaluate three potential defense methods to mitigate such attacks, including adversarial training, audio downsampling, and moving average filtering, which point to promising directions for further research.
Submitted 24 July, 2019; v1 submitted 23 January, 2019;
originally announced January 2019.
-
Adversarial CAPTCHAs
Authors:
Chenghui Shi,
Xiaogang Xu,
Shouling Ji,
Kai Bu,
Jianhai Chen,
Raheem Beyah,
Ting Wang
Abstract:
Following the principle of setting one's own spear against one's own shield, we study how to design adversarial CAPTCHAs in this paper. We first identify the similarities and differences between adversarial CAPTCHA generation and existing popular research on adversarial example (image) generation. Then, we propose a framework for text-based and image-based adversarial CAPTCHA generation on top of state-of-the-art adversarial image generation techniques. Finally, we design and implement an adversarial CAPTCHA generation and evaluation system, named aCAPTCHA, which integrates 10 image preprocessing techniques, 9 CAPTCHA attacks, 4 baseline adversarial CAPTCHA generation methods, and 8 new adversarial CAPTCHA generation methods. To examine the performance of aCAPTCHA, we conduct extensive security and usability evaluations. The results demonstrate that the generated adversarial CAPTCHAs can significantly improve the security of normal CAPTCHAs while maintaining similar usability. To facilitate CAPTCHA security research, we also open source the aCAPTCHA system, including the source code, trained models, datasets, and the usability evaluation interfaces.
Submitted 4 January, 2019;
originally announced January 2019.
-
Checking is Believing: Event-Aware Program Anomaly Detection in Cyber-Physical Systems
Authors:
Long Cheng,
Ke Tian,
Danfeng Yao,
Lui Sha,
Raheem A. Beyah
Abstract:
Securing cyber-physical systems (CPS) against malicious attacks is of paramount importance because these attacks may cause irreparable damage to physical systems. Recent studies have revealed that control programs running on CPS devices suffer from both control-oriented attacks (e.g., code-injection or code-reuse attacks) and data-oriented attacks (e.g., non-control data attacks). Unfortunately, existing detection mechanisms are insufficient to detect runtime data-oriented exploits, due to the lack of runtime execution semantics checking. In this work, we propose Orpheus, a new security methodology for defending against data-oriented attacks by enforcing cyber-physical execution semantics. We first present a general method for reasoning about the cyber-physical execution semantics of a control program (i.e., causal dependencies between the physical context and program control flows), including event identification and dependence analysis. As an instantiation of Orpheus, we then present a new program behavior model, i.e., the event-aware finite-state automaton (eFSA). eFSA takes advantage of the event-driven nature of CPS control programs and incorporates event checking in anomaly detection. It detects a data-oriented exploit when a specific physical event is missing along with the corresponding event-dependent state transition. We evaluate our prototype's performance by conducting case studies under data-oriented attacks. Results show that eFSA can successfully detect different runtime attacks. Our prototype on Raspberry Pi incurs a low overhead, taking 0.0001s for each state-transition integrity check and 0.063s to 0.211s for the cyber-physical contextual consistency check.
Submitted 24 March, 2019; v1 submitted 30 April, 2018;
originally announced May 2018.
-
Blindsight: Blinding EM Side-Channel Leakage using Built-In Fully Integrated Inductive Voltage Regulator
Authors:
Monodeep Kar,
Arvind Singh,
Sanu Mathew,
Santosh Ghosh,
Anand Rajan,
Vivek De,
Raheem Beyah,
Saibal Mukhopadhyay
Abstract:
Modern high-performance as well as power-constrained Systems-on-Chip (SoCs) are increasingly using hardware-accelerated encryption engines to secure computation, memory access, and communication operations. The electromagnetic (EM) emission from a chip leaks information about the underlying logical operations and can be collected using low-cost non-invasive measurements. EM-based side-channel attacks (EMSCA) have emerged as a major threat to the security of encryption engines in an SoC. This paper presents the concept of Blindsight, where a high-frequency inductive voltage regulator (IVR) integrated on the same chip as an encryption engine is used to increase resistance against EMSCA. High-frequency (~100MHz) IVRs are present in modern microprocessors to improve energy efficiency. We show that an IVR with a randomized control loop (R-IVR) can reduce EMSCA, as the integrated inductance acts as a strong EM emitter and blinds an adversary to the EM emission of the encryption engine. The EM measurements are performed on a test chip containing two architectures of a 128-bit Advanced Encryption Standard (AES) engine powered by a high-frequency R-IVR, under two attack scenarios: one where an adversary gains complete physical access to the target device, and the other where the adversary is only in proximity to the device. In both attack modes, an adversary can observe information leakage in the Test Vector Leakage Assessment (TVLA) test with a baseline IVR (B-IVR, without control loop randomization). However, we show that EM emission from the R-IVR blinds the attacker and significantly reduces the SCA vulnerability of the AES engine. A range of practical side-channel analyses, including TVLA, Correlation Electromagnetic Analysis (CEMA), and a template-based CEMA, shows that R-IVR can reduce information leakage and prevent key extraction even against a skilled adversary.
Submitted 25 February, 2018;
originally announced February 2018.