Do Concept Replacement Techniques Really Erase Unacceptable Concepts?

Authors: Anudeep Das, Gurjot Singh, Prach Chantasantitam, N. Asokan

Abstract: Generative models, particularly diffusion-based text-to-image (T2I) models, have demonstrated astounding success. However, aligning them to avoid generating content with unacceptable concepts (e.g., offensive or copyrighted content, or celebrity likenesses) remains a significant challenge. Concept replacement techniques (CRTs) aim to address this challenge, often by trying to "erase" unacceptable concepts from models. Recently, model providers have started offering image editing services which accept an image and a text prompt as input, to produce an image altered as specified by the prompt. These are known as image-to-image (I2I) models. In this paper, we first use an I2I model to empirically demonstrate that today's state-of-the-art CRTs do not in fact erase unacceptable concepts. Existing CRTs are thus likely to be ineffective in emerging I2I scenarios, despite their proven ability to remove unwanted concepts in T2I pipelines, highlighting the need to understand this discrepancy between T2I and I2I settings. Next, we argue that a good CRT, while replacing unacceptable concepts, should preserve other concepts specified in the inputs to generative models. We call this fidelity. Prior work on CRTs has neglected fidelity in the case of unacceptable concepts. Finally, we propose the use of targeted image-editing techniques to achieve both effectiveness and fidelity. We present such a technique, AntiMirror, and demonstrate its viability.

Submitted 10 June, 2025; originally announced June 2025.

arXiv:2504.14654 [pdf, other]

BLACKOUT: Data-Oblivious Computation with Blinded Capabilities

Authors: Hossam ElAtali, Merve Gülmez, Thomas Nyman, N. Asokan

Abstract: Lack of memory-safety and exposure to side channels are two prominent, persistent challenges for the secure implementation of software. Memory-safe programming languages promise to significantly reduce the prevalence of memory-safety bugs, but make it more difficult to implement side-channel-resistant code. We aim to address both memory-safety and side-channel resistance by augmenting memory-safe hardware with the ability for data-oblivious programming. We describe an extension to the CHERI capability architecture to provide blinded capabilities that allow data-oblivious computation to be carried out by userspace tasks. We also present BLACKOUT, our realization of blinded capabilities on an FPGA softcore based on the speculative out-of-order CHERI-Toooba processor and extend the CHERI-enabled Clang/LLVM compiler and the CheriBSD operating system with support for blinded capabilities. BLACKOUT makes writing side-channel-resistant code easier by making non-data-oblivious operations via blinded capabilities explicitly fault. Through rigorous evaluation we show that BLACKOUT ensures memory operated on through blinded capabilities is securely allocated, used, and reclaimed and demonstrate that, in benchmarks comparable to those used by previous work, BLACKOUT imposes only a small performance degradation (1.5% geometric mean) compared to the baseline CHERI-Toooba processor.

Submitted 27 May, 2025; v1 submitted 20 April, 2025; originally announced April 2025.

arXiv:2411.09776 [pdf, ps, other]

Combining Machine Learning Defenses without Conflicts

Authors: Vasisht Duddu, Rui Zhang, N. Asokan

Abstract: Machine learning (ML) defenses protect against various risks to security, privacy, and fairness. Real-life models need simultaneous protection against multiple different risks which necessitates combining multiple defenses. But combining defenses with conflicting interactions in an ML model can be ineffective, incurring a significant drop in the effectiveness of one or more defenses being combined. Practitioners need a way to determine if a given combination can be effective. Experimentally identifying effective combinations can be time-consuming and expensive, particularly when multiple defenses need to be combined. We need an inexpensive, easy-to-use combination technique to identify effective combinations. Ideally, a combination technique should be (a) accurate (correctly identifies whether a combination is effective or not), (b) scalable (allows combining multiple defenses), (c) non-invasive (requires no change to the defenses being combined), and (d) general (is applicable to different types of defenses). Prior works have identified several ad-hoc techniques but none satisfy all the requirements above. We propose a principled combination technique, DefCon, to identify effective defense combinations. DefCon meets all requirements, achieving 90% accuracy on eight combinations explored in prior work and 81% in 30 previously unexplored combinations that we empirically evaluate in this paper.

Submitted 14 November, 2024; originally announced November 2024.

arXiv:2406.17548 [pdf, other]

Laminator: Verifiable ML Property Cards using Hardware-assisted Attestations

Authors: Vasisht Duddu, Oskari Järvinen, Lachlan J Gunn, N Asokan

Abstract: Regulations increasingly call for various assurances from machine learning (ML) model providers about their training data, training process, and model behavior. For better transparency, industry (e.g., Huggingface and Google) has adopted model cards and datasheets to describe various properties of training datasets and models. In the same vein, we introduce the notion of inference cards to describe the properties of a given inference (e.g., binding of the output to the model and its corresponding input). We coin the term ML property cards to collectively refer to these various types of cards. To prevent a malicious model provider from including false information in ML property cards, they need to be verifiable. We show how to construct verifiable ML property cards using property attestation, technical mechanisms by which a prover (e.g., a model provider) can attest to various ML properties to a verifier (e.g., an auditor). Since prior attestation mechanisms based purely on cryptography are often narrowly focused (lacking versatility) and inefficient, we need an efficient mechanism to attest different types of properties across the entire ML model pipeline. Emerging widespread support for confidential computing has made it possible to run and even train models inside hardware-assisted trusted execution environments (TEEs), which provide highly efficient attestation mechanisms. We propose Laminator, which uses TEEs to provide the first framework for verifiable ML property cards via hardware-assisted ML property attestations. Laminator is efficient in terms of overhead, scalable to large numbers of verifiers, and versatile with respect to the properties it can prove during training or inference.

Submitted 5 March, 2025; v1 submitted 25 June, 2024; originally announced June 2024.

Comments: ACM Conference on Data and Application Security and Privacy (CODASPY), 2025

arXiv:2406.15302 [pdf, other]

BliMe Linter

Authors: Hossam ElAtali, Xiaohe Duan, Hans Liljestrand, Meng Xu, N. Asokan

Abstract: Outsourced computation presents a risk to the confidentiality of clients' sensitive data since they have to trust that the service providers will not mishandle this data. Blinded Memory (BliMe) is a set of hardware extensions that addresses this problem by using hardware-based taint tracking to keep track of sensitive client data and enforce a security policy that prevents software from leaking this data, either directly or through side channels. Since programs can leak sensitive data through timing channels and memory access patterns when this data is used in control-flow or memory access instructions, BliMe prohibits such unsafe operations and only allows constant-time code to operate on sensitive data. The question is how a developer can confirm that their code will run correctly on BliMe. While a program can be manually checked to see if it is constant-time, this process is tedious and error-prone. In this paper, we introduce the BliMe linter, a set of compiler extensions built on top of SVF that analyze LLVM bitcode to identify possible BliMe violations. We evaluate the BliMe linter analytically and empirically and show that it is sound.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.12110 [pdf, other]

CacheSquash: Making caches speculation-aware

Authors: Hossam ElAtali, N. Asokan

Abstract: Speculation is key to achieving high CPU performance, yet it enables risks like Spectre attacks which remain a significant challenge to mitigate without incurring substantial performance overheads. These attacks typically unfold in three stages: access, transmit, and receive. Typically, they exploit a cache timing side channel during the transmit and receive phases: speculatively accessing sensitive data (access), altering cache state (transmit), and then utilizing a cache timing attack (e.g., Flush+Reload) to extract the secret (receive). Our key observation is that Spectre attacks only require the transmit instruction to execute and dispatch a request to the cache hierarchy. It need not complete before a misprediction is detected (and mis-speculated instructions squashed) because responses from memory that arrive at the cache after squashing still alter cache state. We propose a novel mitigation, CacheSquash, that cancels mis-speculated memory accesses. Immediately upon squashing, a cancellation is sent to the cache hierarchy, propagating downstream and preventing any changes to caches that have not yet received a response. This minimizes cache state changes, thereby reducing the likelihood of Spectre attacks succeeding. We implement CacheSquash on gem5 and show that it thwarts practical Spectre attacks, with near-zero performance overheads.

Submitted 8 May, 2025; v1 submitted 17 June, 2024; originally announced June 2024.

arXiv:2404.19227 [pdf, other]

Espresso: Robust Concept Filtering in Text-to-Image Models

Authors: Anudeep Das, Vasisht Duddu, Rui Zhang, N. Asokan

Abstract: Diffusion-based text-to-image models are trained on large datasets scraped from the Internet, potentially containing unacceptable concepts (e.g., copyright-infringing or unsafe). We need concept removal techniques (CRTs) which are i) effective in preventing the generation of images with unacceptable concepts, ii) utility-preserving on acceptable concepts, and iii) robust against evasion with adversarial prompts. No prior CRT satisfies all these requirements simultaneously. We introduce Espresso, the first robust concept filter based on Contrastive Language-Image Pre-Training (CLIP). We identify unacceptable concepts by using the distance between the embedding of a generated image and the text embeddings of both unacceptable and acceptable concepts. This lets us fine-tune for robustness by separating the text embeddings of unacceptable and acceptable concepts while preserving utility. We present a pipeline to evaluate various CRTs to show that Espresso is more effective and robust than prior CRTs, while retaining utility.

Submitted 26 February, 2025; v1 submitted 29 April, 2024; originally announced April 2024.

Comments: ACM Conference on Data and Application Security and Privacy (CODASPY), 2025
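
The distance-based filtering idea described in the Espresso abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of cosine distance, the "closer to the unacceptable concept wins" decision rule, and the toy two-dimensional vectors standing in for CLIP embeddings are all assumptions made for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def is_unacceptable(image_emb, unacceptable_text_emb, acceptable_text_emb):
    """Hypothetical decision rule: flag the image when its embedding is
    closer to the unacceptable concept's text embedding than to the
    acceptable concept's text embedding."""
    d_unacceptable = cosine_distance(image_emb, unacceptable_text_emb)
    d_acceptable = cosine_distance(image_emb, acceptable_text_emb)
    return d_unacceptable < d_acceptable


# Toy vectors standing in for real image and text embeddings.
unacceptable = [0.9, 0.1]
acceptable = [0.0, 1.0]
flagged = is_unacceptable([1.0, 0.0], unacceptable, acceptable)
passed = is_unacceptable([0.0, 1.0], unacceptable, acceptable)
```

In a real pipeline the vectors would come from an image encoder and a text encoder; the sketch only shows the comparison of the two distances.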

arXiv:2402.03373 [pdf, other]

SeMalloc: Semantics-Informed Memory Allocator

Authors: Ruizhe Wang, Meng Xu, N. Asokan

Abstract: Use-after-free (UAF) is a critical and prevalent problem in memory-unsafe languages. While many solutions have been proposed, balancing security, run-time cost, and memory overhead (an impossible trinity) is hard. In this paper, we show one way to balance the trinity by passing more semantics about the heap object to the allocator for it to make informed allocation decisions. More specifically, we propose a new notion of thread-, context-, and flow-sensitive "type", SemaType, to capture the semantics and prototype a SemaType-based allocator that aims for the best trade-off amongst the impossible trinity. In SeMalloc, only heap objects allocated from the same call site and via the same function call stack can possibly share a virtual memory address, which effectively stops type-confusion attacks and makes UAF vulnerabilities harder to exploit. Through extensive empirical evaluation, we show that SeMalloc is realistic: (a) SeMalloc is effective in thwarting all real-world vulnerabilities we tested; (b) benchmark programs run even slightly faster with SeMalloc than the default heap allocator, at a memory overhead averaged from 41% to 84%; and (c) SeMalloc balances security and overhead strictly better than other closely related works.

Submitted 22 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted to ACM CCS 2024, camera-ready version under preparation

arXiv:2402.01894 [pdf, other]

S2malloc: Statistically Secure Allocator for Use-After-Free Protection And More

Authors: Ruizhe Wang, Meng Xu, N. Asokan

Abstract: Attacks on heap memory, encompassing memory overflow, double and invalid free, use-after-free (UAF), and various heap spraying techniques are ever-increasing. Existing entropy-based secure memory allocators provide statistical defenses against virtually all of these attack vectors. Although they claim protections against UAF attacks, their designs are not tailored to detect (failed) attempts. Consequently, to beat this entropy-based protection, an attacker can simply launch the same attack repeatedly with the potential use of heap spraying to further improve their chance of success. We introduce S2malloc, aiming to enhance UAF-attempt detection without compromising other security guarantees or introducing significant performance overhead. To achieve this, we use three innovative constructs in secure allocator design: free block canaries (FBC) to detect UAF attempts, random in-block offset (RIO) to stop the attacker from accurately overwriting the victim object, and random bag layout (RBL) to impede attackers from estimating the block size based on its address. We show that (a) by reserving 25% of the object size for the RIO offset, an 8-byte canary offers a 69% protection rate if the attacker reuses the same pointer and 96% protection rate if the attacker does not, against UAF exploitation attempts targeting a 64-byte object, with equal or higher security guarantees against all other attacks; and (b) S2malloc is practical, with only a 2.8% run-time overhead on PARSEC and an 11.5% overhead on SPEC. Compared to state-of-the-art entropy-based allocators, S2malloc improves UAF-protection without incurring additional performance overhead. Compared to UAF-mitigating allocators, S2malloc trades off a minuscule probability of failed protection for significantly lower overhead.

Submitted 29 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted at DIMVA 2024, this is the extended version

arXiv:2401.16583 [pdf, other]

doi 10.1109/HOST55342.2024.10545398

Data-Oblivious ML Accelerators using Hardware Security Extensions

Authors: Hossam ElAtali, John Z. Jekel, Lachlan J. Gunn, N. Asokan

Abstract: Outsourced computation can put client data confidentiality at risk. Existing solutions are either inefficient or insufficiently secure: cryptographic techniques like fully-homomorphic encryption incur significant overheads, even with hardware assistance, while the complexity of hardware-assisted trusted execution environments has been exploited to leak secret data. Recent proposals such as BliMe and OISA show how dynamic information flow tracking (DIFT) enforced in hardware can protect client data efficiently. They are designed to protect CPU-only workloads. However, many outsourced computing applications, like machine learning, make extensive use of accelerators. We address this gap with Dolma, which applies DIFT to the Gemmini matrix multiplication accelerator, efficiently guaranteeing client data confidentiality, even in the presence of malicious/vulnerable software and side channel attacks on the server. We show that accelerators can allow DIFT logic optimizations that significantly reduce area overhead compared with general-purpose processor architectures. Dolma is integrated with the BliMe framework to achieve end-to-end security guarantees. We evaluate Dolma on an FPGA using a ResNet-50 DNN model and show that it incurs low overheads for large configurations (4.4%, 16.7%, 16.5% for performance, resource usage and power, respectively, with a 32x32 configuration).

Submitted 29 January, 2024; originally announced January 2024.

Journal ref: IEEE International Symposium on Hardware Oriented Security and Trust (HOST), 2024, pp. 373-377

arXiv:2401.15828 [pdf, other]

The Spectre of Surveillance and Censorship in Future Internet Architectures

Authors: Michael Wrana, Diogo Barradas, N. Asokan

Abstract: Recent initiatives known as Future Internet Architectures (FIAs) seek to redesign the Internet to improve performance, scalability, and security. However, some governments perceive Internet access as a threat to their political standing and engage in widespread network surveillance and censorship. In this paper, we provide an in-depth analysis of the design principles of prominent FIAs in terms of their packet structure, addressing and naming schemes, and routing protocols to foster discussion on how these new systems interact with censorship and surveillance apparatuses. Further, we assess the extent to which existing surveillance and censorship mechanisms can successfully target FIA users while discussing privacy enhancing technologies to counter these mechanisms. We conclude by providing guidelines for future research into novel FIA-based privacy-enhancing technologies, and recommendations to guide the evaluation of these technologies.

Submitted 29 January, 2025; v1 submitted 28 January, 2024; originally announced January 2024.

arXiv:2312.04542 [pdf, other]

SoK: Unintended Interactions among Machine Learning Defenses and Risks

Authors: Vasisht Duddu, Sebastian Szyller, N. Asokan

Abstract: Machine learning (ML) models cannot neglect risks to security, privacy, and fairness. Several defenses have been proposed to mitigate such risks. When a defense is effective in mitigating one risk, it may correspond to increased or decreased susceptibility to other risks. Existing research lacks an effective framework to recognize and explain these unintended interactions. We present such a framework, based on the conjecture that overfitting and memorization underlie unintended interactions. We survey existing literature on unintended interactions, accommodating them within our framework. We use our framework to conjecture on two previously unexplored interactions, and empirically validate our conjectures.

Submitted 4 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

Comments: IEEE Symposium on Security and Privacy (S&P) 2024

arXiv:2308.09552 [pdf, other]

Attesting Distributional Properties of Training Data for Machine Learning

Authors: Vasisht Duddu, Anudeep Das, Nora Khayata, Hossein Yalame, Thomas Schneider, N. Asokan

Abstract: The success of machine learning (ML) has been accompanied by increased concerns about its trustworthiness. Several jurisdictions are preparing ML regulatory frameworks. One such concern is ensuring that model training data has desirable distributional properties for certain sensitive attributes. For example, draft regulations indicate that model trainers are required to show that training datasets have specific distributional properties, such as reflecting diversity of the population. We propose the notion of property attestation allowing a prover (e.g., model trainer) to demonstrate relevant distributional properties of training data to a verifier (e.g., a customer) without revealing the data. We present an effective hybrid property attestation combining property inference with cryptographic mechanisms.

Submitted 9 April, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

Comments: European Symposium on Research in Computer Security (ESORICS), 2024

arXiv:2308.06587 [pdf, other]

doi 10.1145/3597503.3639154

A User-centered Security Evaluation of Copilot

Authors: Owura Asare, Meiyappan Nagappan, N. Asokan

Abstract: Code generation tools driven by artificial intelligence have recently become more popular due to advancements in deep learning and natural language processing that have increased their capabilities. The proliferation of these tools may be a double-edged sword because while they can increase developer productivity by making it easier to write code, research has shown that they can also generate insecure code. In this paper, we perform a user-centered evaluation of GitHub's Copilot to better understand its strengths and weaknesses with respect to code security. We conduct a user study where participants solve programming problems (with and without Copilot assistance) that have potentially vulnerable solutions. The main goal of the user study is to determine how the use of Copilot affects participants' security performance. In our set of participants (n=25), we find that access to Copilot accompanies a more secure solution when tackling harder problems. For the easier problem, we observe no effect of Copilot access on the security of solutions. We also observe no disproportionate impact of Copilot use on particular kinds of vulnerabilities. Our results indicate that there are potential security benefits to using Copilot, but more research is warranted on the effects of the use of code generation tools on technically complex problems with security requirements.

Submitted 5 January, 2024; v1 submitted 12 August, 2023; originally announced August 2023.

Comments: To be published in ICSE 2024 Research Track

arXiv:2307.14751 [pdf, other]

FLARE: Fingerprinting Deep Reinforcement Learning Agents using Universal Adversarial Masks

Authors: Buse G. A. Tekgul, N. Asokan

Abstract: We propose FLARE, the first fingerprinting mechanism to verify whether a suspected Deep Reinforcement Learning (DRL) policy is an illegitimate copy of another (victim) policy. We first show that it is possible to find non-transferable, universal adversarial masks, i.e., perturbations, to generate adversarial examples that can successfully transfer from a victim policy to its modified versions but not to independently trained policies. FLARE employs these masks as fingerprints to verify the true ownership of stolen DRL policies by measuring an action agreement value over states perturbed by such masks. Our empirical evaluations show that FLARE is effective (100% action agreement on stolen copies) and does not falsely accuse independent policies (no false positives). FLARE is also robust to model modification attacks and cannot be easily evaded by more informed adversaries without negatively impacting agent performance. We also show that not all universal adversarial masks are suitable candidates for fingerprints due to the inherent characteristics of DRL policies. The spatio-temporal dynamics of DRL problems and the sequential decision-making process make it more difficult to characterize the decision boundary of DRL policies and to search for universal masks that capture its geometry.

Submitted 25 September, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

Comments: Will appear in the proceedings of ACSAC 2023; 14 pages, 6 figures, 8 tables
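
The "action agreement value over states perturbed by such masks" mentioned in the FLARE abstract can be sketched as follows. This is a hedged illustration, not the paper's procedure: the function name, the additive application of the mask, the definition of agreement as the fraction of matching actions, and the toy argmax/argmin policies are all assumptions.

```python
def action_agreement(victim_policy, suspect_policy, states, mask):
    """Fraction of mask-perturbed states on which both policies pick the
    same action. Adding the mask element-wise is an assumption."""
    perturbed = [[s_i + m_i for s_i, m_i in zip(state, mask)] for state in states]
    matches = sum(victim_policy(p) == suspect_policy(p) for p in perturbed)
    return matches / len(perturbed)


# Toy stand-ins for trained policies: each maps a state to an action index.
def victim(state):
    return max(range(len(state)), key=lambda i: state[i])


def stolen_copy(state):
    # Behaves exactly like the victim policy.
    return max(range(len(state)), key=lambda i: state[i])


def independent(state):
    # Deliberately behaves differently from the victim policy.
    return min(range(len(state)), key=lambda i: state[i])


states = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
mask = [0.05, -0.05]  # hypothetical universal mask

agreement_copy = action_agreement(victim, stolen_copy, states, mask)
agreement_independent = action_agreement(victim, independent, states, mask)
```

A verifier would compare the agreement value against a decision threshold; how that threshold is chosen is not stated in the abstract and is left out of the sketch.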

arXiv:2306.05007 [pdf, other]

Parallel and Asynchronous Smart Contract Execution

Authors: Jian Liu, Peilun Li, Raymond Cheng, N. Asokan, Dawn Song

Abstract: Today's blockchains suffer from low throughput and high latency, which impedes the widespread adoption of more complex applications like smart contracts. In this paper, we propose a novel paradigm for smart contract execution. It distinguishes between consensus nodes and execution nodes: different groups of execution nodes can execute transactions in parallel; meanwhile, consensus nodes can asynchronously order transactions and process execution results. Moreover, it requires no coordination among execution nodes and can effectively prevent livelocks. We show two ways of applying this paradigm to blockchains. First, we show how we can make Ethereum support parallel and asynchronous contract execution without hard-forks. Then, we propose a new public, permissionless blockchain. Our benchmark shows that, with a fast consensus layer, it can provide a high throughput even for complex transactions like Cryptokitties gene mixing. It can also protect simple transactions from being starved by complex transactions.

Submitted 8 June, 2023; originally announced June 2023.

arXiv:2304.08566 [pdf, other]

GrOVe: Ownership Verification of Graph Neural Networks using Embeddings

Authors: Asim Waheed, Vasisht Duddu, N. Asokan

Abstract: Graph neural networks (GNNs) have emerged as a state-of-the-art approach to model and draw inferences from large-scale graph-structured data in various application settings such as social networking. The primary goal of a GNN is to learn an embedding for each graph node in a dataset that encodes both the node features and the local graph structure around the node. Embeddings generated by a GNN for a graph node are unique to that GNN. Prior work has shown that GNNs are prone to model extraction attacks. Model extraction attacks and defenses have been explored extensively in other non-graph settings. While detecting or preventing model extraction appears to be difficult, deterring it via effective ownership verification techniques offers a potential defense. In non-graph settings, fingerprinting models, or the data used to build them, has been shown to be a promising approach toward ownership verification. We present GrOVe, a state-of-the-art GNN model fingerprinting scheme that, given a target model and a suspect model, can reliably determine if the suspect model was trained independently of the target model or if it is a surrogate of the target model obtained via model extraction. We show that GrOVe can distinguish between surrogate and independent models even when the independent model uses the same training dataset and architecture as the original target model. Using six benchmark datasets and three model architectures, we show that GrOVe consistently achieves low false-positive and false-negative rates. We demonstrate that GrOVe is robust against known fingerprint evasion techniques while remaining computationally efficient.

Submitted 1 September, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

Comments: To appear in the IEEE Symposium on Security and Privacy, 2024. 12 pages, 5 figures

arXiv:2304.06607 [pdf, other]

False Claims against Model Ownership Resolution

Authors: Jian Liu, Rui Zhang, Sebastian Szyller, Kui Ren, N. Asokan

Abstract: Deep neural network (DNN) models are valuable intellectual property of model owners, constituting a competitive advantage. Therefore, it is crucial to develop techniques to protect against model theft. Model ownership resolution (MOR) is a class of techniques that can deter model theft. A MOR scheme enables an accuser to assert an ownership claim for a suspect model by presenting evidence, such as a watermark or fingerprint, to show that the suspect model was stolen or derived from a source model owned by the accuser. Most of the existing MOR schemes prioritize robustness against malicious suspects, ensuring that the accuser will win if the suspect model is indeed a stolen model. In this paper, we show that common MOR schemes in the literature are vulnerable to a different, equally important but insufficiently explored, robustness concern: a malicious accuser. We show how malicious accusers can successfully make false claims against independent suspect models that were not stolen. Our core idea is that a malicious accuser can deviate (without detection) from the specified MOR process by finding (transferable) adversarial examples that successfully serve as evidence against independent suspect models. To this end, we first generalize the procedures of common MOR schemes and show that, under this generalization, defending against false claims is as challenging as preventing (transferable) adversarial examples. Via systematic empirical evaluation, we show that our false claim attacks always succeed in the MOR schemes that follow our generalization, including in a real-world model: Amazon's Rekognition API.

Submitted 9 April, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

Comments: 13 pages, 3 figures. To appear in the 33rd USENIX Security Symposium (USENIX Security '24)

arXiv:2210.13631 [pdf, other]

On the Robustness of Dataset Inference

Authors: Sebastian Szyller, Rui Zhang, Jian Liu, N. Asokan

Abstract: Machine learning (ML) models are costly to train as they can require a significant amount of data, computational resources and technical expertise. Thus, they constitute valuable intellectual property that needs protection from adversaries wanting to steal them. Ownership verification techniques allow the victims of model stealing attacks to demonstrate that a suspect model was in fact stolen from theirs. Although a number of ownership verification techniques based on watermarking or fingerprinting have been proposed, most of them fall short either in terms of security guarantees (well-equipped adversaries can evade verification) or computational cost. A fingerprinting technique, Dataset Inference (DI), has been shown to offer better robustness and efficiency than prior methods. The authors of DI provided a correctness proof for linear (suspect) models. However, in a subspace of the same setting, we prove that DI suffers from high false positives (FPs) -- it can incorrectly identify an independent model trained with non-overlapping data from the same distribution as stolen. We further prove that DI also triggers FPs in realistic, non-linear suspect models. We then confirm empirically that DI in the black-box setting leads to FPs, with high confidence. Second, we show that DI also suffers from false negatives (FNs) -- an adversary can fool DI (at the cost of incurring some accuracy loss) by regularising a stolen model's decision boundaries using adversarial training, thereby leading to an FN. To this end, we demonstrate that black-box DI fails to identify a model adversarially trained from a stolen dataset -- the setting where DI is the hardest to evade. Finally, we discuss the implications of our findings, the viability of fingerprinting-based ownership verification in general, and suggest directions for future work.

Submitted 19 June, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

Comments: 19 pages; Accepted to Transactions on Machine Learning Research 06/2023

arXiv:2210.11340 [pdf, ps, other]

Towards cryptographically-authenticated in-memory data structures

Authors: Setareh Ghorshi, Lachlan J. Gunn, Hans Liljestrand, N. Asokan

Abstract: Modern processors include high-performance cryptographic functionalities such as Intel's AES-NI and ARM's Pointer Authentication that allow programs to efficiently authenticate data held by the program. Pointer Authentication is already used to protect return addresses in recent Apple devices, but as yet these structures have seen little use for the protection of general program data. In this paper, we show how cryptographically-authenticated data structures can be used to protect against attacks based on memory corruption, and show how they can be efficiently realized using widely available hardware-assisted cryptographic mechanisms. We present realizations of secure stacks and queues with minimal overall performance overhead (3.4%-6.4% slowdown of the OpenCV core performance tests), and provide proofs of correctness.

Submitted 20 October, 2022; originally announced October 2022.

arXiv:2207.01991 [pdf, other]

Conflicting Interactions Among Protection Mechanisms for Machine Learning Models

Authors: Sebastian Szyller, N. Asokan

Abstract: Nowadays, systems based on machine learning (ML) are widely used in different domains. Given their popularity, ML models have become targets for various attacks. As a result, research at the intersection of security/privacy and ML has flourished. Typically such work has focused on individual types of security/privacy concerns and mitigations thereof. However, in real-life deployments, an ML model will need to be protected against several concerns simultaneously. A protection mechanism optimal for one security or privacy concern may interact negatively with mechanisms intended to address other concerns. Despite its practical relevance, the potential for such conflicts has not been studied adequately. We first provide a framework for analyzing such "conflicting interactions". We then focus on systematically analyzing pairwise interactions between protection mechanisms for one concern, model and data ownership verification, with two other classes of ML protection mechanisms: differentially private training, and robustness against model evasion. We find that several pairwise interactions result in conflicts. We explore potential approaches for avoiding such conflicts. First, we study the effect of hyperparameter relaxations, finding that there is no sweet spot balancing the performance of both protection mechanisms. Second, we explore if modifying one type of protection mechanism (ownership verification) so as to decouple it from factors that may be impacted by a conflicting mechanism (differentially private training or robustness to model evasion) can avoid conflict. We show that this approach can avoid the conflict between ownership verification mechanisms when combined with differentially private training, but has no effect on robustness to model evasion. Finally, we identify the gaps in the landscape of studying interactions between other types of ML protection mechanisms.

Submitted 21 November, 2022; v1 submitted 5 July, 2022; originally announced July 2022.

Comments: To appear in AAAI 2023; this is an extended technical report. 11 tables, 3 figures

arXiv:2204.09649 [pdf, other]

doi 10.14722/ndss.2024.24105

BliMe: Verifiably Secure Outsourced Computation with Hardware-Enforced Taint Tracking

Authors: Hossam ElAtali, Lachlan J. Gunn, Hans Liljestrand, N. Asokan

Abstract: Outsourced computing is widely used today. However, current approaches for protecting client data in outsourced computing fall short: use of cryptographic techniques like fully-homomorphic encryption incurs substantial costs, whereas use of hardware-assisted trusted execution environments has been shown to be vulnerable to run-time and side-channel attacks. We present Blinded Memory (BliMe), an architecture to realize efficient and secure outsourced computation. BliMe consists of a novel and minimal set of instruction set architecture (ISA) extensions implementing a taint-tracking policy to ensure the confidentiality of client data even in the presence of server vulnerabilities. To secure outsourced computation, the BliMe extensions can be used together with an attestable, fixed-function hardware security module (HSM) and an encryption engine that provides atomic decrypt-and-taint and encrypt-and-untaint operations. Clients rely on remote attestation and key agreement with the HSM to ensure that their data can be transferred securely to and from the encryption engine and will always be protected by BliMe's taint-tracking policy while at the server. We provide an RTL implementation, BliMe-BOOM, based on the BOOM RISC-V core. BliMe-BOOM requires no reduction in clock frequency relative to unmodified BOOM, and has minimal power ($<1.5\%$) and FPGA resource ($\leq 9.0\%$) overheads. Various implementations of BliMe incur only moderate performance overhead ($8$--$25\%$). We also provide a machine-checked security proof of a simplified model ISA with BliMe extensions.

Submitted 29 November, 2023; v1 submitted 20 April, 2022; originally announced April 2022.

Comments: Accepted for publication at the Network and Distributed System Security (NDSS) Symposium 2024

arXiv:2204.04741 [pdf, other]

Is GitHub's Copilot as Bad as Humans at Introducing Vulnerabilities in Code?

Authors: Owura Asare, Meiyappan Nagappan, N. Asokan

Abstract: Several advances in deep learning have been successfully applied to the software development process. Of recent interest is the use of neural language models to build tools, such as Copilot, that assist in writing code. In this paper, we perform a comparative empirical analysis of Copilot-generated code from a security perspective. The aim of this study is to determine if Copilot is as bad as human developers. We investigate whether Copilot is just as likely to introduce the same software vulnerabilities as human developers. Using a dataset of C/C++ vulnerabilities, we prompt Copilot to generate suggestions in scenarios that led to the introduction of vulnerabilities by human developers. The suggestions are inspected and categorized in a 2-stage process based on whether the original vulnerability or fix is reintroduced. We find that Copilot replicates the original vulnerable code about 33% of the time while replicating the fixed code at a 25% rate. However, this behaviour is not consistent: Copilot is more likely to introduce some types of vulnerabilities than others and is also more likely to generate vulnerable code in response to prompts that correspond to older vulnerabilities. Overall, given that in a significant number of cases it did not replicate the vulnerabilities previously introduced by human developers, we conclude that Copilot, despite performing differently across various vulnerability types, is not as bad as human developers at introducing vulnerabilities in code.

Submitted 5 January, 2024; v1 submitted 10 April, 2022; originally announced April 2022.

Comments: Accepted for publication in Empirical Software Engineering

arXiv:2204.03781 [pdf, other]

Color My World: Deterministic Tagging for Memory Safety

Authors: Hans Liljestrand, Carlos Chinea, Rémi Denis-Courmont, Jan-Erik Ekberg, N. Asokan

Abstract: Hardware-assisted memory protection features are increasingly being deployed in COTS processors. ARMv8.5 Memory Tagging Extensions (MTE) is a recent example, which has been used to provide probabilistic checks for memory safety. This use of MTE is not secure against the standard adversary with arbitrary read/write access to memory. Consequently, MTE is used as a software development tool. In this paper, we present the first design for deterministic memory protection using MTE that can resist the standard adversary, and hence is suitable for post-deployment memory safety. We describe our compiler extensions for LLVM Clang implementing static analysis and subsequent MTE instrumentation. Via a comprehensive evaluation, we show that our scheme is effective.

Submitted 25 October, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

arXiv:2203.00162 [pdf, other]

Do Transformers know symbolic rules, and would we know if they did?

Authors: Tommi Gröndahl, Yujia Guo, N. Asokan

Abstract: To improve the explainability of leading Transformer networks used in NLP, it is important to tease apart genuine symbolic rules from merely associative input-output patterns. However, we identify several inconsistencies in how "symbolicity" has been construed in recent NLP literature. To mitigate this problem, we propose two criteria to be the most relevant, one pertaining to a system's internal architecture and the other to the dissociation between abstract rules and specific input identities. From this perspective, we critically examine prior work on the symbolic capacities of Transformers, and deem the results to be fundamentally inconclusive for reasons inherent in experiment design. We further maintain that there is no simple fix to this problem, since it arises -- to an extent -- in all end-to-end settings. Nonetheless, we emphasize the need for more robust evaluation of whether non-symbolic explanations exist for success in seemingly symbolic tasks. To facilitate this, we experiment on four sequence modelling tasks on the T5 Transformer in two experiment settings: zero-shot generalization, and generalization across class-specific vocabularies flipped between the training and test set. We observe that T5's generalization is markedly stronger in sequence-to-sequence tasks than in comparable classification tasks. Based on this, we propose a thus far overlooked analysis, where the Transformer itself does not need to be symbolic to be part of a symbolic architecture as the processor, operating on the input and output as external memory components.

Submitted 1 March, 2023; v1 submitted 19 February, 2022; originally announced March 2022.

Comments: 15 pages, 1 figure

arXiv:2202.12506 [pdf, other]

doi 10.1145/3510548.3519376

On the Effectiveness of Dataset Watermarking in Adversarial Settings

Authors: Buse Gul Atli Tekgul, N. Asokan

Abstract: In a data-driven world, datasets constitute a significant economic value. Dataset owners who spend time and money to collect and curate the data are incentivized to ensure that their datasets are not used in ways that they did not authorize. When such misuse occurs, dataset owners need technical mechanisms for demonstrating their ownership of the dataset in question. Dataset watermarking provides one approach for ownership demonstration which can, in turn, deter unauthorized use. In this paper, we investigate a recently proposed data provenance method, radioactive data, to assess if it can be used to demonstrate ownership of (image) datasets used to train machine learning (ML) models. The original paper reported that radioactive data is effective in white-box settings. We show that while this is true for large datasets with many classes, it is not as effective for datasets where the number of classes is low $(\leq 30)$ or the number of samples per class is low $(\leq 500)$. We also show that, counter-intuitively, the black-box verification technique is effective for all datasets used in this paper, even when white-box verification is not. Given this observation, we show that the confidence in white-box verification can be improved by using watermarked samples directly during the verification process. We also highlight the need to assess the robustness of radioactive data if it were to be used for ownership demonstration since it is an adversarial setting unlike provenance identification. Compared to dataset watermarking, ML model watermarking has been explored more extensively in recent literature. However, most of the model watermarking techniques can be defeated via model extraction. We show that radioactive data can effectively survive model extraction attacks, which raises the possibility that it can be used for ML model ownership verification robust against model extraction.

Submitted 25 February, 2022; originally announced February 2022.

Comments: 7 pages, 2 figures. Will appear in the proceedings of CODASPY-IWSPA 2022

ACM Class: I.2.0; I.4.9

arXiv:2112.02230 [pdf, other]

SHAPr: An Efficient and Versatile Membership Privacy Risk Metric for Machine Learning

Authors: Vasisht Duddu, Sebastian Szyller, N. Asokan

Abstract: Data used to train machine learning (ML) models can be sensitive. Membership inference attacks (MIAs), attempting to determine whether a particular data record was used to train an ML model, risk violating membership privacy. ML model builders need a principled definition of a metric to quantify the membership privacy risk of (a) individual training data records, (b) computed independently of specific MIAs, (c) which assesses susceptibility to different MIAs, (d) can be used for different applications, and (e) efficiently. None of the prior membership privacy risk metrics simultaneously meet all these requirements. We present SHAPr, a membership privacy metric based on Shapley values which is a leave-one-out (LOO) technique, originally intended to measure the contribution of a training data record on model utility. We conjecture that contribution to model utility can act as a proxy for memorization, and hence represent membership privacy risk. Using ten benchmark datasets, we show that SHAPr is indeed effective in estimating susceptibility of training data records to MIAs. We also show that, unlike prior work, SHAPr is significantly better in estimating susceptibility to newer, and more effective MIAs. We apply SHAPr to evaluate the efficacy of several defenses against MIAs: using regularization and removing high risk training data records. Moreover, SHAPr is versatile: it can be used for estimating vulnerability of different subgroups to MIAs, and inherits applications of Shapley values (e.g., data valuation). We show that SHAPr has an acceptable computational cost (compared to naive LOO), varying from a few minutes for the smallest dataset to ~92 minutes for the largest dataset.

Submitted 5 September, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

arXiv:2106.08746 [pdf, other]

Real-time Adversarial Perturbations against Deep Reinforcement Learning Policies: Attacks and Defenses

Authors: Buse G. A. Tekgul, Shelly Wang, Samuel Marchal, N. Asokan

Abstract: Deep reinforcement learning (DRL) is vulnerable to adversarial perturbations. Adversaries can mislead the policies of DRL agents by perturbing the state of the environment observed by the agents. Existing attacks are feasible in principle, but face challenges in practice, either by being too slow to fool DRL policies in real time or by modifying past observations stored in the agent's memory. We show that Universal Adversarial Perturbations (UAP), independent of the individual inputs to which they are applied, can fool DRL policies effectively and in real time. We introduce three attack variants leveraging UAP. Via an extensive evaluation using three Atari 2600 games, we show that our attacks are effective, as they fully degrade the performance of three different DRL agents (up to 100%, even when the $l_\infty$ bound on the perturbation is as small as 0.01). It is faster than the frame rate (60 Hz) of image capture and considerably faster than prior attacks ($\approx 1.8$ms). Our attack technique is also efficient, incurring an online computational cost of $\approx 0.027$ms. Using two tasks involving robotic movement, we confirm that our results generalize to complex DRL tasks. Furthermore, we demonstrate that the effectiveness of known defenses diminishes against universal perturbations. We introduce an effective technique that detects all known adversarial perturbations against DRL policies, including all universal perturbations presented in this paper.

Submitted 23 September, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

Comments: Will appear in the proceedings of ESORICS 2022; 13 pages, 6 figures, 6 tables

arXiv:2104.12623 [pdf, other]

Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against Image Translation Models

Authors: Sebastian Szyller, Vasisht Duddu, Tommi Gröndahl, N. Asokan

Abstract: Machine learning models are typically made available to potential client users via inference APIs. Model extraction attacks occur when a malicious client uses information gleaned from queries to the inference API of a victim model $F_V$ to build a surrogate model $F_A$ with comparable functionality. Recent research has shown successful model extraction of image classification and natural language processing models. In this paper, we show the first model extraction attack against real-world generative adversarial network (GAN) image translation models. We present a framework for conducting such attacks, and show that an adversary can successfully extract functional surrogate models by querying $F_V$ using data from the same domain as the training data for $F_V$. The adversary need not know $F_V$'s architecture or any other information about it beyond its intended task. We evaluate the effectiveness of our attacks using three different instances of two popular categories of image translation: (1) Selfie-to-Anime and (2) Monet-to-Photo (image style transfer), and (3) Super-Resolution (super resolution). Using standard performance metrics for GANs, we show that our attacks are effective. Furthermore, we conducted a large scale (125 participants) user study on Selfie-to-Anime and Monet-to-Photo to show that human perception of the images produced by $F_V$ and $F_A$ can be considered equivalent, within an equivalence bound of Cohen's d = 0.3. Finally, we show that existing defenses against model extraction attacks (watermarking, adversarial examples, poisoning) do not extend to image translation models.

Submitted 28 February, 2023; v1 submitted 26 April, 2021; originally announced April 2021.

Comments: 19 pages

arXiv:2009.12344 [pdf, other]

A little goes a long way: Improving toxic language classification despite data scarcity

Authors: Mika Juuti, Tommi Gröndahl, Adrian Flanagan, N. Asokan

Abstract: Detection of some types of toxic language is hampered by extreme scarcity of labeled training data. Data augmentation - generating new synthetic data from a labeled seed dataset - can help. The efficacy of data augmentation on toxic language classification has not been fully explored. We present the first systematic study on how data augmentation techniques impact performance across toxic language classifiers, ranging from shallow logistic regression architectures to BERT - a state-of-the-art pre-trained Transformer network. We compare the performance of eight techniques on very scarce seed datasets. We show that while BERT performed the best, shallow classifiers performed comparably when trained on data augmented with a combination of three techniques, including GPT-2-generated sentences. We discuss the interplay of performance and computational overhead, which can inform the choice of techniques under different constraints.

Submitted 24 October, 2020; v1 submitted 25 September, 2020; originally announced September 2020.

Comments: To appear in Findings of ACL: EMNLP 2020

arXiv:2008.07298 [pdf, other]

WAFFLE: Watermarking in Federated Learning

Authors: Buse Gul Atli, Yuxi Xia, Samuel Marchal, N. Asokan

Abstract: Federated learning is a distributed learning technique where machine learning models are trained on client devices in which the local training data resides. The training is coordinated via a central server which is, typically, controlled by the intended owner of the resulting model. By avoiding the need to transport the training data to the central server, federated learning improves privacy and efficiency. But it raises the risk of model theft by clients because the resulting model is available on every client device. Even if the application software used for local training may attempt to prevent direct access to the model, a malicious client may bypass any such restrictions by reverse engineering the application software. Watermarking is a well-known deterrence method against model theft by providing the means for model owners to demonstrate ownership of their models. Several recent deep neural network (DNN) watermarking techniques use backdooring: training the models with additional mislabeled data. Backdooring requires full access to the training data and control of the training process. This is feasible when a single party trains the model in a centralized manner, but not in a federated learning setting where the training process and training data are distributed among several client devices. In this paper, we present WAFFLE, the first approach to watermark DNN models trained using federated learning. It introduces a retraining step at the server after each aggregation of local models into the global model. We show that WAFFLE efficiently embeds a resilient watermark into models incurring only negligible degradation in test accuracy (-0.17%), and does not require access to training data. We also introduce a novel technique to generate the backdoor used as a watermark. It outperforms prior techniques, imposing no communication, and low computational (+3.2%) overhead.

Submitted 22 July, 2021; v1 submitted 17 August, 2020; originally announced August 2020.

Comments: Will appear in the proceedings of SRDS 2021; 14 pages, 11 figures, 10 tables

arXiv:1910.05429 [pdf, other]

Extraction of Complex DNN Models: Real Threat or Boogeyman?

Authors: Buse Gul Atli, Sebastian Szyller, Mika Juuti, Samuel Marchal, N. Asokan

Abstract: Recently, machine learning (ML) has introduced advanced solutions to many domains. Since ML models provide business advantage to model owners, protecting intellectual property of ML models has emerged as an important consideration. Confidentiality of ML models can be protected by exposing them to clients only via prediction APIs. However, model extraction attacks can steal the functionality of ML models using the information leaked to clients through the results returned via the API. In this work, we question whether model extraction is a serious threat to complex, real-life ML models. We evaluate the current state-of-the-art model extraction attack (Knockoff nets) against complex models. We reproduce and confirm the results in the original paper. But we also show that the performance of this attack can be limited by several factors, including ML model architecture and the granularity of API response. Furthermore, we introduce a defense based on distinguishing queries used for Knockoff nets from benign queries. Despite the limitations of the Knockoff nets, we show that a more realistic adversary can effectively steal complex ML models and evade known defenses.

Submitted 27 May, 2020; v1 submitted 11 October, 2019; originally announced October 2019.

Comments: 16 pages, 1 figure, Accepted for publication in AAAI-20 Workshop on Engineering Dependable and Secure Machine Learning Systems (AAAI-EDSMLS 2020)

arXiv:1909.05747 [pdf, other]

Protecting the stack with PACed canaries

Authors: Hans Liljestrand, Zaheer Gauhar, Thomas Nyman, Jan-Erik Ekberg, N. Asokan

Abstract: Stack canaries remain a widely deployed defense against memory corruption attacks. Despite their practical usefulness, canaries are vulnerable to memory disclosure and brute-forcing attacks. We propose PCan, a new approach based on ARMv8.3-A pointer authentication (PA), that uses dynamically-generated canaries to mitigate these weaknesses and show that it provides more fine-grained protection with minimal performance overhead.

Submitted 12 September, 2019; originally announced September 2019.

arXiv:1906.03397 [pdf, other]

doi 10.1145/3338501.3357366

Making targeted black-box evasion attacks effective and efficient

Authors: Mika Juuti, Buse Gul Atli, N. Asokan

Abstract: We investigate how an adversary can optimally use its query budget for targeted evasion attacks against deep neural networks in a black-box setting. We formalize the problem setting and systematically evaluate what benefits the adversary can gain by using substitute models. We show that there is an exploration-exploitation tradeoff in that query efficiency comes at the cost of effectiveness. We present two new attack strategies for using substitute models and show that they are as effective as previous query-only techniques but require significantly fewer queries, by up to three orders of magnitude. We also show that an agile adversary capable of switching through different attack techniques can achieve pareto-optimal efficiency. We demonstrate our attack against Google Cloud Vision showing that the difficulty of black-box attacks against real-world prediction APIs is significantly easier than previously thought (requiring approximately 500 queries instead of approximately 20,000 as in previous works).

Submitted 8 June, 2019; originally announced June 2019.

Comments: 12 pages, 10 figures

Journal ref: AISec 2019: Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security

arXiv:1906.00830 [pdf, other]

DAWN: Dynamic Adversarial Watermarking of Neural Networks

Authors: Sebastian Szyller, Buse Gul Atli, Samuel Marchal, N. Asokan

Abstract: Training machine learning (ML) models is expensive in terms of computational power, amounts of labeled data and human expertise. Thus, ML models constitute intellectual property (IP) and business value for their owners. Embedding digital watermarks during model training allows a model owner to later identify their models in case of theft or misuse. However, model functionality can also be stolen via model extraction, where an adversary trains a surrogate model using results returned from a prediction API of the original model. Recent work has shown that model extraction is a realistic threat. Existing watermarking schemes are ineffective against IP theft via model extraction since it is the adversary who trains the surrogate model. In this paper, we introduce DAWN (Dynamic Adversarial Watermarking of Neural Networks), the first approach to use watermarking to deter model extraction IP theft. Unlike prior watermarking schemes, DAWN does not impose changes to the training process but it operates at the prediction API of the protected model, by dynamically changing the responses for a small subset of queries (e.g., <0.5%) from API clients. This set is a watermark that will be embedded in case a client uses its queries to train a surrogate model. We show that DAWN is resilient against two state-of-the-art model extraction attacks, effectively watermarking all extracted surrogate models, allowing model owners to reliably demonstrate ownership (with confidence $>1- 2^{-64}$), incurring negligible loss of prediction accuracy (0.03-0.5%).

Submitted 16 July, 2021; v1 submitted 3 June, 2019; originally announced June 2019.

Comments: Shorter version of this work to appear in Proceedings of the ACM Multimedia 2021; 16 pages, 3 figures

arXiv:1905.13464 [pdf, other]

Effective writing style imitation via combinatorial paraphrasing

Authors: Tommi Gröndahl, N. Asokan

Abstract: Stylometry can be used to profile or deanonymize authors against their will based on writing style. Style transfer provides a defence. Current techniques typically use either encoder-decoder architectures or rule-based algorithms. Crucially, style transfer must reliably retain original semantic content to be actually deployable. We conduct a multifaceted evaluation of three state-of-the-art encoder-decoder style transfer techniques, and show that all fail at semantic retainment. In particular, they do not produce appropriate paraphrases, but only retain original content in the trivial case of exactly reproducing the text. To mitigate this problem we propose ParChoice: a technique based on the combinatorial application of multiple paraphrasing algorithms. ParChoice strongly outperforms the encoder-decoder baselines in semantic retainment. Additionally, compared to baselines that achieve non-negligible semantic retainment, ParChoice has superior style transfer performance. We also apply ParChoice to multi-author style imitation (not considered by prior work), where we achieve up to 75% imitation success among five authors. Furthermore, when compared to two state-of-the-art rule-based style transfer techniques, ParChoice has markedly better semantic retainment. Combining ParChoice with the best performing rule-based baseline (Mutant-X) also reaches the highest style transfer success on the Brennan-Greenstadt and Extended-Brennan-Greenstadt corpora, with much less impact on original meaning than when using the rule-based baseline techniques alone.
Finally, we highlight a critical problem that afflicts all current style transfer techniques: the adversary can use the same technique for thwarting style transfer via adversarial training. We show that adding randomness to style transfer helps to mitigate the effectiveness of adversarial training.

Submitted 3 July, 2020; v1 submitted 31 May, 2019; originally announced May 2019.

Comments: 16 pages, 1 figure, Accepted for publication in Privacy Enhancing Technologies (PETS2020)

arXiv:1905.10255 [pdf, other]

Making Speculative BFT Resilient with Trusted Monotonic Counters

Authors: Lachlan J. Gunn, Jian Liu, Bruno Vavala, N. Asokan

Abstract: Consensus mechanisms used by popular distributed ledgers are highly scalable but notoriously inefficient. Byzantine fault tolerance (BFT) protocols are efficient but far less scalable. Speculative BFT protocols such as Zyzzyva and Zyzzyva5 are efficient and scalable but require a trade-off: Zyzzyva requires only $3f + 1$ replicas to tolerate $f$ faults, but even a single slow replica will make Zyzzyva fall back to more expensive non-speculative operation. Zyzzyva5 does not require a non-speculative fallback, but requires $5f + 1$ replicas in order to tolerate $f$ faults. BFT variants using hardware-assisted trusted components can tolerate a greater proportion of faults, but require that every replica have this hardware. We present SACZyzzyva, addressing these concerns: resilience to slow replicas and requiring only $3f + 1$ replicas, with only one replica needing an active monotonic counter at any given time. We experimentally evaluate our protocols, demonstrating low latency and high scalability. We prove that SACZyzzyva is optimally robust and that trusted components cannot increase fault tolerance unless they are present in greater than two-thirds of replicas.

Submitted 13 October, 2019; v1 submitted 24 May, 2019; originally announced May 2019.

Comments: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

arXiv:1905.10242 [pdf, other]

PACStack: an Authenticated Call Stack

Authors: Hans Liljestrand, Thomas Nyman, Lachlan J. Gunn, Jan-Erik Ekberg, N. Asokan

Abstract: A popular run-time attack technique is to compromise the control-flow integrity of a program by modifying function return addresses on the stack. So far, shadow stacks have proven to be essential for comprehensively preventing return address manipulation. Shadow stacks record return addresses in integrity-protected memory secured with hardware-assistance or software access control. Software shadow stacks incur high overheads or trade off security for efficiency. Hardware-assisted shadow stacks are efficient and secure, but require the deployment of special-purpose hardware. We present authenticated call stack (ACS), an approach that uses chained message authentication codes (MACs). Our prototype, PACStack, uses the ARM general purpose hardware mechanism for pointer authentication (PA) to implement ACS. Via a rigorous security analysis, we show that PACStack achieves security comparable to hardware-assisted shadow stacks without requiring dedicated hardware. We demonstrate that PACStack's performance overhead is small (~3%).

Submitted 15 October, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

Comments: Author's version of article to appear in USENIX Security '21

arXiv:1902.09381 [pdf, other]

EAT: a simple and versatile semantic representation format for multi-purpose NLP

Authors: Tommi Gröndahl

Abstract: Semantic representations are central in many NLP tasks that require human-interpretable data. The conjunctivist framework - primarily developed by Pietroski (2005, 2018) - obtains expressive representations with only a few basic semantic types and relations systematically linked to syntactic positions. While representational simplicity is crucial for computational applications, such findings have not yet had major influence on NLP. We present the first generic semantic representation format for NLP directly based on these insights. We name the format EAT due to its basis in the Event-, Agent-, and Theme arguments in Neo-Davidsonian logical forms. It builds on the idea that similar tripartite argument relations are ubiquitous across categories, and can be constructed from grammatical structure without additional lexical information. We present a detailed exposition of EAT and how it relates to other prevalent formats used in prior work, such as Abstract Meaning Representation (AMR) and Minimal Recursion Semantics (MRS). EAT stands out in two respects: simplicity and versatility. Uniquely, EAT discards semantic metapredicates, and instead represents semantic roles entirely via positional encoding. This is made possible by limiting the number of roles to only three; a major decrease from the many dozens recognized in e.g. AMR and MRS. EAT's simplicity makes it exceptionally versatile in application. First, we show that drastically reducing semantic roles based on EAT benefits text generation from MRS in the test settings of Hajdik et al. (2019).
Second, we implement the derivation of EAT from a syntactic parse, and apply this for parallel corpus generation between grammatical classes. Third, we train an encoder-decoder LSTM network to map EAT to English. Finally, we use both the encoder-decoder network and a rule-based alternative to conduct grammatical transformation from EAT-input.

Submitted 12 March, 2021; v1 submitted 25 February, 2019; originally announced February 2019.

Comments: 34 pages

arXiv:1902.08939 [pdf, ps, other]

Text Analysis in Adversarial Settings: Does Deception Leave a Stylistic Trace?

Authors: Tommi Gröndahl, N. Asokan

Abstract: Textual deception constitutes a major problem for online security. Many studies have argued that deceptiveness leaves traces in writing style, which could be detected using text classification techniques. By conducting an extensive literature review of existing empirical work, we demonstrate that while certain linguistic features have been indicative of deception in certain corpora, they fail to generalize across divergent semantic domains. We suggest that deceptiveness as such leaves no content-invariant stylistic trace, and textual similarity measures provide superior means of classifying texts as potentially deceptive. Additionally, we discuss forms of deception beyond semantic content, focusing on hiding author identity by writing style obfuscation. Surveying the literature on both author identification and obfuscation techniques, we conclude that current style transformation methods fail to achieve reliable obfuscation while simultaneously ensuring semantic faithfulness to the original text. We propose that future work in style transformation should pay particular attention to disallowing semantically drastic changes.

Submitted 26 February, 2019; v1 submitted 24 February, 2019; originally announced February 2019.

Comments: 35 pages. To appear in ACM Computing Surveys (CSUR)

arXiv:1902.08359 [pdf, other]

Exploitation Techniques and Defenses for Data-Oriented Attacks

Authors: Long Cheng, Hans Liljestrand, Thomas Nyman, Yu Tsung Lee, Danfeng Yao, Trent Jaeger, N. Asokan

Abstract: Data-oriented attacks manipulate non-control data to alter a program's benign behavior without violating its control-flow integrity. It has been shown that such attacks can cause significant damage even in the presence of control-flow defense mechanisms. However, these threats have not been adequately addressed. In this SoK paper, we first map data-oriented exploits, including Data-Oriented Programming (DOP) attacks, to their assumptions/requirements and attack capabilities. We also compare known defenses against these attacks, in terms of approach, detection capabilities, overhead, and compatibility. Then, we experimentally assess the feasibility of a detection approach that is based on the Intel Processor Trace (PT) technology. PT only traces control flows, thus, is generally believed to be not useful for data-oriented security. However, our work reveals that data-oriented attacks (in particular the recent DOP attacks) may generate side-effects on control-flow behavior in multiple dimensions, which manifest in PT traces. Based on this evaluation, we discuss challenges for building deployable data-oriented defenses and open research questions.

Submitted 24 March, 2019; v1 submitted 21 February, 2019; originally announced February 2019.

arXiv:1811.09189 [pdf, other]

PAC it up: Towards Pointer Integrity using ARM Pointer Authentication

Authors: Hans Liljestrand, Thomas Nyman, Kui Wang, Carlos Chinea Perez, Jan-Erik Ekberg, N. Asokan

Abstract: Run-time attacks against programs written in memory-unsafe programming languages (e.g., C and C++) remain a prominent threat against computer systems. The prevalence of techniques like return-oriented programming (ROP) in attacking real-world systems has prompted major processor manufacturers to design hardware-based countermeasures against specific classes of run-time attacks. An example is the recently added support for pointer authentication (PA) in the ARMv8-A processor architecture, commonly used in devices like smartphones. PA is a low-cost technique to authenticate pointers so as to resist memory vulnerabilities. It has been shown to enable practical protection against memory vulnerabilities that corrupt return addresses or function pointers. However, so far, PA has received very little attention as a general purpose protection mechanism to harden software against various classes of memory attacks. In this paper, we use PA to build novel defenses against various classes of run-time attacks, including the first PA-based mechanism for data pointer integrity. We present PARTS, an instrumentation framework that integrates our PA-based defenses into the LLVM compiler and the GNU/Linux operating system and show, via systematic evaluation, that PARTS provides better protection than current solutions at a reasonable performance overhead.

Submitted 24 May, 2019; v1 submitted 22 November, 2018; originally announced November 2018.

Comments: Author's version of article to appear in USENIX Security 2019

arXiv:1810.06080 [pdf, other]

doi 10.1145/3338466.3358916

S-FaaS: Trustworthy and Accountable Function-as-a-Service using Intel SGX

Authors: Fritz Alder, N. Asokan, Arseny Kurnikov, Andrew Paverd, Michael Steiner

Abstract: Function-as-a-Service (FaaS) is a recent and already very popular paradigm in cloud computing. The function provider need only specify the function to be run, usually in a high-level language like JavaScript, and the service provider orchestrates all the necessary infrastructure and software stacks. The function provider is only billed for the actual computational resources used by the function invocation. Compared to previous cloud paradigms, FaaS requires significantly more fine-grained resource measurement mechanisms, e.g. to measure compute time and memory usage of a single function invocation with sub-second accuracy. Thanks to the short duration and stateless nature of functions, and the availability of multiple open-source frameworks, FaaS enables non-traditional service providers e.g. individuals or data centers with spare capacity. However, this exacerbates the challenge of ensuring that resource consumption is measured accurately and reported reliably. It also raises the issues of ensuring computation is done correctly and minimizing the amount of information leaked to service providers. To address these challenges, we introduce S-FaaS, the first architecture and implementation of FaaS to provide strong security and accountability guarantees backed by Intel SGX. To match the dynamic event-driven nature of FaaS, our design introduces a new key distribution enclave and a novel transitive attestation protocol.
A core contribution of S-FaaS is our set of resource measurement mechanisms that securely measure compute time inside an enclave, and actual memory allocations. We have integrated S-FaaS into the popular OpenWhisk FaaS framework. We evaluate the security of our architecture, the accuracy of our resource measurement mechanisms, and the performance of our implementation, showing that our resource measurement mechanisms add less than 6.3% latency on standardized benchmarks.

Submitted 14 October, 2018; originally announced October 2018.

arXiv:1808.09115 [pdf, ps, other]

All You Need is "Love": Evading Hate-speech Detection

Authors: Tommi Gröndahl, Luca Pajola, Mika Juuti, Mauro Conti, N. Asokan

Abstract: With the spread of social networks and their unfortunate use for hate speech, automatic detection of the latter has become a pressing problem. In this paper, we reproduce seven state-of-the-art hate speech detection models from prior work, and show that they perform well only when tested on the same type of data they were trained on. Based on these results, we argue that for successful hate speech detection, model architecture is less important than the type of data and labeling criteria. We further show that all proposed detection techniques are brittle against adversaries who can (automatically) insert typos, change word boundaries or add innocuous words to the original hate speech. A combination of these methods is also effective against Google Perspective -- a cutting-edge solution from industry. Our experiments demonstrate that adversarial training does not completely mitigate the attacks, and using character-level features makes the models systematically more attack-resistant than using word-level features.

Submitted 5 November, 2018; v1 submitted 28 August, 2018; originally announced August 2018.

Comments: 11 pages, Proceedings of the 11th ACM Workshop on Artificial Intelligence and Security (AISec) 2018

arXiv:1807.05002 [pdf, other]

doi 10.1109/TCAD.2018.2858422

ASSURED: Architecture for Secure Software Update of Realistic Embedded Devices

Authors: N. Asokan, Thomas Nyman, Norrathep Rattanavipanon, Ahmad-Reza Sadeghi, Gene Tsudik

Abstract: Secure firmware update is an important stage in the IoT device life-cycle. Prior techniques, designed for other computational settings, are not readily suitable for IoT devices, since they do not consider idiosyncrasies of a realistic large-scale IoT deployment. This motivates our design of ASSURED, a secure and scalable update framework for IoT. ASSURED includes all stakeholders in a typical IoT update ecosystem, while providing end-to-end security between manufacturers and devices. To demonstrate its feasibility and practicality, ASSURED is instantiated and experimentally evaluated on two commodity hardware platforms. Results show that ASSURED is considerably faster than current update mechanisms in realistic settings.

Submitted 18 October, 2018; v1 submitted 13 July, 2018; originally announced July 2018.

Comments: Author's version of the work that appeared in International Conference on Embedded Software (EMSOFT'18), October 2018, Turin, Italy. The definitive Version of Record was published in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 37, no. 11, Nov. 2018

Journal ref: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 37, no. 11, Nov. 2018

arXiv:1805.02628 [pdf, other]

PRADA: Protecting against DNN Model Stealing Attacks

Authors: Mika Juuti, Sebastian Szyller, Samuel Marchal, N. Asokan

Abstract: Machine learning (ML) applications are increasingly prevalent. Protecting the confidentiality of ML models becomes paramount for two reasons: (a) a model can be a business advantage to its owner, and (b) an adversary may use a stolen model to find transferable adversarial examples that can evade classification by the original model. Access to the model can be restricted to be only via well-defined prediction APIs. Nevertheless, prediction APIs still provide enough information to allow an adversary to mount model extraction attacks by sending repeated queries via the prediction API. In this paper, we describe new model extraction attacks using novel approaches for generating synthetic queries, and optimizing training hyperparameters. Our attacks outperform state-of-the-art model extraction in terms of transferability of both targeted and non-targeted adversarial examples (up to +29-44 percentage points, pp), and prediction accuracy (up to +46 pp) on two datasets. We provide take-aways on how to perform effective model extraction attacks. We then propose PRADA, the first step towards generic and effective detection of DNN model extraction attacks. It analyzes the distribution of consecutive API queries and raises an alarm when this distribution deviates from benign behavior. We show that PRADA can detect all prior model extraction attacks with no false positives.

Submitted 31 March, 2019; v1 submitted 7 May, 2018; originally announced May 2018.

Comments: 17 pages, 7 figures, 9 tables. Accepted for publication in the 4th IEEE European Symposium on Security and Privacy (EuroS&P 2019)

arXiv:1805.02400 [pdf, other]

Stay On-Topic: Generating Context-specific Fake Restaurant Reviews

Authors: Mika Juuti, Bo Sun, Tatsuya Mori, N. Asokan

Abstract: Automatically generated fake restaurant reviews are a threat to online review systems. Recent research has shown that users have difficulties in detecting machine-generated fake reviews hiding among real restaurant reviews. The method used in this work (char-LSTM) has one drawback: it has difficulties staying in context, i.e. when it generates a review for a specific target entity, the resulting review may contain phrases that are unrelated to the target, thus increasing its detectability. In this work, we present and evaluate a more sophisticated technique based on neural machine translation (NMT) with which we can generate reviews that stay on-topic. We test multiple variants of our technique using native English speakers on Amazon Mechanical Turk. We demonstrate that reviews generated by the best variant have almost optimal undetectability (class-averaged F-score 47%). We conduct a user study with skeptical users and show that our method evades detection more frequently compared to the state-of-the-art (average evasion 3.2/4 vs 1.5/4) with statistical significance, at level α = 1% (Section 4.3). We develop very effective detection tools and reach average F-score of 97% in classifying these. Although fake reviews are very effective in fooling people, effective automatic detection is still feasible.

Submitted 28 June, 2018; v1 submitted 7 May, 2018; originally announced May 2018.

Comments: 21 pages, 5 figures, 6 tables. Accepted for publication in the European Symposium on Research in Computer Security (ESORICS) 2018

arXiv:1804.08569 [pdf, other]

doi 10.1145/3230833.3234518

Keys in the Clouds: Auditable Multi-device Access to Cryptographic Credentials

Authors: Arseny Kurnikov, Andrew Paverd, Mohammad Mannan, N. Asokan

Abstract: Personal cryptographic keys are the foundation of many secure services, but storing these keys securely is a challenge, especially if they are used from multiple devices. Storing keys in a centralized location, like an Internet-accessible server, raises serious security concerns (e.g. server compromise). Hardware-based Trusted Execution Environments (TEEs) are a well-known solution for protecting sensitive data in untrusted environments, and are now becoming available on commodity server platforms. Although the idea of protecting keys using a server-side TEE is straightforward, in this paper we validate this approach and show that it enables new desirable functionality. We describe the design, implementation, and evaluation of a TEE-based Cloud Key Store (CKS), an online service for securely generating, storing, and using personal cryptographic keys. Using remote attestation, users receive strong assurance about the behaviour of the CKS, and can authenticate themselves using passwords while avoiding typical risks of password-based authentication like password theft or phishing. In addition, this design allows users to i) define policy-based access controls for keys; ii) delegate keys to other CKS users for a specified time and/or a limited number of uses; and iii) audit all key usages via a secure audit log. We have implemented a proof of concept CKS using Intel SGX and integrated this into GnuPG on Linux and OpenKeychain on Android. Our CKS implementation performs approximately 6,000 signature operations per second on a single desktop PC. The latency is in the same order of magnitude as using locally-stored keys, and 20x faster than smart cards.

Submitted 1 June, 2018; v1 submitted 23 April, 2018; originally announced April 2018.

Comments: Extended version of a paper to appear in the 3rd Workshop on Security, Privacy, and Identity Management in the Cloud (SECPID) 2018

arXiv:1804.07474 [pdf, other]

DÏoT: A Federated Self-learning Anomaly Detection System for IoT

Authors: Thien Duc Nguyen, Samuel Marchal, Markus Miettinen, Hossein Fereidooni, N. Asokan, Ahmad-Reza Sadeghi

Abstract: IoT devices are increasingly deployed in daily life. Many of these devices are, however, vulnerable due to insecure design, implementation, and configuration. As a result, many networks already have vulnerable IoT devices that are easy to compromise. This has led to a new category of malware specifically targeting IoT devices. However, existing intrusion detection techniques are not effective in detecting compromised IoT devices given the massive scale of the problem in terms of the number of different types of devices and manufacturers involved. In this paper, we present DÏoT, an autonomous self-learning distributed system for detecting compromised IoT devices effectively. In contrast to prior work, DÏoT uses a novel self-learning approach to classify devices into device types and build normal communication profiles for each of these that can subsequently be used to detect anomalous deviations in communication patterns. DÏoT utilizes a federated learning approach for aggregating behavior profiles efficiently. To the best of our knowledge, it is the first system to employ a federated learning approach to anomaly-detection-based intrusion detection. Consequently, DÏoT can cope with emerging new and unknown attacks. We systematically and extensively evaluated more than 30 off-the-shelf IoT devices over a long term and show that DÏoT is highly effective (95.6% detection rate) and fast (~257 ms) at detecting devices compromised by, for instance, the infamous Mirai malware. DÏoT reported no false alarms when evaluated in a real-world smart home deployment setting.

Submitted 10 May, 2019; v1 submitted 20 April, 2018; originally announced April 2018.

Comments: Accepted version of paper to appear at ICDCS 2019, Dallas, TX, USA, July 2019

Journal ref: Proceedings of the 39th IEEE International Conference on Distributed Computing Systems (ICDCS), 2019

arXiv:1803.11021 [pdf, other]

doi 10.1109/DSN.2018.00031

Migrating SGX Enclaves with Persistent State

Authors: Fritz Alder, Arseny Kurnikov, Andrew Paverd, N. Asokan

Abstract: Hardware-supported security mechanisms like Intel Software Guard Extensions (SGX) provide strong security guarantees, which are particularly relevant in cloud settings. However, their reliance on physical hardware conflicts with cloud practices, like migration of VMs between physical platforms. For instance, the SGX trusted execution environment (enclave) is bound to a single physical CPU. Although prior work has proposed an effective mechanism to migrate an enclave's data memory, it overlooks the migration of persistent state, including sealed data and monotonic counters; the former risks data loss whilst the latter undermines the SGX security guarantees. We show how this can be exploited to mount attacks, and then propose an improved enclave migration approach guaranteeing the consistency of persistent state. Our software-only approach enables migratable sealed data and monotonic counters, maintains all SGX security guarantees, minimizes developer effort, and incurs negligible performance overhead.

Submitted 29 March, 2018; originally announced March 2018.

Showing 1–50 of 79 results for author: Asokan, N