-
ASRJam: Human-Friendly AI Speech Jamming to Prevent Automated Phone Scams
Authors:
Freddie Grabovski,
Gilad Gressel,
Yisroel Mirsky
Abstract:
Large Language Models (LLMs), combined with Text-to-Speech (TTS) and Automatic Speech Recognition (ASR), are increasingly used to automate voice phishing (vishing) scams. These systems are scalable and convincing, posing a significant security threat. We identify the ASR transcription step as the most vulnerable link in the scam pipeline and introduce ASRJam, a proactive defence framework that inj…
▽ More
Large Language Models (LLMs), combined with Text-to-Speech (TTS) and Automatic Speech Recognition (ASR), are increasingly used to automate voice phishing (vishing) scams. These systems are scalable and convincing, posing a significant security threat. We identify the ASR transcription step as the most vulnerable link in the scam pipeline and introduce ASRJam, a proactive defence framework that injects adversarial perturbations into the victim's audio to disrupt the attacker's ASR. This breaks the scam's feedback loop without affecting human callers, who can still understand the conversation. While prior adversarial audio techniques are often unpleasant and impractical for real-time use, we also propose EchoGuard, a novel jammer that leverages natural distortions, such as reverberation and echo, that are disruptive to ASR but tolerable to humans. To evaluate EchoGuard's effectiveness and usability, we conducted a 39-person user study comparing it with three state-of-the-art attacks. Results show that EchoGuard achieved the highest overall utility, offering the best combination of ASR disruption and human listening experience.
△ Less
Submitted 10 June, 2025;
originally announced June 2025.
-
PaniCar: Securing the Perception of Advanced Driving Assistance Systems Against Emergency Vehicle Lighting
Authors:
Elad Feldman,
Jacob Shams,
Dudi Biton,
Alfred Chen,
Shaoyuan Xie,
Satoru Koda,
Yisroel Mirsky,
Asaf Shabtai,
Yuval Elovici,
Ben Nassi
Abstract:
The safety of autonomous cars has come under scrutiny in recent years, especially after 16 documented incidents involving Teslas (with autopilot engaged) crashing into parked emergency vehicles (police cars, ambulances, and firetrucks). While previous studies have revealed that strong light sources often introduce flare artifacts in the captured image, which degrade the image quality, the impact o…
▽ More
The safety of autonomous cars has come under scrutiny in recent years, especially after 16 documented incidents involving Teslas (with autopilot engaged) crashing into parked emergency vehicles (police cars, ambulances, and firetrucks). While previous studies have revealed that strong light sources often introduce flare artifacts in the captured image, which degrade the image quality, the impact of flare on object detection performance remains unclear. In this research, we unveil PaniCar, a digital phenomenon that causes an object detector's confidence score to fluctuate below detection thresholds when exposed to activated emergency vehicle lighting. This vulnerability poses a significant safety risk, and can cause autonomous vehicles to fail to detect objects near emergency vehicles. In addition, this vulnerability could be exploited by adversaries to compromise the security of advanced driving assistance systems (ADASs). We assess seven commercial ADASs (Tesla Model 3, "manufacturer C", HP, Pelsee, AZDOME, Imagebon, Rexing), four object detectors (YOLO, SSD, RetinaNet, Faster R-CNN), and 14 patterns of emergency vehicle lighting to understand the influence of various technical and environmental factors. We also evaluate four SOTA flare removal methods and show that their performance and latency are insufficient for real-time driving constraints. To mitigate this risk, we propose Caracetamol, a robust framework designed to enhance the resilience of object detectors against the effects of activated emergency vehicle lighting. Our evaluation shows that on YOLOv3 and Faster RCNN, Caracetamol improves the models' average confidence of car detection by 0.20, the lower confidence bound by 0.33, and reduces the fluctuation range by 0.33. In addition, Caracetamol is capable of processing frames at a rate of between 30-50 FPS, enabling real-time ADAS car detection.
△ Less
Submitted 8 May, 2025;
originally announced May 2025.
-
Memory Backdoor Attacks on Neural Networks
Authors:
Eden Luzon,
Guy Amit,
Roy Weiss,
Yisroel Mirsky
Abstract:
Neural networks, such as image classifiers, are frequently trained on proprietary and confidential datasets. It is generally assumed that once deployed, the training data remains secure, as adversaries are limited to query response interactions with the model, where at best, fragments of arbitrary data can be inferred without any guarantees on their authenticity. In this paper, we propose the memo…
▽ More
Neural networks, such as image classifiers, are frequently trained on proprietary and confidential datasets. It is generally assumed that once deployed, the training data remains secure, as adversaries are limited to query response interactions with the model, where at best, fragments of arbitrary data can be inferred without any guarantees on their authenticity. In this paper, we propose the memory backdoor attack, where a model is covertly trained to memorize specific training samples and later selectively output them when triggered with an index pattern. What makes this attack unique is that it (1) works even when the tasks conflict (making a classifier output images), (2) enables the systematic extraction of training samples from deployed models and (3) offers guarantees on the extracted authenticity of the data. We demonstrate the attack on image classifiers, segmentation models, and a large language model (LLM). We demonstrate the attack on image classifiers, segmentation models, and a large language model (LLM). With this attack, it is possible to hide thousands of images and texts in modern vision architectures and LLMs respectively, all while maintaining model performance. The memory back door attack poses a significant threat not only to conventional model deployments but also to federated learning paradigms and other modern frameworks. Therefore, we suggest an efficient and effective countermeasure that can be immediately applied and advocate for further work on the topic.
△ Less
Submitted 21 November, 2024;
originally announced November 2024.
-
Efficient Model Extraction via Boundary Sampling
Authors:
Maor Biton Dor,
Yisroel Mirsky
Abstract:
This paper introduces a novel data-free model extraction attack that significantly advances the current state-of-the-art in terms of efficiency, accuracy, and effectiveness. Traditional black-box methods rely on using the victim's model as an oracle to label a vast number of samples within high-confidence areas. This approach not only requires an extensive number of queries but also results in a l…
▽ More
This paper introduces a novel data-free model extraction attack that significantly advances the current state-of-the-art in terms of efficiency, accuracy, and effectiveness. Traditional black-box methods rely on using the victim's model as an oracle to label a vast number of samples within high-confidence areas. This approach not only requires an extensive number of queries but also results in a less accurate and less transferable model. In contrast, our method innovates by focusing on sampling low-confidence areas (along the decision boundaries) and employing an evolutionary algorithm to optimize the sampling process. These novel contributions allow for a dramatic reduction in the number of queries needed by the attacker by a factor of 10x to 600x while simultaneously improving the accuracy of the stolen model. Moreover, our approach improves boundary alignment, resulting in better transferability of adversarial examples from the stolen model to the victim's model (increasing the attack success rate from 60\% to 82\% on average). Finally, we accomplish all of this with a strict black-box assumption on the victim, with no knowledge of the target's architecture or dataset.
We demonstrate our attack on three datasets with increasingly larger resolutions and compare our performance to four state-of-the-art model extraction attacks.
△ Less
Submitted 20 October, 2024;
originally announced October 2024.
-
PEAS: A Strategy for Crafting Transferable Adversarial Examples
Authors:
Bar Avraham,
Yisroel Mirsky
Abstract:
Black box attacks, where adversaries have limited knowledge of the target model, pose a significant threat to machine learning systems. Adversarial examples generated with a substitute model often suffer from limited transferability to the target model. While recent work explores ranking perturbations for improved success rates, these methods see only modest gains. We propose a novel strategy call…
▽ More
Black box attacks, where adversaries have limited knowledge of the target model, pose a significant threat to machine learning systems. Adversarial examples generated with a substitute model often suffer from limited transferability to the target model. While recent work explores ranking perturbations for improved success rates, these methods see only modest gains. We propose a novel strategy called PEAS that can boost the transferability of existing black box attacks. PEAS leverages the insight that samples which are perceptually equivalent exhibit significant variability in their adversarial transferability. Our approach first generates a set of images from an initial sample via subtle augmentations. We then evaluate the transferability of adversarial perturbations on these images using a set of substitute models. Finally, the most transferable adversarial example is selected and used for the attack. Our experiments show that PEAS can double the performance of existing attacks, achieving a 2.5x improvement in attack success rates on average over current ranking methods. We thoroughly evaluate PEAS on ImageNet and CIFAR-10, analyze hyperparameter impacts, and provide an ablation study to isolate each component's importance.
△ Less
Submitted 20 October, 2024;
originally announced October 2024.
-
The Best Defense is a Good Offense: Countering LLM-Powered Cyberattacks
Authors:
Daniel Ayzenshteyn,
Roy Weiss,
Yisroel Mirsky
Abstract:
As large language models (LLMs) continue to evolve, their potential use in automating cyberattacks becomes increasingly likely. With capabilities such as reconnaissance, exploitation, and command execution, LLMs could soon become integral to autonomous cyber agents, capable of launching highly sophisticated attacks. In this paper, we introduce novel defense strategies that exploit the inherent vul…
▽ More
As large language models (LLMs) continue to evolve, their potential use in automating cyberattacks becomes increasingly likely. With capabilities such as reconnaissance, exploitation, and command execution, LLMs could soon become integral to autonomous cyber agents, capable of launching highly sophisticated attacks. In this paper, we introduce novel defense strategies that exploit the inherent vulnerabilities of attacking LLMs. By targeting weaknesses such as biases, trust in input, memory limitations, and their tunnel-vision approach to problem-solving, we develop techniques to mislead, delay, or neutralize these autonomous agents. We evaluate our defenses under black-box conditions, starting with single prompt-response scenarios and progressing to real-world tests using custom-built CTF machines. Our results show defense success rates of up to 90\%, demonstrating the effectiveness of turning LLM vulnerabilities into defensive strategies against LLM-driven cyber threats.
△ Less
Submitted 20 October, 2024;
originally announced October 2024.
-
Are You Human? An Adversarial Benchmark to Expose LLMs
Authors:
Gilad Gressel,
Rahul Pankajakshan,
Yisroel Mirsky
Abstract:
Large Language Models (LLMs) have demonstrated an alarming ability to impersonate humans in conversation, raising concerns about their potential misuse in scams and deception. Humans have a right to know if they are conversing to an LLM. We evaluate text-based prompts designed as challenges to expose LLM imposters in real-time. To this end we compile and release an open-source benchmark dataset th…
▽ More
Large Language Models (LLMs) have demonstrated an alarming ability to impersonate humans in conversation, raising concerns about their potential misuse in scams and deception. Humans have a right to know if they are conversing to an LLM. We evaluate text-based prompts designed as challenges to expose LLM imposters in real-time. To this end we compile and release an open-source benchmark dataset that includes 'implicit challenges' that exploit an LLM's instruction-following mechanism to cause role deviation, and 'exlicit challenges' that test an LLM's ability to perform simple tasks typically easy for humans but difficult for LLMs. Our evaluation of 9 leading models from the LMSYS leaderboard revealed that explicit challenges successfully detected LLMs in 78.4% of cases, while implicit challenges were effective in 22.9% of instances. User studies validate the real-world applicability of our methods, with humans outperforming LLMs on explicit challenges (78% vs 22% success rate). Our framework unexpectedly revealed that many study participants were using LLMs to complete tasks, demonstrating its effectiveness in detecting both AI impostors and human misuse of AI tools. This work addresses the critical need for reliable, real-time LLM detection methods in high-stakes conversations.
△ Less
Submitted 20 December, 2024; v1 submitted 12 October, 2024;
originally announced October 2024.
-
Back-in-Time Diffusion: Unsupervised Detection of Medical Deepfakes
Authors:
Fred Grabovski,
Lior Yasur,
Guy Amit,
Yisroel Mirsky
Abstract:
Recent progress in generative models has made it easier for a wide audience to edit and create image content, raising concerns about the proliferation of deepfakes, especially in healthcare. Despite the availability of numerous techniques for detecting manipulated images captured by conventional cameras, their applicability to medical images is limited. This limitation stems from the distinctive f…
▽ More
Recent progress in generative models has made it easier for a wide audience to edit and create image content, raising concerns about the proliferation of deepfakes, especially in healthcare. Despite the availability of numerous techniques for detecting manipulated images captured by conventional cameras, their applicability to medical images is limited. This limitation stems from the distinctive forensic characteristics of medical images, a result of their imaging process.
In this work we propose a novel anomaly detector for medical imagery based on diffusion models. Normally, diffusion models are used to generate images. However, we show how a similar process can be used to detect synthetic content by making a model reverse the diffusion on a suspected image. We evaluate our method on the task of detecting fake tumors injected and removed from CT and MRI scans. Our method significantly outperforms other state of the art unsupervised detectors with an increased AUC of 0.9 from 0.79 for injection and of 0.96 from 0.91 for removal on average. We also explore our hypothesis using AI explainability tools and publish our code and new medical deepfake datasets to encourage further research into this domain.
△ Less
Submitted 21 October, 2024; v1 submitted 21 July, 2024;
originally announced July 2024.
-
Counter-Samples: A Stateless Strategy to Neutralize Black Box Adversarial Attacks
Authors:
Roey Bokobza,
Yisroel Mirsky
Abstract:
Our paper presents a novel defence against black box attacks, where attackers use the victim model as an oracle to craft their adversarial examples. Unlike traditional preprocessing defences that rely on sanitizing input samples, our stateless strategy counters the attack process itself. For every query we evaluate a counter-sample instead, where the counter-sample is the original sample optimized…
▽ More
Our paper presents a novel defence against black box attacks, where attackers use the victim model as an oracle to craft their adversarial examples. Unlike traditional preprocessing defences that rely on sanitizing input samples, our stateless strategy counters the attack process itself. For every query we evaluate a counter-sample instead, where the counter-sample is the original sample optimized against the attacker's objective. By countering every black box query with a targeted white box optimization, our strategy effectively introduces an asymmetry to the game to the defender's advantage. This defence not only effectively misleads the attacker's search for an adversarial example, it also preserves the model's accuracy on legitimate inputs and is generic to multiple types of attacks.
We demonstrate that our approach is remarkably effective against state-of-the-art black box attacks and outperforms existing defences for both the CIFAR-10 and ImageNet datasets. Additionally, we also show that the proposed defence is robust against strong adversaries as well.
△ Less
Submitted 14 March, 2024;
originally announced March 2024.
-
What Was Your Prompt? A Remote Keylogging Attack on AI Assistants
Authors:
Roy Weiss,
Daniel Ayzenshteyn,
Guy Amit,
Yisroel Mirsky
Abstract:
AI assistants are becoming an integral part of society, used for asking advice or help in personal and confidential issues. In this paper, we unveil a novel side-channel that can be used to read encrypted responses from AI Assistants over the web: the token-length side-channel. We found that many vendors, including OpenAI and Microsoft, have this side-channel.
However, inferring the content of a…
▽ More
AI assistants are becoming an integral part of society, used for asking advice or help in personal and confidential issues. In this paper, we unveil a novel side-channel that can be used to read encrypted responses from AI Assistants over the web: the token-length side-channel. We found that many vendors, including OpenAI and Microsoft, have this side-channel.
However, inferring the content of a response from a token-length sequence alone proves challenging. This is because tokens are akin to words, and responses can be several sentences long leading to millions of grammatically correct sentences. In this paper, we show how this can be overcome by (1) utilizing the power of a large language model (LLM) to translate these sequences, (2) providing the LLM with inter-sentence context to narrow the search space and (3) performing a known-plaintext attack by fine-tuning the model on the target model's writing style.
Using these methods, we were able to accurately reconstruct 29\% of an AI assistant's responses and successfully infer the topic from 55\% of them. To demonstrate the threat, we performed the attack on OpenAI's ChatGPT-4 and Microsoft's Copilot on both browser and API traffic.
△ Less
Submitted 14 March, 2024;
originally announced March 2024.
-
Transpose Attack: Stealing Datasets with Bidirectional Training
Authors:
Guy Amit,
Mosh Levy,
Yisroel Mirsky
Abstract:
Deep neural networks are normally executed in the forward direction. However, in this work, we identify a vulnerability that enables models to be trained in both directions and on different tasks. Adversaries can exploit this capability to hide rogue models within seemingly legitimate models. In addition, in this work we show that neural networks can be taught to systematically memorize and retrie…
▽ More
Deep neural networks are normally executed in the forward direction. However, in this work, we identify a vulnerability that enables models to be trained in both directions and on different tasks. Adversaries can exploit this capability to hide rogue models within seemingly legitimate models. In addition, in this work we show that neural networks can be taught to systematically memorize and retrieve specific samples from datasets. Together, these findings expose a novel method in which adversaries can exfiltrate datasets from protected learning environments under the guise of legitimate models. We focus on the data exfiltration attack and show that modern architectures can be used to secretly exfiltrate tens of thousands of samples with high fidelity, high enough to compromise data privacy and even train new models. Moreover, to mitigate this threat we propose a novel approach for detecting infected models.
△ Less
Submitted 17 May, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Discussion Paper: The Threat of Real Time Deepfakes
Authors:
Guy Frankovits,
Yisroel Mirsky
Abstract:
Generative deep learning models are able to create realistic audio and video. This technology has been used to impersonate the faces and voices of individuals. These ``deepfakes'' are being used to spread misinformation, enable scams, perform fraud, and blackmail the innocent. The technology continues to advance and today attackers have the ability to generate deepfakes in real-time. This new capa…
▽ More
Generative deep learning models are able to create realistic audio and video. This technology has been used to impersonate the faces and voices of individuals. These ``deepfakes'' are being used to spread misinformation, enable scams, perform fraud, and blackmail the innocent. The technology continues to advance and today attackers have the ability to generate deepfakes in real-time. This new capability poses a significant threat to society as attackers begin to exploit the technology in advances social engineering attacks. In this paper, we discuss the implications of this emerging threat, identify the challenges with preventing these attacks and suggest a better direction for researching stronger defences.
△ Less
Submitted 4 June, 2023;
originally announced June 2023.
-
Deepfake CAPTCHA: A Method for Preventing Fake Calls
Authors:
Lior Yasur,
Guy Frankovits,
Fred M. Grabovski,
Yisroel Mirsky
Abstract:
Deep learning technology has made it possible to generate realistic content of specific individuals. These `deepfakes' can now be generated in real-time which enables attackers to impersonate people over audio and video calls. Moreover, some methods only need a few images or seconds of audio to steal an identity. Existing defenses perform passive analysis to detect fake content. However, with the…
▽ More
Deep learning technology has made it possible to generate realistic content of specific individuals. These `deepfakes' can now be generated in real-time which enables attackers to impersonate people over audio and video calls. Moreover, some methods only need a few images or seconds of audio to steal an identity. Existing defenses perform passive analysis to detect fake content. However, with the rapid progress of deepfake quality, this may be a losing game.
In this paper, we propose D-CAPTCHA: an active defense against real-time deepfakes. The approach is to force the adversary into the spotlight by challenging the deepfake model to generate content which exceeds its capabilities. By doing so, passive detection becomes easier since the content will be distorted. In contrast to existing CAPTCHAs, we challenge the AI's ability to create content as opposed to its ability to classify content. In this work we focus on real-time audio deepfakes and present preliminary results on video.
In our evaluation we found that D-CAPTCHA outperforms state-of-the-art audio deepfake detectors with an accuracy of 91-100% depending on the challenge (compared to 71% without challenges). We also performed a study on 41 volunteers to understand how threatening current real-time deepfake attacks are. We found that the majority of the volunteers could not tell the difference between real and fake audio.
△ Less
Submitted 8 January, 2023;
originally announced January 2023.
-
Transferability Ranking of Adversarial Examples
Authors:
Mosh Levy,
Guy Amit,
Yuval Elovici,
Yisroel Mirsky
Abstract:
Adversarial transferability in black-box scenarios presents a unique challenge: while attackers can employ surrogate models to craft adversarial examples, they lack assurance on whether these examples will successfully compromise the target model. Until now, the prevalent method to ascertain success has been trial and error-testing crafted samples directly on the victim model. This approach, howev…
▽ More
Adversarial transferability in black-box scenarios presents a unique challenge: while attackers can employ surrogate models to craft adversarial examples, they lack assurance on whether these examples will successfully compromise the target model. Until now, the prevalent method to ascertain success has been trial and error-testing crafted samples directly on the victim model. This approach, however, risks detection with every attempt, forcing attackers to either perfect their first try or face exposure. Our paper introduces a ranking strategy that refines the transfer attack process, enabling the attacker to estimate the likelihood of success without repeated trials on the victim's system. By leveraging a set of diverse surrogate models, our method can predict transferability of adversarial examples. This strategy can be used to either select the best sample to use in an attack or the best perturbation to apply to a specific sample. Using our strategy, we were able to raise the transferability of adversarial examples from a mere 20% - akin to random selection-up to near upper-bound levels, with some scenarios even witnessing a 100% success rate. This substantial improvement not only sheds light on the shared susceptibilities across diverse architectures but also demonstrates that attackers can forego the detectable trial-and-error tactics raising increasing the threat of surrogate-based attacks.
△ Less
Submitted 18 April, 2024; v1 submitted 23 August, 2022;
originally announced August 2022.
-
DF-Captcha: A Deepfake Captcha for Preventing Fake Calls
Authors:
Yisroel Mirsky
Abstract:
Social engineering (SE) is a form of deception that aims to trick people into giving access to data, information, networks and even money. For decades SE has been a key method for attackers to gain access to an organization, virtually skipping all lines of defense. Attackers also regularly use SE to scam innocent people by making threatening phone calls which impersonate an authority or by sending…
▽ More
Social engineering (SE) is a form of deception that aims to trick people into giving access to data, information, networks and even money. For decades SE has been a key method for attackers to gain access to an organization, virtually skipping all lines of defense. Attackers also regularly use SE to scam innocent people by making threatening phone calls which impersonate an authority or by sending infected emails which look like they have been sent from a loved one. SE attacks will likely remain a top attack vector for criminals because humans are the weakest link in cyber security.
Unfortunately, the threat will only get worse now that a new technology called deepfakes as arrived. A deepfake is believable media (e.g., videos) created by an AI. Although the technology has mostly been used to swap the faces of celebrities, it can also be used to `puppet' different personas. Recently, researchers have shown how this technology can be deployed in real-time to clone someone's voice in a phone call or reenact a face in a video call. Given that any novice user can download this technology to use it, it is no surprise that criminals have already begun to monetize it to perpetrate their SE attacks.
In this paper, we propose a lightweight application which can protect organizations and individuals from deepfake SE attacks. Through a challenge and response approach, we leverage the technical and theoretical limitations of deepfake technologies to expose the attacker. Existing defence solutions are too heavy as an end-point solution and can be evaded by a dynamic attacker. In contrast, our approach is lightweight and breaks the reactive arms race, putting the attacker at a disadvantage.
△ Less
Submitted 17 August, 2022;
originally announced August 2022.
-
The Security of Deep Learning Defences for Medical Imaging
Authors:
Moshe Levy,
Guy Amit,
Yuval Elovici,
Yisroel Mirsky
Abstract:
Deep learning has shown great promise in the domain of medical image analysis. Medical professionals and healthcare providers have been adopting the technology to speed up and enhance their work. These systems use deep neural networks (DNN) which are vulnerable to adversarial samples; images with imperceivable changes that can alter the model's prediction. Researchers have proposed defences which…
▽ More
Deep learning has shown great promise in the domain of medical image analysis. Medical professionals and healthcare providers have been adopting the technology to speed up and enhance their work. These systems use deep neural networks (DNN) which are vulnerable to adversarial samples; images with imperceivable changes that can alter the model's prediction. Researchers have proposed defences which either make a DNN more robust or detect the adversarial samples before they do harm. However, none of these works consider an informed attacker which can adapt to the defence mechanism. We show that an informed attacker can evade five of the current state of the art defences while successfully fooling the victim's deep learning model, rendering these defences useless. We then suggest better alternatives for securing healthcare DNNs from such attacks: (1) harden the system's security and (2) use digital signatures.
△ Less
Submitted 21 January, 2022;
originally announced January 2022.
-
The Threat of Offensive AI to Organizations
Authors:
Yisroel Mirsky,
Ambra Demontis,
Jaidip Kotak,
Ram Shankar,
Deng Gelei,
Liu Yang,
Xiangyu Zhang,
Wenke Lee,
Yuval Elovici,
Battista Biggio
Abstract:
AI has provided us with the ability to automate tasks, extract information from vast amounts of data, and synthesize media that is nearly indistinguishable from the real thing. However, positive tools can also be used for negative purposes. In particular, cyber adversaries can use AI (such as machine learning) to enhance their attacks and expand their campaigns.
Although offensive AI has been di…
▽ More
AI has provided us with the ability to automate tasks, extract information from vast amounts of data, and synthesize media that is nearly indistinguishable from the real thing. However, positive tools can also be used for negative purposes. In particular, cyber adversaries can use AI (such as machine learning) to enhance their attacks and expand their campaigns.
Although offensive AI has been discussed in the past, there is a need to analyze and understand the threat in the context of organizations. For example, how does an AI-capable adversary impact the cyber kill chain? Does AI benefit the attacker more than the defender? What are the most significant AI threats facing organizations today and what will be their impact on the future?
In this survey, we explore the threat of offensive AI on organizations. First, we present the background and discuss how AI changes the adversary's methods, strategies, goals, and overall attack model. Then, through a literature review, we identify 33 offensive AI capabilities which adversaries can use to enhance their attacks. Finally, through a user study spanning industry and academia, we rank the AI threats and provide insights on the adversaries.
△ Less
Submitted 29 June, 2021;
originally announced June 2021.
-
IPatch: A Remote Adversarial Patch
Authors:
Yisroel Mirsky
Abstract:
Applications such as autonomous vehicles and medical screening use deep learning models to localize and identify hundreds of objects in a single frame. In the past, it has been shown how an attacker can fool these models by placing an adversarial patch within a scene. However, these patches must be placed in the target location and do not explicitly alter the semantics elsewhere in the image.
In…
▽ More
Applications such as autonomous vehicles and medical screening use deep learning models to localize and identify hundreds of objects in a single frame. In the past, it has been shown how an attacker can fool these models by placing an adversarial patch within a scene. However, these patches must be placed in the target location and do not explicitly alter the semantics elsewhere in the image.
In this paper, we introduce a new type of adversarial patch which alters a model's perception of an image's semantics. These patches can be placed anywhere within an image to change the classification or semantics of locations far from the patch. We call this new class of adversarial examples `remote adversarial patches' (RAP).
We implement our own RAP called IPatch and perform an in-depth analysis on image segmentation RAP attacks using five state-of-the-art architectures with eight different encoders on the CamVid street view dataset. Moreover, we demonstrate that the attack can be extended to object recognition models with preliminary results on the popular YOLOv3 model. We found that the patch can change the classification of a remote target region with a success rate of up to 93% on average.
△ Less
Submitted 2 June, 2021; v1 submitted 30 April, 2021;
originally announced May 2021.
-
Lightweight Collaborative Anomaly Detection for the IoT using Blockchain
Authors:
Yisroel Mirsky,
Tomer Golomb,
Yuval Elovici
Abstract:
Due to their rapid growth and deployment, the Internet of things (IoT) have become a central aspect of our daily lives. Unfortunately, IoT devices tend to have many vulnerabilities which can be exploited by an attacker. Unsupervised techniques, such as anomaly detection, can be used to secure these devices in a plug-and-protect manner.
However, anomaly detection models must be trained for a long…
▽ More
Due to their rapid growth and deployment, the Internet of things (IoT) have become a central aspect of our daily lives. Unfortunately, IoT devices tend to have many vulnerabilities which can be exploited by an attacker. Unsupervised techniques, such as anomaly detection, can be used to secure these devices in a plug-and-protect manner.
However, anomaly detection models must be trained for a long time in order to capture all benign behaviors. Furthermore, the anomaly detection model is vulnerable to adversarial attacks since, during the training phase, all observations are assumed to be benign. In this paper, we propose (1) a novel approach for anomaly detection and (2) a lightweight framework that utilizes the blockchain to ensemble an anomaly detection model in a distributed environment.
Blockchain framework incrementally updates a trusted anomaly detection model via self-attestation and consensus among the IoT devices. We evaluate our method on a distributed IoT simulation platform, which consists of 48 Raspberry Pis. The simulation demonstrates how the approach can enhance the security of each device and the security of the network as a whole.
△ Less
Submitted 18 June, 2020;
originally announced June 2020.
-
The Creation and Detection of Deepfakes: A Survey
Authors:
Yisroel Mirsky,
Wenke Lee
Abstract:
Generative deep learning algorithms have progressed to a point where it is difficult to tell the difference between what is real and what is fake. In 2018, it was discovered how easy it is to use this technology for unethical and malicious applications, such as the spread of misinformation, impersonation of political leaders, and the defamation of innocent individuals. Since then, these `deepfakes…
▽ More
Generative deep learning algorithms have progressed to a point where it is difficult to tell the difference between what is real and what is fake. In 2018, it was discovered how easy it is to use this technology for unethical and malicious applications, such as the spread of misinformation, impersonation of political leaders, and the defamation of innocent individuals. Since then, these `deepfakes' have advanced significantly.
In this paper, we explore the creation and detection of deepfakes and provide an in-depth view of how these architectures work. The purpose of this survey is to provide the reader with a deeper understanding of (1) how deepfakes are created and detected, (2) the current trends and advancements in this domain, (3) the shortcomings of the current defense solutions, and (4) the areas which require further research and attention.
△ Less
Submitted 13 September, 2020; v1 submitted 23 April, 2020;
originally announced April 2020.
-
DANTE: A framework for mining and monitoring darknet traffic
Authors:
Dvir Cohen,
Yisroel Mirsky,
Yuval Elovici,
Rami Puzis,
Manuel Kamp,
Tobias Martin,
Asaf Shabtai
Abstract:
Trillions of network packets are sent over the Internet to destinations which do not exist. This 'darknet' traffic captures the activity of botnets and other malicious campaigns aiming to discover and compromise devices around the world. In order to mine threat intelligence from this data, one must be able to handle large streams of logs and represent the traffic patterns in a meaningful way. Howe…
▽ More
Trillions of network packets are sent over the Internet to destinations which do not exist. This 'darknet' traffic captures the activity of botnets and other malicious campaigns aiming to discover and compromise devices around the world. In order to mine threat intelligence from this data, one must be able to handle large streams of logs and represent the traffic patterns in a meaningful way. However, by observing how network ports (services) are used, it is possible to capture the intent of each transmission. In this paper, we present DANTE: a framework and algorithm for mining darknet traffic. DANTE learns the meaning of targeted network ports by applying Word2Vec to observed port sequences. Then, when a host sends a new sequence, DANTE represents the transmission as the average embedding of the ports found that sequence. Finally, DANTE uses a novel and incremental time-series cluster tracking algorithm on observed sequences to detect recurring behaviors and new emerging threats. To evaluate the system, we ran DANTE on a full year of darknet traffic (over three Tera-Bytes) collected by the largest telecommunications provider in Europe, Deutsche Telekom and analyzed the results. DANTE discovered 1,177 new emerging threats and was able to track malicious campaigns over time. We also compared DANTE to the current best approach and found DANTE to be more practical and effective at detecting darknet traffic patterns.
△ Less
Submitted 5 March, 2020;
originally announced March 2020.
-
The Security of IP-based Video Surveillance Systems
Authors:
Naor Kalbo,
Yisroel Mirsky,
Asaf Shabtai,
Yuval Elovici
Abstract:
IP-based Surveillance systems protect industrial facilities, railways, gas stations, and even one's own home. Therefore, unauthorized access to these systems has serious security implications. In this survey, we analyze the system's (1) threat agents, (2) attack goals, (3) practical attacks, (4) possible attack outcomes, and (5) provide example attack vectors.
IP-based Surveillance systems protect industrial facilities, railways, gas stations, and even one's own home. Therefore, unauthorized access to these systems has serious security implications. In this survey, we analyze the system's (1) threat agents, (2) attack goals, (3) practical attacks, (4) possible attack outcomes, and (5) provide example attack vectors.
△ Less
Submitted 23 October, 2019;
originally announced October 2019.
-
Physical Layer Encryption using a Vernam Cipher
Authors:
Yisroel Mirsky,
Benjamin Fedidat,
Yoram Haddad
Abstract:
Secure communication is a necessity. However, encryption is commonly only applied to the upper layers of the protocol stack. This exposes network information to eavesdroppers, including the channel's type, data rate, protocol, and routing information. This may be solved by encrypting the physical layer, thereby securing all subsequent layers. In order for this method to be practical, the encryptio…
▽ More
Secure communication is a necessity. However, encryption is commonly only applied to the upper layers of the protocol stack. This exposes network information to eavesdroppers, including the channel's type, data rate, protocol, and routing information. This may be solved by encrypting the physical layer, thereby securing all subsequent layers. In order for this method to be practical, the encryption must be quick, preserve bandwidth, and must also deal with the issues of noise mitigation and synchronization.
In this paper, we present the Vernam Physical Signal Cipher (VPSC): a novel cipher which can encrypt the harmonic composition of any analog waveform. The VPSC accomplished this by applying a modified Vernam cipher to the signal's frequency magnitudes and phases. This approach is fast and preserves the signal's bandwidth. In the paper, we offer methods for noise mitigation and synchronization, and evaluate the VPSC over a noisy wireless channel with multi-path propagation interference.
△ Less
Submitted 18 October, 2019;
originally announced October 2019.
-
Online Budgeted Learning for Classifier Induction
Authors:
Eran Fainman,
Bracha Shapira,
Lior Rokach,
Yisroel Mirsky
Abstract:
In real-world machine learning applications, there is a cost associated with sampling of different features. Budgeted learning can be used to select which feature-values to acquire from each instance in a dataset, such that the best model is induced under a given constraint. However, this approach is not possible in the domain of online learning since one may not retroactively acquire feature-valu…
▽ More
In real-world machine learning applications, there is a cost associated with sampling of different features. Budgeted learning can be used to select which feature-values to acquire from each instance in a dataset, such that the best model is induced under a given constraint. However, this approach is not possible in the domain of online learning since one may not retroactively acquire feature-values from past instances. In online learning, the challenge is to find the optimum set of features to be acquired from each instance upon arrival from a data stream. In this paper we introduce the issue of online budgeted learning and describe a general framework for addressing this challenge. We propose two types of feature value acquisition policies based on the multi-armed bandit problem: random and adaptive. Adaptive policies perform online adjustments according to new information coming from a data stream, while random policies are not sensitive to the information that arrives from the data stream. Our comparative study on five real-world datasets indicates that adaptive policies outperform random policies for most budget limitations and datasets. Furthermore, we found that in some cases adaptive policies achieve near-optimal results.
△ Less
Submitted 13 March, 2019;
originally announced March 2019.
-
CT-GAN: Malicious Tampering of 3D Medical Imagery using Deep Learning
Authors:
Yisroel Mirsky,
Tom Mahler,
Ilan Shelef,
Yuval Elovici
Abstract:
In 2018, clinics and hospitals were hit with numerous attacks leading to significant data breaches and interruptions in medical services. An attacker with access to medical records can do much more than hold the data for ransom or sell it on the black market.
In this paper, we show how an attacker can use deep-learning to add or remove evidence of medical conditions from volumetric (3D) medical…
▽ More
In 2018, clinics and hospitals were hit with numerous attacks leading to significant data breaches and interruptions in medical services. An attacker with access to medical records can do much more than hold the data for ransom or sell it on the black market.
In this paper, we show how an attacker can use deep-learning to add or remove evidence of medical conditions from volumetric (3D) medical scans. An attacker may perform this act in order to stop a political candidate, sabotage research, commit insurance fraud, perform an act of terrorism, or even commit murder. We implement the attack using a 3D conditional GAN and show how the framework (CT-GAN) can be automated. Although the body is complex and 3D medical scans are very large, CT-GAN achieves realistic results which can be executed in milliseconds.
To evaluate the attack, we focused on injecting and removing lung cancer from CT scans. We show how three expert radiologists and a state-of-the-art deep learning AI are highly susceptible to the attack. We also explore the attack surface of a modern radiology network and demonstrate one attack vector: we intercepted and manipulated CT scans in an active hospital network with a covert penetration test.
Demo video: https://youtu.be/_mkRAArj-x0
Source code: https://github.com/ymirsky/CT-GAN
△ Less
Submitted 6 June, 2019; v1 submitted 11 January, 2019;
originally announced January 2019.
-
N-BaIoT: Network-based Detection of IoT Botnet Attacks Using Deep Autoencoders
Authors:
Yair Meidan,
Michael Bohadana,
Yael Mathov,
Yisroel Mirsky,
Dominik Breitenbacher,
Asaf Shabtai,
Yuval Elovici
Abstract:
The proliferation of IoT devices which can be more easily compromised than desktop computers has led to an increase in the occurrence of IoT based botnet attacks. In order to mitigate this new threat there is a need to develop new methods for detecting attacks launched from compromised IoT devices and differentiate between hour and millisecond long IoTbased attacks. In this paper we propose and em…
▽ More
The proliferation of IoT devices which can be more easily compromised than desktop computers has led to an increase in the occurrence of IoT based botnet attacks. In order to mitigate this new threat there is a need to develop new methods for detecting attacks launched from compromised IoT devices and differentiate between hour and millisecond long IoTbased attacks. In this paper we propose and empirically evaluate a novel network based anomaly detection method which extracts behavior snapshots of the network and uses deep autoencoders to detect anomalous network traffic emanating from compromised IoT devices. To evaluate our method, we infected nine commercial IoT devices in our lab with two of the most widely known IoT based botnets, Mirai and BASHLITE. Our evaluation results demonstrated our proposed method's ability to accurately and instantly detect the attacks as they were being launched from the compromised IoT devices which were part of a botnet.
△ Less
Submitted 9 May, 2018;
originally announced May 2018.
-
CIoTA: Collaborative IoT Anomaly Detection via Blockchain
Authors:
Tomer Golomb,
Yisroel Mirsky,
Yuval Elovici
Abstract:
Due to their rapid growth and deployment, Internet of things (IoT) devices have become a central aspect of our daily lives. However, they tend to have many vulnerabilities which can be exploited by an attacker. Unsupervised techniques, such as anomaly detection, can help us secure the IoT devices. However, an anomaly detection model must be trained for a long time in order to capture all benign be…
▽ More
Due to their rapid growth and deployment, Internet of things (IoT) devices have become a central aspect of our daily lives. However, they tend to have many vulnerabilities which can be exploited by an attacker. Unsupervised techniques, such as anomaly detection, can help us secure the IoT devices. However, an anomaly detection model must be trained for a long time in order to capture all benign behaviors. This approach is vulnerable to adversarial attacks since all observations are assumed to be benign while training the anomaly detection model.
In this paper, we propose CIoTA, a lightweight framework that utilizes the blockchain concept to perform distributed and collaborative anomaly detection for devices with limited resources. CIoTA uses blockchain to incrementally update a trusted anomaly detection model via self-attestation and consensus among IoT devices. We evaluate CIoTA on our own distributed IoT simulation platform, which consists of 48 Raspberry Pis, to demonstrate CIoTA's ability to enhance the security of each device and the security of the network as a whole.
△ Less
Submitted 9 April, 2018; v1 submitted 10 March, 2018;
originally announced March 2018.
-
Vesper: Using Echo-Analysis to Detect Man-in-the-Middle Attacks in LANs
Authors:
Yisroel Mirsky,
Naor Kalbo,
Yuval Elovici,
Asaf Shabtai
Abstract:
The Man-in-the-Middle (MitM) attack is a cyber-attack in which an attacker intercepts traffic, thus harming the confidentiality, integrity, and availability of the network. It remains a popular attack vector due to its simplicity. However, existing solutions are either not portable, suffer from a high false positive rate, or are simply not generic. In this paper, we propose Vesper: a novel plug-an…
▽ More
The Man-in-the-Middle (MitM) attack is a cyber-attack in which an attacker intercepts traffic, thus harming the confidentiality, integrity, and availability of the network. It remains a popular attack vector due to its simplicity. However, existing solutions are either not portable, suffer from a high false positive rate, or are simply not generic. In this paper, we propose Vesper: a novel plug-and-play MitM detector for local area networks. Vesper uses a technique inspired from impulse response analysis used in the domain of acoustic signal processing. Analogous to how echoes in a cave capture the shape and construction of the environment, so to can a short and intense pulse of ICMP echo requests model the link between two network hosts. Vesper uses neural networks called autoencoders to model the normal patterns of the echoed pulses, and detect when the environment changes. Using this technique, Vesper is able to detect MitM attacks with high accuracy while incurring minimal network overhead. We evaluate Vesper on LANs consisting of video surveillance cameras, servers, and PC workstations. We also investigate several possible adversarial attacks against Vesper, and demonstrate how Vesper mitigates these attacks.
△ Less
Submitted 7 March, 2018;
originally announced March 2018.
-
Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection
Authors:
Yisroel Mirsky,
Tomer Doitshman,
Yuval Elovici,
Asaf Shabtai
Abstract:
Neural networks have become an increasingly popular solution for network intrusion detection systems (NIDS). Their capability of learning complex patterns and behaviors make them a suitable solution for differentiating between normal traffic and network attacks. However, a drawback of neural networks is the amount of resources needed to train them. Many network gateways and routers devices, which…
▽ More
Neural networks have become an increasingly popular solution for network intrusion detection systems (NIDS). Their capability of learning complex patterns and behaviors make them a suitable solution for differentiating between normal traffic and network attacks. However, a drawback of neural networks is the amount of resources needed to train them. Many network gateways and routers devices, which could potentially host an NIDS, simply do not have the memory or processing power to train and sometimes even execute such models. More importantly, the existing neural network solutions are trained in a supervised manner. Meaning that an expert must label the network traffic and update the model manually from time to time.
In this paper, we present Kitsune: a plug and play NIDS which can learn to detect attacks on the local network, without supervision, and in an efficient online manner. Kitsune's core algorithm (KitNET) uses an ensemble of neural networks called autoencoders to collectively differentiate between normal and abnormal traffic patterns. KitNET is supported by a feature extraction framework which efficiently tracks the patterns of every network channel. Our evaluations show that Kitsune can detect various attacks with a performance comparable to offline anomaly detectors, even on a Raspberry PI. This demonstrates that Kitsune can be a practical and economic NIDS.
△ Less
Submitted 27 May, 2018; v1 submitted 25 February, 2018;
originally announced February 2018.
-
HVACKer: Bridging the Air-Gap by Attacking the Air Conditioning System
Authors:
Yisroel Mirsky,
Mordechai Guri,
Yuval Elovici
Abstract:
Modern corporations physically separate their sensitive computational infrastructure from public or other accessible networks in order to prevent cyber-attacks. However, attackers still manage to infect these networks, either by means of an insider or by infiltrating the supply chain. Therefore, an attacker's main challenge is to determine a way to command and control the compromised hosts that ar…
▽ More
Modern corporations physically separate their sensitive computational infrastructure from public or other accessible networks in order to prevent cyber-attacks. However, attackers still manage to infect these networks, either by means of an insider or by infiltrating the supply chain. Therefore, an attacker's main challenge is to determine a way to command and control the compromised hosts that are isolated from an accessible network (e.g., the Internet).
In this paper, we propose a new adversarial model that shows how an air gapped network can receive communications over a covert thermal channel. Concretely, we show how attackers may use a compromised air-conditioning system (connected to the internet) to send commands to infected hosts within an air-gapped network. Since thermal communication protocols are a rather unexplored domain, we propose a novel line-encoding and protocol suitable for this type of channel. Moreover, we provide experimental results to demonstrate the covert channel's feasibility, and to calculate the channel's bandwidth. Lastly, we offer a forensic analysis and propose various ways this channel can be detected and prevented.
We believe that this study details a previously unseen vector of attack that security experts should be aware of.
△ Less
Submitted 30 March, 2017;
originally announced March 2017.
-
9-1-1 DDoS: Threat, Analysis and Mitigation
Authors:
Mordechai Guri,
Yisroel Mirsky,
Yuval Elovici
Abstract:
The 911 emergency service belongs to one of the 16 critical infrastructure sectors in the United States. Distributed denial of service (DDoS) attacks launched from a mobile phone botnet pose a significant threat to the availability of this vital service. In this paper we show how attackers can exploit the cellular network protocols in order to launch an anonymized DDoS attack on 911. The current F…
▽ More
The 911 emergency service belongs to one of the 16 critical infrastructure sectors in the United States. Distributed denial of service (DDoS) attacks launched from a mobile phone botnet pose a significant threat to the availability of this vital service. In this paper we show how attackers can exploit the cellular network protocols in order to launch an anonymized DDoS attack on 911. The current FCC regulations require that all emergency calls be immediately routed regardless of the caller's identifiers (e.g., IMSI and IMEI). A rootkit placed within the baseband firmware of a mobile phone can mask and randomize all cellular identifiers, causing the device to have no genuine identification within the cellular network. Such anonymized phones can issue repeated emergency calls that cannot be blocked by the network or the emergency call centers, technically or legally. We explore the 911 infrastructure and discuss why it is susceptible to this kind of attack. We then implement different forms of the attack and test our implementation on a small cellular network. Finally, we simulate and analyze anonymous attacks on a model of current 911 infrastructure in order to measure the severity of their impact. We found that with less than 6K bots (or $100K hardware), attackers can block emergency services in an entire state (e.g., North Carolina) for days. We believe that this paper will assist the respective organizations, lawmakers, and security professionals in understanding the scope of this issue in order to prevent possible 911-DDoS attacks in the future.
△ Less
Submitted 8 September, 2016;
originally announced September 2016.