-
Effectiveness of Adversarial Benign and Malware Examples in Evasion and Poisoning Attacks
Authors:
Matouš Kozák,
Martin Jureček
Abstract:
Adversarial attacks present significant challenges for malware detection systems. This research investigates the effectiveness of benign and malicious adversarial examples (AEs) in evasion and poisoning attacks on the Portable Executable file domain. A novel focus of this study is on benign AEs, which, although not directly harmful, can increase false positives and undermine trust in antivirus sol…
▽ More
Adversarial attacks present significant challenges for malware detection systems. This research investigates the effectiveness of benign and malicious adversarial examples (AEs) in evasion and poisoning attacks on the Portable Executable file domain. A novel focus of this study is on benign AEs, which, although not directly harmful, can increase false positives and undermine trust in antivirus solutions. We propose modifying existing adversarial malware generators to produce benign AEs and show they are as successful as malware AEs in evasion attacks. Furthermore, our data show that benign AEs have a more decisive influence in poisoning attacks than standard malware AEs, demonstrating their superior ability to decrease the model's performance. Our findings introduce new opportunities for adversaries and further increase the attack surface that needs to be protected by security researchers.
△ Less
Submitted 19 January, 2025;
originally announced January 2025.
-
Updating Windows Malware Detectors: Balancing Robustness and Regression against Adversarial EXEmples
Authors:
Matous Kozak,
Luca Demetrio,
Dmitrijs Trizna,
Fabio Roli
Abstract:
Adversarial EXEmples are carefully-perturbed programs tailored to evade machine learning Windows malware detectors, with an ongoing effort to develop robust models able to address detection effectiveness. However, even if robust models can prevent the majority of EXEmples, to maintain predictive power over time, models are fine-tuned to newer threats, leading either to partial updates or time-cons…
▽ More
Adversarial EXEmples are carefully-perturbed programs tailored to evade machine learning Windows malware detectors, with an ongoing effort to develop robust models able to address detection effectiveness. However, even if robust models can prevent the majority of EXEmples, to maintain predictive power over time, models are fine-tuned to newer threats, leading either to partial updates or time-consuming retraining from scratch. Thus, even if the robustness against adversarial EXEmples is higher, the new models might suffer a regression in performance by misclassifying threats that were previously correctly detected. For these reasons, we study the trade-off between accuracy and regression when updating Windows malware detectors by proposing EXE-scanner, a plugin that can be chained to existing detectors to promptly stop EXEmples without causing regression. We empirically show that previously proposed hardening techniques suffer a regression of accuracy when updating non-robust models, exacerbating the gap when considering low false positives regimes and temporal drifts affecting data. Also, through EXE-scanner we gain evidence on the detectability of adversarial EXEmples, showcasing the presence of artifacts left inside while creating them. Due to its design, EXE-scanner can be chained to any classifier to obtain the best performance without the need for costly retraining. To foster reproducibility, we openly release the source code, along with the dataset of adversarial EXEmples based on state-of-the-art perturbation algorithms.
△ Less
Submitted 4 May, 2025; v1 submitted 4 May, 2024;
originally announced May 2024.
-
A Comparison of Adversarial Learning Techniques for Malware Detection
Authors:
Pavla Louthánová,
Matouš Kozák,
Martin Jureček,
Mark Stamp
Abstract:
Machine learning has proven to be a useful tool for automated malware detection, but machine learning models have also been shown to be vulnerable to adversarial attacks. This article addresses the problem of generating adversarial malware samples, specifically malicious Windows Portable Executable files. We summarize and compare work that has focused on adversarial machine learning for malware de…
▽ More
Machine learning has proven to be a useful tool for automated malware detection, but machine learning models have also been shown to be vulnerable to adversarial attacks. This article addresses the problem of generating adversarial malware samples, specifically malicious Windows Portable Executable files. We summarize and compare work that has focused on adversarial machine learning for malware detection. We use gradient-based, evolutionary algorithm-based, and reinforcement-based methods to generate adversarial samples, and then test the generated samples against selected antivirus products. We compare the selected methods in terms of accuracy and practical applicability. The results show that applying optimized modifications to previously detected malware can lead to incorrect classification of the file as benign. It is also known that generated malware samples can be successfully used against detection models other than those used to generate them and that using combinations of generators can create new samples that evade detection. Experiments show that the Gym-malware generator, which uses a reinforcement learning approach, has the greatest practical potential. This generator achieved an average sample generation time of 5.73 seconds and the highest average evasion rate of 44.11%. Using the Gym-malware generator in combination with itself improved the evasion rate to 58.35%.
△ Less
Submitted 19 August, 2023;
originally announced August 2023.
-
Creating Valid Adversarial Examples of Malware
Authors:
Matouš Kozák,
Martin Jureček,
Mark Stamp,
Fabio Di Troia
Abstract:
Machine learning is becoming increasingly popular as a go-to approach for many tasks due to its world-class results. As a result, antivirus developers are incorporating machine learning models into their products. While these models improve malware detection capabilities, they also carry the disadvantage of being susceptible to adversarial attacks. Although this vulnerability has been demonstrated…
▽ More
Machine learning is becoming increasingly popular as a go-to approach for many tasks due to its world-class results. As a result, antivirus developers are incorporating machine learning models into their products. While these models improve malware detection capabilities, they also carry the disadvantage of being susceptible to adversarial attacks. Although this vulnerability has been demonstrated for many models in white-box settings, a black-box attack is more applicable in practice for the domain of malware detection. We present a generator of adversarial malware examples using reinforcement learning algorithms. The reinforcement learning agents utilize a set of functionality-preserving modifications, thus creating valid adversarial examples. Using the proximal policy optimization (PPO) algorithm, we achieved an evasion rate of 53.84% against the gradient-boosted decision tree (GBDT) model. The PPO agent previously trained against the GBDT classifier scored an evasion rate of 11.41% against the neural network-based classifier MalConv and an average evasion rate of 2.31% against top antivirus programs. Furthermore, we discovered that random application of our functionality-preserving portable executable modifications successfully evades leading antivirus engines, with an average evasion rate of 11.65%. These findings indicate that machine learning-based models used in malware detection systems are vulnerable to adversarial attacks and that better safeguards need to be taken to protect these systems.
△ Less
Submitted 23 June, 2023;
originally announced June 2023.
-
Combining Generators of Adversarial Malware Examples to Increase Evasion Rate
Authors:
Matouš Kozák,
Martin Jureček
Abstract:
Antivirus developers are increasingly embracing machine learning as a key component of malware defense. While machine learning achieves cutting-edge outcomes in many fields, it also has weaknesses that are exploited by several adversarial attack techniques. Many authors have presented both white-box and black-box generators of adversarial malware examples capable of bypassing malware detectors wit…
▽ More
Antivirus developers are increasingly embracing machine learning as a key component of malware defense. While machine learning achieves cutting-edge outcomes in many fields, it also has weaknesses that are exploited by several adversarial attack techniques. Many authors have presented both white-box and black-box generators of adversarial malware examples capable of bypassing malware detectors with varying success. We propose to combine contemporary generators in order to increase their potential. Combining different generators can create more sophisticated adversarial examples that are more likely to evade anti-malware tools. We demonstrated this technique on five well-known generators and recorded promising results. The best-performing combination of AMG-random and MAB-Malware generators achieved an average evasion rate of 15.9% against top-tier antivirus products. This represents an average improvement of more than 36% and 627% over using only the AMG-random and MAB-Malware generators, respectively. The generator that benefited the most from having another generator follow its procedure was the FGSM injection attack, which improved the evasion rate on average between 91.97% and 1,304.73%, depending on the second generator used. These results demonstrate that combining different generators can significantly improve their effectiveness against leading antivirus programs.
△ Less
Submitted 14 April, 2023;
originally announced April 2023.
-
Retracted papers by Iranian authors: Causes, journals, time lags, affiliations, collaborations
Authors:
Ali Ghorbi,
Mohsen Fazeli-Varzaneh,
Erfan Ghaderi-Azad,
Marcel Ausloos,
Marcin Kozak
Abstract:
This study aims to analyze 343 retraction notices indexed in the Scopus database, published in 2001-2019, related to scientific articles (co-)written by at least one author affiliated with an Iranian institution. In order to determine reasons for retractions, we merged this database with the database from Retraction Watch. The data were analyzed using Excel 2016 and IBM-SPSS version 24.0, and visu…
▽ More
This study aims to analyze 343 retraction notices indexed in the Scopus database, published in 2001-2019, related to scientific articles (co-)written by at least one author affiliated with an Iranian institution. In order to determine reasons for retractions, we merged this database with the database from Retraction Watch. The data were analyzed using Excel 2016 and IBM-SPSS version 24.0, and visualized using VOSviewer software. Most of the retractions were due to fake peer review (95 retractions) and plagiarism (90). The average time between a publication and its retraction was 591 days. The maximum time-lag (about 3,000 days) occurred for papers retracted due to duplicate publications; the minimum time-lag (fewer than 100 days) was for papers retracted due to ''unspecified cause'' (most of these were conference papers). As many as 48 (14%) of the retracted papers were published in two medical journals: Tumor Biology (25 papers) and Diagnostic Pathology (23 papers). From the institutional point of view, Islamic Azad University was the inglorious leader, contributing to over one-half (53.1%) of retracted papers. Among the 343 retraction notices, 64 papers pertained to international collaborations with researchers from mainly Asian and European countries; Malaysia having the most retractions (22 papers). Since most retractions were due to fake peer review and plagiarism, the peer review system appears to be a weak point of the submission/publication process; if improved, the number of retractions would likely drop because of increased editorial control.
△ Less
Submitted 29 August, 2021;
originally announced August 2021.
-
How have the Eastern European countries of the former Warsaw Pact developed since 1990? A bibliometric study
Authors:
Marcin Kozak,
Lutz Bornmann,
Loet Leydesdorff
Abstract:
Did the demise of the Soviet Union in 1991 influence the scientific performance of the researchers in Eastern European countries? Did this historical event affect international collaboration by researchers from the Eastern European countries with those of Western countries? Did it also change international collaboration among researchers from the Eastern European countries? Trying to answer these…
▽ More
Did the demise of the Soviet Union in 1991 influence the scientific performance of the researchers in Eastern European countries? Did this historical event affect international collaboration by researchers from the Eastern European countries with those of Western countries? Did it also change international collaboration among researchers from the Eastern European countries? Trying to answer these questions, this study aims to shed light on international collaboration by researchers from the Eastern European countries (Russia, Ukraine, Belarus, Moldova, Bulgaria, the Czech Republic, Hungary, Poland, Romania and Slovakia). The number of publications and normalized citation impact values are compared for these countries based on InCites (Thomson Reuters), from 1981 up to 2011. The international collaboration by researchers affiliated to institutions in Eastern European countries at the time points of 1990, 2000 and 2011 was studied with the help of Pajek and VOSviewer software, based on data from the Science Citation Index (Thomson Reuters). Our results show that the breakdown of the communist regime did not lead, on average, to a huge improvement in the publication performance of the Eastern European countries and that the increase in international co-authorship relations by the researchers affiliated to institutions in these countries was smaller than expected. Most of the Eastern European countries are still subject to changes and are still awaiting their boost in scientific development.
△ Less
Submitted 11 December, 2013;
originally announced December 2013.