Search | arXiv e-print repository

doi 10.1007/978-3-031-83157-7_10

Effectiveness of Adversarial Benign and Malware Examples in Evasion and Poisoning Attacks

Abstract: Adversarial attacks present significant challenges for malware detection systems. This research investigates the effectiveness of benign and malicious adversarial examples (AEs) in evasion and poisoning attacks on the Portable Executable file domain. A novel focus of this study is on benign AEs, which, although not directly harmful, can increase false positives and undermine trust in antivirus solutions. We propose modifying existing adversarial malware generators to produce benign AEs and show they are as successful as malware AEs in evasion attacks. Furthermore, our data show that benign AEs have a more decisive influence in poisoning attacks than standard malware AEs, demonstrating their superior ability to decrease the model's performance. Our findings introduce new opportunities for adversaries and further increase the attack surface that needs to be protected by security researchers.

Submitted 19 January, 2025; originally announced January 2025.

Comments: 24 pages, 6 figures, 4 tables

arXiv:2405.02646

doi 10.1016/j.cose.2025.104466

Updating Windows Malware Detectors: Balancing Robustness and Regression against Adversarial EXEmples

Authors: Matous Kozak, Luca Demetrio, Dmitrijs Trizna, Fabio Roli

Abstract: Adversarial EXEmples are carefully-perturbed programs tailored to evade machine learning Windows malware detectors, with an ongoing effort to develop robust models able to address detection effectiveness. However, even if robust models can prevent the majority of EXEmples, to maintain predictive power over time, models are fine-tuned to newer threats, leading either to partial updates or time-consuming retraining from scratch. Thus, even if the robustness against adversarial EXEmples is higher, the new models might suffer a regression in performance by misclassifying threats that were previously correctly detected. For these reasons, we study the trade-off between accuracy and regression when updating Windows malware detectors by proposing EXE-scanner, a plugin that can be chained to existing detectors to promptly stop EXEmples without causing regression. We empirically show that previously proposed hardening techniques suffer a regression of accuracy when updating non-robust models, exacerbating the gap when considering low false positives regimes and temporal drifts affecting data. Also, through EXE-scanner we gain evidence on the detectability of adversarial EXEmples, showcasing the presence of artifacts left inside while creating them. Due to its design, EXE-scanner can be chained to any classifier to obtain the best performance without the need for costly retraining. To foster reproducibility, we openly release the source code, along with the dataset of adversarial EXEmples based on state-of-the-art perturbation algorithms.

Submitted 4 May, 2025; v1 submitted 4 May, 2024; originally announced May 2024.

Comments: 17 pages, 4 figures, 13 tables

Journal ref: Computers & Security. 155 (2025) 104466

arXiv:2308.09958

doi 10.1007/s11416-024-00519-z

A Comparison of Adversarial Learning Techniques for Malware Detection

Authors: Pavla Louthánová, Matouš Kozák, Martin Jureček, Mark Stamp

Abstract: Machine learning has proven to be a useful tool for automated malware detection, but machine learning models have also been shown to be vulnerable to adversarial attacks. This article addresses the problem of generating adversarial malware samples, specifically malicious Windows Portable Executable files. We summarize and compare work that has focused on adversarial machine learning for malware detection. We use gradient-based, evolutionary algorithm-based, and reinforcement-based methods to generate adversarial samples, and then test the generated samples against selected antivirus products. We compare the selected methods in terms of accuracy and practical applicability. The results show that applying optimized modifications to previously detected malware can lead to incorrect classification of the file as benign. It is also known that generated malware samples can be successfully used against detection models other than those used to generate them and that using combinations of generators can create new samples that evade detection. Experiments show that the Gym-malware generator, which uses a reinforcement learning approach, has the greatest practical potential. This generator achieved an average sample generation time of 5.73 seconds and the highest average evasion rate of 44.11%. Using the Gym-malware generator in combination with itself improved the evasion rate to 58.35%.

Submitted 19 August, 2023; originally announced August 2023.

arXiv:2306.13587

doi 10.1007/s11416-024-00516-2

Creating Valid Adversarial Examples of Malware

Authors: Matouš Kozák, Martin Jureček, Mark Stamp, Fabio Di Troia

Abstract: Machine learning is becoming increasingly popular as a go-to approach for many tasks due to its world-class results. As a result, antivirus developers are incorporating machine learning models into their products. While these models improve malware detection capabilities, they also carry the disadvantage of being susceptible to adversarial attacks. Although this vulnerability has been demonstrated for many models in white-box settings, a black-box attack is more applicable in practice for the domain of malware detection. We present a generator of adversarial malware examples using reinforcement learning algorithms. The reinforcement learning agents utilize a set of functionality-preserving modifications, thus creating valid adversarial examples. Using the proximal policy optimization (PPO) algorithm, we achieved an evasion rate of 53.84% against the gradient-boosted decision tree (GBDT) model. The PPO agent previously trained against the GBDT classifier scored an evasion rate of 11.41% against the neural network-based classifier MalConv and an average evasion rate of 2.31% against top antivirus programs. Furthermore, we discovered that random application of our functionality-preserving portable executable modifications successfully evades leading antivirus engines, with an average evasion rate of 11.65%. These findings indicate that machine learning-based models used in malware detection systems are vulnerable to adversarial attacks and that better safeguards need to be taken to protect these systems.

Submitted 23 June, 2023; originally announced June 2023.

Comments: 19 pages, 4 figures

arXiv:2304.07360

doi 10.5220/0012127700003555

Combining Generators of Adversarial Malware Examples to Increase Evasion Rate

Authors: Matouš Kozák, Martin Jureček

Abstract: Antivirus developers are increasingly embracing machine learning as a key component of malware defense. While machine learning achieves cutting-edge outcomes in many fields, it also has weaknesses that are exploited by several adversarial attack techniques. Many authors have presented both white-box and black-box generators of adversarial malware examples capable of bypassing malware detectors with varying success. We propose to combine contemporary generators in order to increase their potential. Combining different generators can create more sophisticated adversarial examples that are more likely to evade anti-malware tools. We demonstrated this technique on five well-known generators and recorded promising results. The best-performing combination of AMG-random and MAB-Malware generators achieved an average evasion rate of 15.9% against top-tier antivirus products. This represents an average improvement of more than 36% and 627% over using only the AMG-random and MAB-Malware generators, respectively. The generator that benefited the most from having another generator follow its procedure was the FGSM injection attack, which improved the evasion rate on average between 91.97% and 1,304.73%, depending on the second generator used. These results demonstrate that combining different generators can significantly improve their effectiveness against leading antivirus programs.

Submitted 14 April, 2023; originally announced April 2023.

Comments: 9 pages, 5 figures, 2 tables. Under review

arXiv:2108.12837

doi 10.1007/s11192-021-04104-9

Retracted papers by Iranian authors: Causes, journals, time lags, affiliations, collaborations

Authors: Ali Ghorbi, Mohsen Fazeli-Varzaneh, Erfan Ghaderi-Azad, Marcel Ausloos, Marcin Kozak

Abstract: This study aims to analyze 343 retraction notices indexed in the Scopus database, published in 2001-2019, related to scientific articles (co-)written by at least one author affiliated with an Iranian institution. In order to determine reasons for retractions, we merged this database with the database from Retraction Watch. The data were analyzed using Excel 2016 and IBM-SPSS version 24.0, and visualized using VOSviewer software. Most of the retractions were due to fake peer review (95 retractions) and plagiarism (90). The average time between a publication and its retraction was 591 days. The maximum time-lag (about 3,000 days) occurred for papers retracted due to duplicate publications; the minimum time-lag (fewer than 100 days) was for papers retracted due to "unspecified cause" (most of these were conference papers). As many as 48 (14%) of the retracted papers were published in two medical journals: Tumor Biology (25 papers) and Diagnostic Pathology (23 papers). From the institutional point of view, Islamic Azad University was the inglorious leader, contributing to over one-half (53.1%) of retracted papers. Among the 343 retraction notices, 64 papers pertained to international collaborations with researchers from mainly Asian and European countries; Malaysia having the most retractions (22 papers). Since most retractions were due to fake peer review and plagiarism, the peer review system appears to be a weak point of the submission/publication process; if improved, the number of retractions would likely drop because of increased editorial control.

Submitted 29 August, 2021; originally announced August 2021.

Comments: 29 pages, 7 figures, 5 tables, 41 references

Journal ref: Scientometrics 126 (2021) 7351-7371

arXiv:1312.3077

How have the Eastern European countries of the former Warsaw Pact developed since 1990? A bibliometric study

Authors: Marcin Kozak, Lutz Bornmann, Loet Leydesdorff

Abstract: Did the demise of the Soviet Union in 1991 influence the scientific performance of the researchers in Eastern European countries? Did this historical event affect international collaboration by researchers from the Eastern European countries with those of Western countries? Did it also change international collaboration among researchers from the Eastern European countries? Trying to answer these questions, this study aims to shed light on international collaboration by researchers from the Eastern European countries (Russia, Ukraine, Belarus, Moldova, Bulgaria, the Czech Republic, Hungary, Poland, Romania and Slovakia). The number of publications and normalized citation impact values are compared for these countries based on InCites (Thomson Reuters), from 1981 up to 2011. The international collaboration by researchers affiliated to institutions in Eastern European countries at the time points of 1990, 2000 and 2011 was studied with the help of Pajek and VOSviewer software, based on data from the Science Citation Index (Thomson Reuters). Our results show that the breakdown of the communist regime did not lead, on average, to a huge improvement in the publication performance of the Eastern European countries and that the increase in international co-authorship relations by the researchers affiliated to institutions in these countries was smaller than expected. Most of the Eastern European countries are still subject to changes and are still awaiting their boost in scientific development.

Submitted 11 December, 2013; originally announced December 2013.

Showing 1–7 of 7 results for author: Kozak, M