-
Static and Repeated Cooperative Games for the Optimization of the AoI in IoT Networks
Authors:
David Emanuele Corrado Raphael Catania,
Alessandro Buratto,
Giovanni Perin
Abstract:
Wireless sensing and the internet of things (IoT) are nowadays pervasive in 5G and beyond networks, and they are expected to play a crucial role in 6G. However, a centralized optimization of a distributed system is not always possible and cost-efficient. In this paper, we analyze a setting in which two sensors collaboratively update a common server seeking to minimize the age of information (AoI) of the latest sample of a common physical process. We consider a distributed and uncoordinated setting where each sensor lacks information about whether the other decides to update the server. This strategic setting is modeled through game theory (GT) and two games are defined: i) a static game of complete information with an incentive mechanism for cooperation, and ii) a repeated game over a finite horizon where the static game is played at each stage. We perform a mathematical analysis of the static game finding three Nash Equilibria (NEs) in pure strategies and one in mixed strategies. A numerical simulation of the repeated game is also presented and novel and valuable insight into the setting is given thanks to the definition of a new metric, the price of delayed updates (PoDU), which shows that the decentralized solution provides results close to the centralized optimum.
Submitted 12 May, 2025; v1 submitted 27 March, 2025;
originally announced March 2025.
-
LLMs for Domain Generation Algorithm Detection
Authors:
Reynier Leyva La O,
Carlos A. Catania,
Tatiana Parlanti
Abstract:
This work analyzes the use of large language models (LLMs) for detecting domain generation algorithms (DGAs). We perform a detailed evaluation of two important techniques: In-Context Learning (ICL) and Supervised Fine-Tuning (SFT), showing how they can improve detection. SFT increases performance by using domain-specific data, whereas ICL helps the detection model to quickly adapt to new threats without requiring much retraining. We use Meta's Llama3 8B model, on a custom dataset with 68 malware families and normal domains, covering several hard-to-detect schemes, including recent word-based DGAs. Results proved that LLM-based methods can achieve competitive results in DGA detection. In particular, the SFT-based LLM DGA detector outperforms state-of-the-art models using attention layers, achieving 94% accuracy with a 4% false positive rate (FPR) and excelling at detecting word-based DGA domains.
Submitted 5 November, 2024;
originally announced November 2024.
-
Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments
Authors:
Maria Rigaki,
Carlos Catania,
Sebastian Garcia
Abstract:
Large Language Models (LLMs) have shown remarkable potential across various domains, including cybersecurity. Using commercial cloud-based LLMs may be undesirable due to privacy concerns, costs, and network connectivity constraints. In this paper, we present Hackphyr, a locally fine-tuned LLM to be used as a red-team agent within network security environments. Our fine-tuned 7 billion parameter model can run on a single GPU card and achieves performance comparable with much larger and more powerful commercial models such as GPT-4. Hackphyr clearly outperforms other models, including GPT-3.5-turbo, and baselines, such as Q-learning agents, in complex, previously unseen scenarios. To achieve this performance, we generated a new task-specific cybersecurity dataset to enhance the base model's capabilities. Finally, we conducted a comprehensive analysis of the agents' behaviors that provides insights into the planning abilities and potential shortcomings of such agents, contributing to the broader understanding of LLM-based agents in cybersecurity contexts.
Submitted 17 September, 2024;
originally announced September 2024.
-
Towards Better Understanding of Cybercrime: The Role of Fine-Tuned LLMs in Translation
Authors:
Veronica Valeros,
Anna Širokova,
Carlos Catania,
Sebastian Garcia
Abstract:
Understanding cybercrime communications is paramount for cybersecurity defence. This often involves translating communications into English for processing, interpreting, and generating timely intelligence. The problem is that translation is hard. Human translation is slow, expensive, and scarce. Machine translation is inaccurate and biased. We propose using fine-tuned Large Language Models (LLMs) to generate translations that can accurately capture the nuances of cybercrime language. We apply our technique to public chats from the NoName057(16) Russian-speaking hacktivist group. Our results show that our fine-tuned LLM is better, faster, more accurate, and able to capture nuances of the language. Our method shows it is possible to achieve high-fidelity translations and significantly reduce costs by a factor ranging from 430 to 23,000 compared to a human translator.
Submitted 2 April, 2024;
originally announced April 2024.
-
LLM in the Shell: Generative Honeypots
Authors:
Muris Sladić,
Veronica Valeros,
Carlos Catania,
Sebastian Garcia
Abstract:
Honeypots are essential tools in cybersecurity for early detection, threat intelligence gathering, and analysis of attackers' behavior. However, most of them lack the realism required to engage and fool human attackers long-term. Honeypots that are easy to identify as such are far less effective. This can happen because they are too deterministic, lack adaptability, or lack depth. This work introduces shelLM, a dynamic and realistic software honeypot based on Large Language Models that generates Linux-like shell output. We designed and implemented shelLM using cloud-based LLMs. We evaluated whether shelLM can generate the output expected from a real Linux shell. The evaluation was done by asking cybersecurity researchers to use the honeypot and report whether each answer from the honeypot was the one expected from a Linux shell. Results indicate that shelLM can create credible and dynamic answers capable of addressing the limitations of current honeypots. ShelLM reached a TNR of 0.90, convincing humans it was consistent with a real Linux shell. The source code and prompts for replicating the experiments have been made publicly available.
Submitted 23 September, 2024; v1 submitted 31 August, 2023;
originally announced September 2023.
-
Out of the Cage: How Stochastic Parrots Win in Cyber Security Environments
Authors:
Maria Rigaki,
Ondřej Lukáš,
Carlos A. Catania,
Sebastian Garcia
Abstract:
Large Language Models (LLMs) have gained widespread popularity across diverse domains involving text generation, summarization, and various natural language processing tasks. Despite their inherent limitations, LLM-based designs have shown promising capabilities in planning and navigating open-world scenarios. This paper introduces a novel application of pre-trained LLMs as agents within cybersecurity network environments, focusing on their utility for sequential decision-making processes.
We present an approach wherein pre-trained LLMs are leveraged as attacking agents in two reinforcement learning environments. Our proposed agents demonstrate similar or better performance against state-of-the-art agents trained for thousands of episodes in most scenarios and configurations. In addition, the best LLM agents perform similarly to human testers of the environment without any additional training process. This design highlights the potential of LLMs to efficiently address complex decision-making tasks within cybersecurity.
Furthermore, we introduce a new network security environment named NetSecGame. The environment is designed to eventually support complex multi-agent scenarios within the network security domain. The proposed environment mimics real network attacks and is designed to be highly modular and adaptable for various scenarios.
Submitted 28 August, 2023; v1 submitted 23 August, 2023;
originally announced August 2023.
-
Beyond Random Split for Assessing Statistical Model Performance
Authors:
Carlos Catania,
Jorge Guerra,
Juan Manuel Romero,
Gabriel Caffaratti,
Martin Marchetta
Abstract:
Even though a random train/test split of the dataset is common practice, it may not always be the best approach for estimating generalization performance in some scenarios. The fact is that the usual machine learning methodology can sometimes overestimate the generalization error when a dataset is not representative or when rare and elusive examples are a fundamental aspect of the detection problem. In the present work, we analyze strategies based on the predictors' variability for splitting the data into training and testing sets. Such strategies aim to guarantee the inclusion of rare or unusual examples with a minimal loss of the population's representativeness and to provide a more accurate estimate of the generalization error when the dataset is not representative. Two baseline classifiers based on decision trees were used for testing the four splitting strategies considered. Both classifiers were applied to CTU19, a low-representative dataset for a network security detection problem. Preliminary results showed the importance of applying the three alternative strategies to the Monte Carlo splitting strategy in order to get a more accurate error estimate in different but feasible scenarios.
Submitted 4 September, 2022;
originally announced September 2022.
-
Datasets are not Enough: Challenges in Labeling Network Traffic
Authors:
Jorge Guerra,
Carlos Catania,
Eduardo Veas
Abstract:
In contrast to previous surveys, the present work is not focused on reviewing the datasets used in the network security field. The fact is that many of the available public labeled datasets represent the network behavior just for a particular time period. Given the rate of change in malicious behavior and the serious challenge of labeling and maintaining these datasets, they quickly become obsolete. Therefore, this work is focused on the analysis of current labeling methodologies applied to network-based data. In the field of network security, the process of labeling a representative network traffic dataset is particularly challenging and costly since very specialized knowledge is required to classify network traces. Consequently, most of the current traffic labeling methods are based on the automatic generation of synthetic network traces, which hides many of the essential aspects necessary for a correct differentiation between normal and malicious behavior. Alternatively, a few other methods incorporate non-expert users in the labeling process of real traffic with the help of visual and statistical tools. However, after conducting an in-depth analysis, it seems that all current methods for labeling suffer from fundamental drawbacks regarding the quality, volume, and speed of the resulting dataset. This lack of consistent methods for continuously generating a representative dataset with an accurate and validated methodology must be addressed by the network security research community. Moreover, a consistent labeling methodology is a fundamental condition for helping in the acceptance of novel detection approaches based on statistical and machine learning techniques.
Submitted 30 December, 2021; v1 submitted 12 October, 2021;
originally announced October 2021.
-
DNS Tunneling: A Deep Learning based Lexicographical Detection Approach
Authors:
Franco Palau,
Carlos Catania,
Jorge Guerra,
Sebastian Garcia,
Maria Rigaki
Abstract:
The Domain Name System (DNS) is a trusted protocol made for name resolution, but in recent years some approaches have been developed to use it for data transfer. DNS Tunneling is a method where data is encoded inside DNS queries, allowing information exchange through the DNS. This characteristic is attractive to hackers, who exploit the DNS Tunneling method to establish bidirectional communication with machines infected with malware, with the objective of exfiltrating data or sending instructions in an obfuscated way. To detect these threats quickly and accurately, the present work proposes a detection approach based on a Convolutional Neural Network (CNN) with minimal architectural complexity. Due to the lack of quality datasets for evaluating DNS Tunneling connections, we also present a detailed construction and description of a novel dataset that contains DNS Tunneling domains generated with five well-known DNS tools. Despite its simple architecture, the resulting CNN model correctly detected more than 92% of total Tunneling domains with a false positive rate close to 0.8%.
Submitted 14 June, 2020; v1 submitted 10 June, 2020;
originally announced June 2020.