Search | arXiv e-print repository

DroidTTP: Mapping Android Applications with TTP for Cyber Threat Intelligence

Authors: Dincy R Arikkat, Vinod P., Rafidha Rehiman K. A., Serena Nicolazzo, Marco Arazzi, Antonino Nocera, Mauro Conti

Abstract: The widespread adoption of Android devices for sensitive operations like banking and communication has made them prime targets for cyber threats, particularly Advanced Persistent Threats (APT) and sophisticated malware attacks. Traditional malware detection methods rely on binary classification, failing to provide insights into adversarial Tactics, Techniques, and Procedures (TTPs). Understanding… ▽ More The widespread adoption of Android devices for sensitive operations like banking and communication has made them prime targets for cyber threats, particularly Advanced Persistent Threats (APT) and sophisticated malware attacks. Traditional malware detection methods rely on binary classification, failing to provide insights into adversarial Tactics, Techniques, and Procedures (TTPs). Understanding malware behavior is crucial for enhancing cybersecurity defenses. To address this gap, we introduce DroidTTP, a framework mapping Android malware behaviors to TTPs based on the MITRE ATT&CK framework. Our curated dataset explicitly links MITRE TTPs to Android applications. We developed an automated solution leveraging the Problem Transformation Approach (PTA) and Large Language Models (LLMs) to map applications to both Tactics and Techniques. Additionally, we employed Retrieval-Augmented Generation (RAG) with prompt engineering and LLM fine-tuning for TTP predictions. Our structured pipeline includes dataset creation, hyperparameter tuning, data augmentation, feature selection, model development, and SHAP-based model interpretability. Among LLMs, Llama achieved the highest performance in Tactic classification with a Jaccard Similarity of 0.9583 and Hamming Loss of 0.0182, and in Technique classification with a Jaccard Similarity of 0.9348 and Hamming Loss of 0.0127. However, the Label Powerset XGBoost model outperformed LLMs, achieving a Jaccard Similarity of 0.9893 for Tactic classification and 0.9753 for Technique classification, with a Hamming Loss of 0.0054 and 0.0050, respectively. While XGBoost showed superior performance, the narrow margin highlights the potential of LLM-based approaches in TTP classification. △ Less

Submitted 20 March, 2025; originally announced March 2025.

arXiv:2411.05442 [pdf, other]

IntellBot: Retrieval Augmented LLM Chatbot for Cyber Threat Knowledge Delivery

Authors: Dincy R. Arikkat, Abhinav M., Navya Binu, Parvathi M., Navya Biju, K. S. Arunima, Vinod P., Rafidha Rehiman K. A., Mauro Conti

Abstract: In the rapidly evolving landscape of cyber security, intelligent chatbots are gaining prominence. Artificial Intelligence, Machine Learning, and Natural Language Processing empower these chatbots to handle user inquiries and deliver threat intelligence. This helps cyber security knowledge readily available to both professionals and the public. Traditional rule-based chatbots often lack flexibility… ▽ More In the rapidly evolving landscape of cyber security, intelligent chatbots are gaining prominence. Artificial Intelligence, Machine Learning, and Natural Language Processing empower these chatbots to handle user inquiries and deliver threat intelligence. This helps cyber security knowledge readily available to both professionals and the public. Traditional rule-based chatbots often lack flexibility and struggle to adapt to user interactions. In contrast, Large Language Model-based chatbots offer contextually relevant information across multiple domains and adapt to evolving conversational contexts. In this work, we develop IntellBot, an advanced cyber security Chatbot built on top of cutting-edge technologies like Large Language Models and Langchain alongside a Retrieval-Augmented Generation model to deliver superior capabilities. This chatbot gathers information from diverse data sources to create a comprehensive knowledge base covering known vulnerabilities, recent cyber attacks, and emerging threats. It delivers tailored responses, serving as a primary hub for cyber security insights. By providing instant access to relevant information and resources, this IntellBot enhances threat intelligence, incident response, and overall security posture, saving time and empowering users with knowledge of cyber security best practices. Moreover, we analyzed the performance of our copilot using a two-stage evaluation strategy. We achieved BERT score above 0.8 by indirect approach and a cosine similarity score ranging from 0.8 to 1, which affirms the accuracy of our copilot. Additionally, we utilized RAGAS to evaluate the RAG model, and all evaluation metrics consistently produced scores above 0.77, highlighting the efficacy of our system. △ Less

Submitted 8 November, 2024; originally announced November 2024.

arXiv:2406.14102 [pdf, other]

SeCTIS: A Framework to Secure CTI Sharing

Authors: Dincy R. Arikkat, Mert Cihangiroglu, Mauro Conti, Rafidha Rehiman K. A., Serena Nicolazzo, Antonino Nocera, Vinod P

Abstract: The rise of IT-dependent operations in modern organizations has heightened their vulnerability to cyberattacks. As a growing number of organizations include smart, interconnected devices in their systems to automate their processes, the attack surface becomes much bigger, and the complexity and frequency of attacks pose a significant threat. Consequently, organizations have been compelled to seek… ▽ More The rise of IT-dependent operations in modern organizations has heightened their vulnerability to cyberattacks. As a growing number of organizations include smart, interconnected devices in their systems to automate their processes, the attack surface becomes much bigger, and the complexity and frequency of attacks pose a significant threat. Consequently, organizations have been compelled to seek innovative approaches to mitigate the menaces inherent in their infrastructure. In response, considerable research efforts have been directed towards creating effective solutions for sharing Cyber Threat Intelligence (CTI). Current information-sharing methods lack privacy safeguards, leaving organizations vulnerable to leaks of both proprietary and confidential data. To tackle this problem, we designed a novel framework called SeCTIS (Secure Cyber Threat Intelligence Sharing), integrating Swarm Learning and Blockchain technologies to enable businesses to collaborate, preserving the privacy of their CTI data. Moreover, our approach provides a way to assess the data and model quality, and the trustworthiness of all the participants leveraging some validators through Zero Knowledge Proofs. An extensive experimental campaign demonstrates our framework's correctness and performance, and the detailed attack model discusses its robustness against attacks in the context of data and model quality. △ Less

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2311.08807 [pdf, other]

NLP-Based Techniques for Cyber Threat Intelligence

Authors: Marco Arazzi, Dincy R. Arikkat, Serena Nicolazzo, Antonino Nocera, Rafidha Rehiman K. A., Vinod P., Mauro Conti

Abstract: In the digital era, threat actors employ sophisticated techniques for which, often, digital traces in the form of textual data are available. Cyber Threat Intelligence~(CTI) is related to all the solutions inherent to data collection, processing, and analysis useful to understand a threat actor's targets and attack behavior. Currently, CTI is assuming an always more crucial role in identifying and… ▽ More In the digital era, threat actors employ sophisticated techniques for which, often, digital traces in the form of textual data are available. Cyber Threat Intelligence~(CTI) is related to all the solutions inherent to data collection, processing, and analysis useful to understand a threat actor's targets and attack behavior. Currently, CTI is assuming an always more crucial role in identifying and mitigating threats and enabling proactive defense strategies. In this context, NLP, an artificial intelligence branch, has emerged as a powerful tool for enhancing threat intelligence capabilities. This survey paper provides a comprehensive overview of NLP-based techniques applied in the context of threat intelligence. It begins by describing the foundational definitions and principles of CTI as a major tool for safeguarding digital assets. It then undertakes a thorough examination of NLP-based techniques for CTI data crawling from Web sources, CTI data analysis, Relation Extraction from cybersecurity data, CTI sharing and collaboration, and security threats of CTI. Finally, the challenges and limitations of NLP in threat intelligence are exhaustively examined, including data quality issues and ethical considerations. This survey draws a complete framework and serves as a valuable resource for security professionals and researchers seeking to understand the state-of-the-art NLP-based threat intelligence techniques and their potential impact on cybersecurity. △ Less

Submitted 15 November, 2023; originally announced November 2023.

arXiv:2306.16087 [pdf, other]

Can Twitter be used to Acquire Reliable Alerts against Novel Cyber Attacks?

Authors: Dincy R Arikkat, Vinod P., Rafidha Rehiman K. A., Andrea Di Sorbo, Corrado A. Visaggio, Mauro Conti

Abstract: Time-relevant and accurate threat information from public domains are essential for cyber security. In a constantly evolving threat landscape, such information assists security researchers in thwarting attack strategies. In this work, we collect and analyze threat-related information from Twitter to extract intelligence for proactive security. We first use a convolutional neural network to classif… ▽ More Time-relevant and accurate threat information from public domains are essential for cyber security. In a constantly evolving threat landscape, such information assists security researchers in thwarting attack strategies. In this work, we collect and analyze threat-related information from Twitter to extract intelligence for proactive security. We first use a convolutional neural network to classify the tweets as containing or not valuable threat indicators. In particular, to gather threat intelligence from social media, the proposed approach collects pertinent Indicators of Compromise (IoCs) from tweets, such as IP addresses, URLs, File hashes, domain addresses, and CVE IDs. Then, we analyze the IoCs to confirm whether they are reliable and valuable for threat intelligence using performance indicators, such as correctness, timeliness, and overlap. We also evaluate how fast Twitter shares IoCs compared to existing threat intelligence services. Furthermore, through machine learning models, we classify Twitter accounts as either automated or human-operated and delve into the role of bot accounts in disseminating cyber threat information on social media. Our results demonstrate that Twitter is growing into a powerful platform for gathering precise and pertinent malware IoCs and a reliable source for mining threat intelligence. △ Less

Submitted 28 June, 2023; originally announced June 2023.

Showing 1–5 of 5 results for author: Arikkat, D R