Towards Better Understanding of Cybercrime: The Role of Fine-Tuned LLMs in Translation

Valeros, Veronica; Širokova, Anna; Catania, Carlos; Garcia, Sebastian

arXiv:2404.01940 (cs)

[Submitted on 2 Apr 2024]

Abstract: Understanding cybercrime communications is paramount for cybersecurity defence. This often involves translating communications into English for processing, interpreting, and generating timely intelligence. The problem is that translation is hard: human translation is slow, expensive, and scarce, while machine translation is inaccurate and biased. We propose using fine-tuned Large Language Models (LLMs) to generate translations that accurately capture the nuances of cybercrime language. We apply our technique to public chats from the NoName057(16) Russian-speaking hacktivist group. Our results show that our fine-tuned LLM is better, faster, more accurate, and able to capture nuances of the language. Our method shows that it is possible to achieve high-fidelity translations and significantly reduce costs, by a factor ranging from 430 to 23,000 compared to a human translator.

Comments:	9 pages, 4 figures
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2404.01940 [cs.CL]
	(or arXiv:2404.01940v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2404.01940

Submission history

From: Veronica Valeros
[v1] Tue, 2 Apr 2024 13:33:23 UTC (297 KB)
