Detection of security smells in IaC scripts through semantics-aware code and language processing

War, Aicha; Rawass, Adnan A.; Kabore, Abdoul K.; Samhi, Jordan; Klein, Jacques; Bissyande, Tegawende F.

arXiv:2509.18790 (cs)

[Submitted on 23 Sep 2025]

Abstract: Infrastructure as Code (IaC) automates the provisioning and management of IT infrastructure through scripts and tools, streamlining software deployment. Prior studies have shown that IaC scripts often contain recurring security misconfigurations, and several detection and mitigation approaches have been proposed. Most of these rely on static analysis, using statistical code representations or Machine Learning (ML) classifiers to distinguish insecure configurations from safe code.
In this work, we introduce a novel approach that enhances static analysis with semantic understanding by jointly leveraging natural language and code representations. Our method builds on two complementary ML models: CodeBERT, to capture semantics across code and text, and LongFormer, to represent long IaC scripts without losing contextual information. We evaluate our approach on misconfiguration datasets from two widely used IaC tools, Ansible and Puppet. To validate its effectiveness, we conduct two ablation studies (removing code text from the natural language input and truncating scripts to reduce context) and compare against four large language models (LLMs) and prior work. Results show that semantic enrichment substantially improves detection: precision and recall change from 0.46 and 0.79 to 0.92 and 0.88 on Ansible, and from 0.55 and 0.97 to 0.87 and 0.75 on Puppet, respectively, so precision rises on both tools while recall rises on Ansible and drops on Puppet.
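
The abstract reports precision and recall separately, and on Puppet the two move in opposite directions (precision goes up, recall goes down). One way to compare the two settings with a single number is the F1 score, the harmonic mean of precision and recall. The sketch below derives F1 from the four reported pairs. The F1 values are computed here and are not figures stated in the abstract; the labels "before" and "after" and the helper names are assumptions introduced only for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the abstract.
# "before"/"after" are hypothetical labels for the two ends of the
# "from ... to ..." comparison; they are not the paper's terminology.
REPORTED = {
    "Ansible": {"before": (0.46, 0.79), "after": (0.92, 0.88)},
    "Puppet": {"before": (0.55, 0.97), "after": (0.87, 0.75)},
}


def f1_table(reported: dict) -> dict:
    """Return the F1 score, rounded to two decimals, for every tool and stage."""
    return {
        tool: {stage: round(f1_score(p, r), 2) for stage, (p, r) in stages.items()}
        for tool, stages in reported.items()
    }


if __name__ == "__main__":
    for tool, stages in f1_table(REPORTED).items():
        print(f"{tool}: F1 {stages['before']:.2f} -> {stages['after']:.2f}")
```

Under this derived metric, the balance between precision and recall improves on both tools, even though recall alone is lower on Puppet in the "after" setting.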

Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Software Engineering (cs.SE)
Cite as:	arXiv:2509.18790 [cs.CR]
	(or arXiv:2509.18790v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2509.18790

Submission history

From: Aïcha War
[v1] Tue, 23 Sep 2025 08:28:49 UTC (219 KB)
