Search | arXiv e-print repository

Security Degradation in Iterative AI Code Generation -- A Systematic Analysis of the Paradox

Authors: Shivani Shukla, Himanshu Joshi, Romilla Syed

Abstract: The rapid adoption of Large Language Models(LLMs) for code generation has transformed software development, yet little attention has been given to how security vulnerabilities evolve through iterative LLM feedback. This paper analyzes security degradation in AI-generated code through a controlled experiment with 400 code samples across 40 rounds of "improvements" using four distinct prompting stra… ▽ More The rapid adoption of Large Language Models(LLMs) for code generation has transformed software development, yet little attention has been given to how security vulnerabilities evolve through iterative LLM feedback. This paper analyzes security degradation in AI-generated code through a controlled experiment with 400 code samples across 40 rounds of "improvements" using four distinct prompting strategies. Our findings show a 37.6% increase in critical vulnerabilities after just five iterations, with distinct vulnerability patterns emerging across different prompting approaches. This evidence challenges the assumption that iterative LLM refinement improves code security and highlights the essential role of human expertise in the loop. We propose practical guidelines for developers to mitigate these risks, emphasizing the need for robust human validation between LLM iterations to prevent the paradoxical introduction of new security issues during supposedly beneficial code "improvements". △ Less

Submitted 19 May, 2025; originally announced June 2025.

Comments: Keywords - Large Language Models, Security Vulnerabilities, AI-Generated Code, Iterative Feedback, Software Security, Secure Coding Practices, Feedback Loops, LLM Prompting Strategies

arXiv:2412.13988 [pdf]

RAG for Effective Supply Chain Security Questionnaire Automation

Authors: Zaynab Batool Reza, Abdul Rafay Syed, Omer Iqbal, Ethel Mensah, Qian Liu, Maxx Richard Rahman, Wolfgang Maass

Abstract: In an era where digital security is crucial, efficient processing of security-related inquiries through supply chain security questionnaires is imperative. This paper introduces a novel approach using Natural Language Processing (NLP) and Retrieval-Augmented Generation (RAG) to automate these responses. We developed QuestSecure, a system that interprets diverse document formats and generates preci… ▽ More In an era where digital security is crucial, efficient processing of security-related inquiries through supply chain security questionnaires is imperative. This paper introduces a novel approach using Natural Language Processing (NLP) and Retrieval-Augmented Generation (RAG) to automate these responses. We developed QuestSecure, a system that interprets diverse document formats and generates precise responses by integrating large language models (LLMs) with an advanced retrieval system. Our experiments show that QuestSecure significantly improves response accuracy and operational efficiency. By employing advanced NLP techniques and tailored retrieval mechanisms, the system consistently produces contextually relevant and semantically rich responses, reducing cognitive load on security teams and minimizing potential errors. This research offers promising avenues for automating complex security management tasks, enhancing organizational security processes. △ Less

Submitted 18 December, 2024; originally announced December 2024.

arXiv:2408.07832 [pdf, ps, other]

LADDER: Language-Driven Slice Discovery and Error Rectification in Vision Classifiers

Authors: Shantanu Ghosh, Rayan Syed, Chenyu Wang, Vaibhav Choudhary, Binxu Li, Clare B. Poynton, Shyam Visweswaran, Kayhan Batmanghelich

Abstract: Error slice discovery is crucial to diagnose and mitigate model errors. Current clustering or discrete attribute-based slice discovery methods face key limitations: 1) clustering results in incoherent slices, while assigning discrete attributes to slices leads to incomplete coverage of error patterns due to missing or insufficient attributes; 2) these methods lack complex reasoning, preventing the… ▽ More Error slice discovery is crucial to diagnose and mitigate model errors. Current clustering or discrete attribute-based slice discovery methods face key limitations: 1) clustering results in incoherent slices, while assigning discrete attributes to slices leads to incomplete coverage of error patterns due to missing or insufficient attributes; 2) these methods lack complex reasoning, preventing them from fully explaining model biases; 3) they fail to integrate \textit{domain knowledge}, limiting their usage in specialized fields \eg radiology. We propose\ladder (\underline{La}nguage-\underline{D}riven \underline{D}iscovery and \underline{E}rror \underline{R}ectification), to address the limitations by: (1) leveraging the flexibility of natural language to address incompleteness, (2) employing LLM's latent \textit{domain knowledge} and advanced reasoning to analyze sentences and derive testable hypotheses directly, identifying biased attributes, and form coherent error slices without clustering. Existing mitigation methods typically address only the worst-performing group, often amplifying errors in other subgroups. In contrast,\ladder generates pseudo attributes from the discovered hypotheses to mitigate errors across all biases without explicit attribute annotations or prior knowledge of bias. Rigorous evaluations on 6 datasets spanning natural and medical images -- comparing 200+ classifiers with diverse architectures, pretraining strategies, and LLMs -- show that\ladder consistently outperforms existing baselines in discovering and mitigating biases. △ Less

Submitted 29 May, 2025; v1 submitted 31 July, 2024; originally announced August 2024.

Comments: ACL 2025 (Findings). Code: https://github.com/batmanlab/Ladder

arXiv:2404.12465 [pdf, other]

Toward a Quantum Information System Cybersecurity Taxonomy and Testbed: Exploiting a Unique Opportunity for Early Impact

Authors: Benjamin Blakely, Joaquin Chung, Alec Poczatek, Ryan Syed, Raj Kettimuthu

Abstract: Any human-designed system can potentially be exploited in ways that its designers did not envision, and information systems or networks using quantum components do not escape this reality. We are presented with a unique but quickly waning opportunity to bring cybersecurity concerns to the forefront for quantum information systems before they become widely deployed. The resources and knowledge requ… ▽ More Any human-designed system can potentially be exploited in ways that its designers did not envision, and information systems or networks using quantum components do not escape this reality. We are presented with a unique but quickly waning opportunity to bring cybersecurity concerns to the forefront for quantum information systems before they become widely deployed. The resources and knowledge required to do so, however, may not be common in the cybersecurity community. Yet, a nexus exist. Cybersecurity starts with risk, and there are good taxonomies for security vulnerabilities and impacts in classical systems. In this paper, we propose a preliminary taxonomy for quantum cybersecurity vulnerabilities that accounts for the latest advances in quantum information systems, and must evolve to incorporate well-established cybersecurity principles and methodologies. We envision a testbed environment designed and instrumented with the specific purpose of enabling a broad collaborative community of cybersecurity and quantum information system experts to conduct experimental evaluation of software and hardware security including both physical and virtual quantum components. Furthermore, we envision that such a resource may be available as a user facility to the open science research community. △ Less

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2103.15163 [pdf, other]

Countering Racial Bias in Computer Graphics Research

Authors: Theodore Kim, Holly Rushmeier, Julie Dorsey, Derek Nowrouzezahrai, Raqi Syed, Wojciech Jarosz, A. M. Darke

Abstract: Current computer graphics research practices contain racial biases that have resulted in investigations into "skin" and "hair" that focus on the hegemonic visual features of Europeans and East Asians. To broaden our research horizons to encompass all of humanity, we propose a variety of improvements to quantitative measures and qualitative practices, and pose novel, open research problems. Current computer graphics research practices contain racial biases that have resulted in investigations into "skin" and "hair" that focus on the hegemonic visual features of Europeans and East Asians. To broaden our research horizons to encompass all of humanity, we propose a variety of improvements to quantitative measures and qualitative practices, and pose novel, open research problems. △ Less

Submitted 2 June, 2022; v1 submitted 28 March, 2021; originally announced March 2021.

Comments: 2 pages

arXiv:1807.05906 [pdf, other]

Human Perception of Surprise: A User Study

Authors: Nalin Chhibber, Rohail Syed, Mengqiu Teng, Joslin Goh, Kevyn Collins-Thompson, Edith Law

Abstract: Understanding how to engage users is a critical question in many applications. Previous research has shown that unexpected or astonishing events can attract user attention, leading to positive outcomes such as engagement and learning. In this work, we investigate the similarity and differences in how people and algorithms rank the surprisingness of facts. Our crowdsourcing study, involving 106 par… ▽ More Understanding how to engage users is a critical question in many applications. Previous research has shown that unexpected or astonishing events can attract user attention, leading to positive outcomes such as engagement and learning. In this work, we investigate the similarity and differences in how people and algorithms rank the surprisingness of facts. Our crowdsourcing study, involving 106 participants, shows that computational models of surprise can be used to artificially induce surprise in humans. △ Less

Submitted 16 July, 2018; originally announced July 2018.

Comments: 4 pages. Presented at Computational Surprise Workshop, SIGIR 2018 (Michigan)

Showing 1–6 of 6 results for author: Syed, R