Computer Science > Software Engineering
[Submitted on 1 Aug 2019 (v1), last revised 5 Aug 2019 (this version, v2)]
Title:Learning to Identify Security-Related Issues Using Convolutional Neural Networks
View PDFAbstract:Software security is becoming a high priority for both large companies and start-ups alike due to the increasing potential for harm that vulnerabilities and breaches carry with them. However, attaining robust security assurance while delivering features requires a precarious balancing act in the context of agile development practices. One path forward to help aid development teams in securing their software products is through the design and development of security-focused automation. Ergo, we present a novel approach, called SecureReqNet, for automatically identifying whether issues in software issue tracking systems describe security-related content. Our approach consists of a two-phase neural net architecture that operates purely on the natural language descriptions of issues. The first phase of our approach learns high dimensional word embeddings from hundreds of thousands of vulnerability descriptions listed in the CVE database and issue descriptions extracted from open source projects. The second phase then utilizes the semantic ontology represented by these embeddings to train a convolutional neural network capable of predicting whether a given issue is security-related. We evaluated SecureReqNet by applying it to identify security-related issues from a dataset of thousands of issues mined from popular projects on GitLab and GitHub. In addition, we also applied our approach to identify security-related requirements from a commercial software project developed by a major telecommunication company. Our preliminary results are encouraging, with SecureReqNet achieving an accuracy of 96% on open source issues and 71.6% on industrial requirements.
Submission history
From: David N. Palacio [view email][v1] Thu, 1 Aug 2019 20:33:56 UTC (294 KB)
[v2] Mon, 5 Aug 2019 13:26:14 UTC (294 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Connected Papers (What is Connected Papers?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.