-
Chaos Engineering in the Wild: Findings from GitHub
Authors:
Joshua Owotogbe,
Indika Kumara,
Dario Di Nucci,
Damian Andrew Tamburri,
Willem-Jan van den Heuvel
Abstract:
Chaos engineering aims to improve the resilience of software systems by intentionally injecting faults to identify and address system weaknesses that cause outages in production environments. Although many tools for chaos engineering exist, their practical adoption is not yet explored. This study examines 971 GitHub repositories that incorporate 10 popular chaos engineering tools to identify patte…
▽ More
Chaos engineering aims to improve the resilience of software systems by intentionally injecting faults to identify and address system weaknesses that cause outages in production environments. Although many tools for chaos engineering exist, their practical adoption is not yet explored. This study examines 971 GitHub repositories that incorporate 10 popular chaos engineering tools to identify patterns and trends in their use. The analysis reveals that Toxiproxy and Chaos Mesh are the most frequently used, showing consistent growth since 2016 and reflecting increasing adoption in cloud-native development. The release of new chaos engineering tools peaked in 2018, followed by a shift toward refinement and integration, with Chaos Mesh and LitmusChaos leading in ongoing development activity. Software development is the most frequent application (58.0%), followed by unclassified purposes (16.2%), teaching (10.3%), learning (9.9%), and research (5.7%). Development-focused repositories tend to have higher activity, particularly for Toxiproxy and Chaos Mesh, highlighting their industrial relevance. Fault injection scenarios mainly address network disruptions (40.9%) and instance termination (32.7%), while application-level faults remain underrepresented (3.0%), highlighting for future exploration.
△ Less
Submitted 19 May, 2025;
originally announced May 2025.
-
A Preliminary Investigation on the Usage of Quantum Approximate Optimization Algorithms for Test Case Selection
Authors:
Antonio Trovato,
Martin Beseda,
Dario Di Nucci
Abstract:
Regression testing is key in verifying that software works correctly after changes. However, running the entire regression test suite can be impractical and expensive, especially for large-scale systems. Test suite optimization methods are highly effective but often become infeasible due to their high computational demands. In previous work, Trovato et al. proposed SelectQA, an approach based on q…
▽ More
Regression testing is key in verifying that software works correctly after changes. However, running the entire regression test suite can be impractical and expensive, especially for large-scale systems. Test suite optimization methods are highly effective but often become infeasible due to their high computational demands. In previous work, Trovato et al. proposed SelectQA, an approach based on quantum annealing that outperforms the traditional state-of-the-art methods, i.e., Additional Greedy and DIV-GA, in efficiency. This work envisions the usage of Quantum Approximate Optimization Algorithms (QAOAs) for test case selection by proposing QAOA-TCS. QAOAs merge the potential of gate-based quantum machines with the optimization capabilities of the adiabatic evolution. To prove the effectiveness of QAOAs for test case selection, we preliminarily investigate QAOA-TCS leveraging an ideal environment simulation before evaluating it on real quantum machines. Our results show that QAOAs perform better than the baseline algorithms in effectiveness while being comparable to SelectQA in terms of efficiency. These results encourage us to continue our experimentation with noisy environment simulations and real quantum machines.
△ Less
Submitted 30 April, 2025; v1 submitted 26 April, 2025;
originally announced April 2025.
-
Automated Vulnerability Injection in Solidity Smart Contracts: A Mutation-Based Approach for Benchmark Development
Authors:
Gerardo Iuliano,
Luigi Allocca,
Matteo Cicalese,
Dario Di Nucci
Abstract:
The security of smart contracts is critical in blockchain systems, where even minor vulnerabilities can lead to substantial financial losses. Researchers proposed several vulnerability detection tools evaluated using existing benchmarks. However, most benchmarks are outdated and focus on a narrow set of vulnerabilities. This work evaluates whether mutation seeding can effectively inject vulnerabil…
▽ More
The security of smart contracts is critical in blockchain systems, where even minor vulnerabilities can lead to substantial financial losses. Researchers proposed several vulnerability detection tools evaluated using existing benchmarks. However, most benchmarks are outdated and focus on a narrow set of vulnerabilities. This work evaluates whether mutation seeding can effectively inject vulnerabilities into Solidity-based smart contracts and whether state-of-the-art static analysis tools can detect the injected flaws. We aim to automatically inject vulnerabilities into smart contracts to generate large and wide benchmarks. We propose MuSe, a tool to generate vulnerable smart contracts by leveraging pattern-based mutation operators to inject six vulnerability types into real-world smart contracts. We analyzed these vulnerable smart contracts using Slither, a static analysis tool, to determine its capacity to identify them and assess their validity. The results show that each vulnerability has a different injection rate. Not all smart contracts can exhibit some vulnerabilities because they lack the prerequisites for injection. Furthermore, static analysis tools fail to detect all vulnerabilities injected using pattern-based mutations, underscoring the need for enhancements in static analyzers and demonstrating that benchmarks generated by mutation seeding tools can improve the evaluation of detection tools.
△ Less
Submitted 22 April, 2025;
originally announced April 2025.
-
Identifying and Replicating Code Patterns Driving Performance Regressions in Software Systems
Authors:
Denivan Campos,
Luana Martins,
Emanuela Guglielmi,
Michele Tucci,
Daniele Di Pompeo,
Simone Scalabrino,
Vittorio Cortellessa,
Dario Di Nucci,
Rocco Oliveto
Abstract:
Context: Performance regressions negatively impact execution time and memory usage of software systems. Nevertheless, there is a lack of systematic methods to evaluate the effectiveness of performance test suites. Performance mutation testing, which introduces intentional defects (mutants) to measure and enhance fault-detection capabilities, is promising but underexplored. A key challenge is under…
▽ More
Context: Performance regressions negatively impact execution time and memory usage of software systems. Nevertheless, there is a lack of systematic methods to evaluate the effectiveness of performance test suites. Performance mutation testing, which introduces intentional defects (mutants) to measure and enhance fault-detection capabilities, is promising but underexplored. A key challenge is understanding if generated mutants accurately reflect real-world performance issues. Goal: This study evaluates and extends mutation operators for performance testing. Its objectives include (i) collecting existing performance mutation operators, (ii) introducing new operators from real-world code changes that impact performance, and (iii) evaluating these operators on real-world systems to see if they effectively degrade performance. Method: To this aim, we will (i) review the literature to identify performance mutation operators, (ii) conduct a mining study to extract patterns of code changes linked to performance regressions, (iii) propose new mutation operators based on these patterns, and (iv) apply and evaluate the operators to assess their effectiveness in exposing performance degradations. Expected Outcomes: We aim to provide an enriched set of mutation operators for performance testing, helping developers and researchers identify harmful coding practices and design better strategies to detect and prevent performance regressions.
△ Less
Submitted 8 April, 2025;
originally announced April 2025.
-
How Do Solidity Versions Affect Vulnerability Detection Tools? An Empirical Study
Authors:
Gerardo Iuliano,
Davide Corradini,
Michele Pasqua,
Mariano Ceccato,
Dario Di Nucci
Abstract:
Context: Smart contract vulnerabilities pose significant security risks for the Ethereum ecosystem, driving the development of automated tools for detection and mitigation. Smart contracts are written in Solidity, a programming language that is rapidly evolving to add features and improvements to enhance smart contract security. New versions of Solidity change the compilation process, potentially…
▽ More
Context: Smart contract vulnerabilities pose significant security risks for the Ethereum ecosystem, driving the development of automated tools for detection and mitigation. Smart contracts are written in Solidity, a programming language that is rapidly evolving to add features and improvements to enhance smart contract security. New versions of Solidity change the compilation process, potentially affecting how tools interpret and analyze smart contract code. Objective: In such a continuously evolving landscape, we aim to investigate the compatibility of detection tools with Solidity versions. More specifically, we present a plan to study detection tools by empirically assessing (i) their compatibility with the Solidity pragma directives, (ii) their detection effectiveness, and (iii) their execution time across different versions of Solidity. Method: We will conduct an exploratory study by running several tools and collecting a large number of real-world smart contracts to create a balanced dataset. We will track and analyze the tool execution through SmartBugs, a framework that facilitates the tool execution and allows the integration of new tools.
△ Less
Submitted 7 April, 2025;
originally announced April 2025.
-
Teaching Mining Software Repositories
Authors:
Zadia Codabux,
Fatemeh Fard,
Roberto Verdecchia,
Fabio Palomba,
Dario Di Nucci,
Gilberto Recupito
Abstract:
Mining Software Repositories (MSR) has become a popular research area recently. MSR analyzes different sources of data, such as version control systems, code repositories, defect tracking systems, archived communication, deployment logs, and so on, to uncover interesting and actionable insights from the data for improved software development, maintenance, and evolution. This chapter provides an ov…
▽ More
Mining Software Repositories (MSR) has become a popular research area recently. MSR analyzes different sources of data, such as version control systems, code repositories, defect tracking systems, archived communication, deployment logs, and so on, to uncover interesting and actionable insights from the data for improved software development, maintenance, and evolution. This chapter provides an overview of MSR and how to conduct an MSR study, including setting up a study, formulating research goals and questions, identifying repositories, extracting and cleaning the data, performing data analysis and synthesis, and discussing MSR study limitations. Furthermore, the chapter discusses MSR as part of a mixed method study, how to mine data ethically, and gives an overview of recent trends in MSR as well as reflects on the future. As a teaching aid, the chapter provides tips for educators, exercises for students at all levels, and a list of repositories that can be used as a starting point for an MSR study.
△ Less
Submitted 3 January, 2025;
originally announced January 2025.
-
Smart Contract Vulnerabilities, Tools, and Benchmarks: An Updated Systematic Literature Review
Authors:
Gerardo Iuliano,
Dario Di Nucci
Abstract:
Smart contracts are self-executing programs on blockchain platforms like Ethereum, which have revolutionized decentralized finance by enabling trustless transactions and the operation of decentralized applications. Despite their potential, the security of smart contracts remains a critical concern due to their immutability and transparency, which expose them to malicious actors. Numerous solutions…
▽ More
Smart contracts are self-executing programs on blockchain platforms like Ethereum, which have revolutionized decentralized finance by enabling trustless transactions and the operation of decentralized applications. Despite their potential, the security of smart contracts remains a critical concern due to their immutability and transparency, which expose them to malicious actors. Numerous solutions for vulnerability detection have been proposed, but it is still unclear which one is the most effective. This paper presents a systematic literature review that explores vulnerabilities in Ethereum smart contracts, focusing on automated detection tools and benchmark evaluation. We reviewed 3,380 studies from five digital libraries and five major software engineering conferences, applying a structured selection process that resulted in 222 high-quality studies. The key results include a hierarchical taxonomy of 192 vulnerabilities grouped into 14 categories, a comprehensive list of 219 detection tools with corresponding functionalities, methods, and code transformation techniques, a mapping between our taxonomy and the list of tools, and a collection of 133 benchmarks used for tool evaluation. We conclude with a discussion about the insights into the current state of Ethereum smart contract security and directions for future research.
△ Less
Submitted 26 May, 2025; v1 submitted 2 December, 2024;
originally announced December 2024.
-
Reformulating Regression Test Suite Optimization using Quantum Annealing -- an Empirical Study
Authors:
Antonio Trovato,
Manuel De Stefano,
Fabiano Pecorelli,
Dario Di Nucci,
Andrea De Lucia
Abstract:
Maintaining software quality is crucial in the dynamic landscape of software development. Regression testing ensures that software works as expected after changes are implemented. However, re-executing all test cases for every modification is often impractical and costly, particularly for large systems. Although very effective, traditional test suite optimization techniques are often impractical i…
▽ More
Maintaining software quality is crucial in the dynamic landscape of software development. Regression testing ensures that software works as expected after changes are implemented. However, re-executing all test cases for every modification is often impractical and costly, particularly for large systems. Although very effective, traditional test suite optimization techniques are often impractical in resource-constrained scenarios, as they are computationally expensive. Hence, quantum computing solutions have been developed to improve their efficiency but have shown drawbacks in terms of effectiveness. We propose reformulating the regression test case selection problem to use quantum computation techniques better. Our objectives are (i) to provide more efficient solutions than traditional methods and (ii) to improve the effectiveness of previously proposed quantum-based solutions. We propose SelectQA, a quantum annealing approach that can outperform the quantum-based approach BootQA in terms of effectiveness while obtaining results comparable to those of the classic Additional Greedy and DIV-GA approaches. Regarding efficiency, SelectQA outperforms DIV-GA and has similar results with the Additional Greedy algorithm but is exceeded by BootQA.
△ Less
Submitted 28 January, 2025; v1 submitted 24 November, 2024;
originally announced November 2024.
-
KRONC: Keypoint-based Robust Camera Optimization for 3D Car Reconstruction
Authors:
Davide Di Nucci,
Alessandro Simoni,
Matteo Tomei,
Luca Ciuffreda,
Roberto Vezzani,
Rita Cucchiara
Abstract:
The three-dimensional representation of objects or scenes starting from a set of images has been a widely discussed topic for years and has gained additional attention after the diffusion of NeRF-based approaches. However, an underestimated prerequisite is the knowledge of camera poses or, more specifically, the estimation of the extrinsic calibration parameters. Although excellent general-purpose…
▽ More
The three-dimensional representation of objects or scenes starting from a set of images has been a widely discussed topic for years and has gained additional attention after the diffusion of NeRF-based approaches. However, an underestimated prerequisite is the knowledge of camera poses or, more specifically, the estimation of the extrinsic calibration parameters. Although excellent general-purpose Structure-from-Motion methods are available as a pre-processing step, their computational load is high and they require a lot of frames to guarantee sufficient overlapping among the views. This paper introduces KRONC, a novel approach aimed at inferring view poses by leveraging prior knowledge about the object to reconstruct and its representation through semantic keypoints. With a focus on vehicle scenes, KRONC is able to estimate the position of the views as a solution to a light optimization problem targeting the convergence of keypoints' back-projections to a singular point. To validate the method, a specific dataset of real-world car scenes has been collected. Experiments confirm KRONC's ability to generate excellent estimates of camera poses starting from very coarse initialization. Results are comparable with Structure-from-Motion methods with huge savings in computation. Code and data will be made publicly available.
△ Less
Submitted 9 September, 2024;
originally announced September 2024.
-
When Code Smells Meet ML: On the Lifecycle of ML-specific Code Smells in ML-enabled Systems
Authors:
Gilberto Recupito,
Giammaria Giordano,
Filomena Ferrucci,
Dario Di Nucci,
Fabio Palomba
Abstract:
Context. The adoption of Machine Learning (ML)--enabled systems is steadily increasing. Nevertheless, there is a shortage of ML-specific quality assurance approaches, possibly because of the limited knowledge of how quality-related concerns emerge and evolve in ML-enabled systems. Objective. We aim to investigate the emergence and evolution of specific types of quality-related concerns known as ML…
▽ More
Context. The adoption of Machine Learning (ML)--enabled systems is steadily increasing. Nevertheless, there is a shortage of ML-specific quality assurance approaches, possibly because of the limited knowledge of how quality-related concerns emerge and evolve in ML-enabled systems. Objective. We aim to investigate the emergence and evolution of specific types of quality-related concerns known as ML-specific code smells, i.e., sub-optimal implementation solutions applied on ML pipelines that may significantly decrease both the quality and maintainability of ML-enabled systems. More specifically, we present a plan to study ML-specific code smells by empirically analyzing (i) their prevalence in real ML-enabled systems, (ii) how they are introduced and removed, and (iii) their survivability. Method. We will conduct an exploratory study, mining a large dataset of ML-enabled systems and analyzing over 400k commits about 337 projects. We will track and inspect the introduction and evolution of ML smells through CodeSmile, a novel ML smell detector that we will build to enable our investigation and to detect ML-specific code smells.
△ Less
Submitted 13 March, 2024;
originally announced March 2024.
-
Architectural Design Decisions for Self-Serve Data Platforms in Data Meshes
Authors:
Tom van Eijk,
Indika Kumara,
Dario Di Nucci,
Damian Andrew Tamburri,
Willem-Jan van den Heuvel
Abstract:
Data mesh is an emerging decentralized approach to managing and generating value from analytical enterprise data at scale. It shifts the ownership of the data to the business domains closest to the data, promotes sharing and managing data as autonomous products, and uses a federated and automated data governance model. The data mesh relies on a managed data platform that offers services to domain…
▽ More
Data mesh is an emerging decentralized approach to managing and generating value from analytical enterprise data at scale. It shifts the ownership of the data to the business domains closest to the data, promotes sharing and managing data as autonomous products, and uses a federated and automated data governance model. The data mesh relies on a managed data platform that offers services to domain and governance teams to build, share, and manage data products efficiently. However, designing and implementing a self-serve data platform is challenging, and the platform engineers and architects must understand and choose the appropriate design options to ensure the platform will enhance the experience of domain and governance teams. For these reasons, this paper proposes a catalog of architectural design decisions and their corresponding decision options by systematically reviewing 43 industrial gray literature articles on self-serve data platforms in data mesh. Moreover, we used semi-structured interviews with six data engineering experts with data mesh experience to validate, refine, and extend the findings from the literature. Such a catalog of design decisions and options drawn from the state of practice shall aid practitioners in building data meshes while providing a baseline for further research on data mesh architectures.
△ Less
Submitted 7 February, 2024;
originally announced February 2024.
-
CarPatch: A Synthetic Benchmark for Radiance Field Evaluation on Vehicle Components
Authors:
Davide Di Nucci,
Alessandro Simoni,
Matteo Tomei,
Luca Ciuffreda,
Roberto Vezzani,
Rita Cucchiara
Abstract:
Neural Radiance Fields (NeRFs) have gained widespread recognition as a highly effective technique for representing 3D reconstructions of objects and scenes derived from sets of images. Despite their efficiency, NeRF models can pose challenges in certain scenarios such as vehicle inspection, where the lack of sufficient data or the presence of challenging elements (e.g. reflections) strongly impact…
▽ More
Neural Radiance Fields (NeRFs) have gained widespread recognition as a highly effective technique for representing 3D reconstructions of objects and scenes derived from sets of images. Despite their efficiency, NeRF models can pose challenges in certain scenarios such as vehicle inspection, where the lack of sufficient data or the presence of challenging elements (e.g. reflections) strongly impact the accuracy of the reconstruction. To this aim, we introduce CarPatch, a novel synthetic benchmark of vehicles. In addition to a set of images annotated with their intrinsic and extrinsic camera parameters, the corresponding depth maps and semantic segmentation masks have been generated for each view. Global and part-based metrics have been defined and used to evaluate, compare, and better characterize some state-of-the-art techniques. The dataset is publicly released at https://aimagelab.ing.unimore.it/go/carpatch and can be used as an evaluation guide and as a baseline for future work on this challenging topic.
△ Less
Submitted 24 July, 2023;
originally announced July 2023.
-
The Quantum Frontier of Software Engineering: A Systematic Mapping Study
Authors:
Manuel De Stefano,
Fabiano Pecorelli,
Dario Di Nucci,
Fabio Palomba,
Andrea De Lucia
Abstract:
Context. Quantum computing is becoming a reality, and quantum software engineering (QSE) is emerging as a new discipline to enable developers to design and develop quantum programs.
Objective. This paper presents a systematic mapping study of the current state of QSE research, aiming to identify the most investigated topics, the types and number of studies, the main reported results, and the mos…
▽ More
Context. Quantum computing is becoming a reality, and quantum software engineering (QSE) is emerging as a new discipline to enable developers to design and develop quantum programs.
Objective. This paper presents a systematic mapping study of the current state of QSE research, aiming to identify the most investigated topics, the types and number of studies, the main reported results, and the most studied quantum computing tools/frameworks. Additionally, the study aims to explore the research community's interest in QSE, how it has evolved, and any prior contributions to the discipline before its formal introduction through the Talavera Manifesto.
Method. We searched for relevant articles in several databases and applied inclusion and exclusion criteria to select the most relevant studies. After evaluating the quality of the selected resources, we extracted relevant data from the primary studies and analyzed them.
Results. We found that QSE research has primarily focused on software testing, with little attention given to other topics, such as software engineering management. The most commonly studied technology for techniques and tools is Qiskit, although, in most studies, either multiple or none specific technologies were employed. The researchers most interested in QSE are interconnected through direct collaborations, and several strong collaboration clusters have been identified. Most articles in QSE have been published in non-thematic venues, with a preference for conferences.
Conclusions. The study's implications are providing a centralized source of information for researchers and practitioners in the field, facilitating knowledge transfer, and contributing to the advancement and growth of QSE.
△ Less
Submitted 1 June, 2023; v1 submitted 31 May, 2023;
originally announced May 2023.
-
Data Mesh: a Systematic Gray Literature Review
Authors:
Abel Goedegebuure,
Indika Kumara,
Stefan Driessen,
Dario Di Nucci,
Geert Monsieur,
Willem-jan van den Heuvel,
Damian Andrew Tamburri
Abstract:
Data mesh is an emerging domain-driven decentralized data architecture that aims to minimize or avoid operational bottlenecks associated with centralized, monolithic data architectures in enterprises. The topic has picked the practitioners' interest, and there is considerable gray literature on it. At the same time, we observe a lack of academic attempts at defining and building upon the concept.…
▽ More
Data mesh is an emerging domain-driven decentralized data architecture that aims to minimize or avoid operational bottlenecks associated with centralized, monolithic data architectures in enterprises. The topic has picked the practitioners' interest, and there is considerable gray literature on it. At the same time, we observe a lack of academic attempts at defining and building upon the concept. Hence, in this article, we aim to start from the foundations and characterize the data mesh architecture regarding its design principles, architectural components, capabilities, and organizational roles. We systematically collected, analyzed, and synthesized 114 industrial gray literature articles. The review provides insights into practitioners' perspectives on the four key principles of data mesh: data as a product, domain ownership of data, self-serve data platform, and federated computational governance. Moreover, due to the comparability of data mesh and SOA (service-oriented architecture), we mapped the findings from the gray literature into the reference architectures from the SOA academic literature to create the reference architectures for describing three key dimensions of data mesh: organization of capabilities and roles, development, and runtime. Finally, we discuss open research issues in data mesh, partially based on the findings from the gray literature.
△ Less
Submitted 7 August, 2024; v1 submitted 3 April, 2023;
originally announced April 2023.
-
Machine Learning-Based Test Smell Detection
Authors:
Valeria Pontillo,
Dario Amoroso d'Aragona,
Fabiano Pecorelli,
Dario Di Nucci,
Filomena Ferrucci,
Fabio Palomba
Abstract:
Context: Test smells are symptoms of sub-optimal design choices adopted when developing test cases. Previous studies have proved their harmfulness for test code maintainability and effectiveness. Therefore, researchers have been proposing automated, heuristic-based techniques to detect them. However, the performance of such detectors is still limited and dependent on thresholds to be tuned.
Obje…
▽ More
Context: Test smells are symptoms of sub-optimal design choices adopted when developing test cases. Previous studies have proved their harmfulness for test code maintainability and effectiveness. Therefore, researchers have been proposing automated, heuristic-based techniques to detect them. However, the performance of such detectors is still limited and dependent on thresholds to be tuned.
Objective: We propose the design and experimentation of a novel test smell detection approach based on machine learning to detect four test smells.
Method: We plan to develop the largest dataset of manually-validated test smells. This dataset will be leveraged to train six machine learners and assess their capabilities in within- and cross-project scenarios. Finally, we plan to compare our approach with state-of-the-art heuristic-based techniques.
△ Less
Submitted 16 August, 2022;
originally announced August 2022.
-
Software Engineering for Quantum Programming: How Far Are We?
Authors:
Manuel De Stefano,
Fabiano Pecorelli,
Dario Di Nucci,
Fabio Palomba,
Andrea De Lucia
Abstract:
Quantum computing is no longer only a scientific interest but is rapidly becoming an industrially available technology that can potentially overcome the limits of classical computation. Over the last years, all major companies have provided frameworks and programming languages that allow developers to create their quantum applications. This shift has led to the definition of a new discipline calle…
▽ More
Quantum computing is no longer only a scientific interest but is rapidly becoming an industrially available technology that can potentially overcome the limits of classical computation. Over the last years, all major companies have provided frameworks and programming languages that allow developers to create their quantum applications. This shift has led to the definition of a new discipline called quantum software engineering, which is demanded to define novel methods for engineering large-scale quantum applications. While the research community is successfully embracing this call, we notice a lack of systematic investigations into the state of the practice of quantum programming. Understanding the challenges that quantum developers face is vital to precisely define the aims of quantum software engineering. Hence, in this paper, we first mine all the GitHub repositories that make use of the most used quantum programming frameworks currently on the market and then conduct coding analysis sessions to produce a taxonomy of the purposes for which quantum technologies are used. In the second place, we conduct a survey study that involves the contributors of the considered repositories, which aims to elicit the developers' opinions on the current adoption and challenges of quantum programming. On the one hand, the results highlight that the current adoption of quantum programming is still limited. On the other hand, there are many challenges that the software engineering community should carefully consider: these do not strictly pertain to technical concerns but also socio-technical matters.
△ Less
Submitted 11 April, 2022; v1 submitted 31 March, 2022;
originally announced March 2022.
-
RepoMiner: a Language-agnostic Python Framework to Mine Software Repositories for Defect Prediction
Authors:
Stefano Dalla Palma,
Dario Di Nucci,
Damian Tamburri
Abstract:
Data originating from open-source software projects provide valuable information to enhance software quality. In the scope of Software Defect Prediction, one of the most challenging parts is extracting valid data about failure-prone software components from these repositories, which can help develop more robust software. In particular, collecting data, calculating metrics, and synthesizing results…
▽ More
Data originating from open-source software projects provide valuable information to enhance software quality. In the scope of Software Defect Prediction, one of the most challenging parts is extracting valid data about failure-prone software components from these repositories, which can help develop more robust software. In particular, collecting data, calculating metrics, and synthesizing results from these repositories is a tedious and error-prone task, which often requires understanding the programming languages involved in the mined repositories, eventually leading to a proliferation of language-specific data-mining software. This paper presents RepoMiner, a language-agnostic tool developed to support software engineering researchers in creating datasets to support any study on defect prediction. RepoMiner automatically collects failure data from software components, labels them as failure-prone or neutral, and calculates metrics to be used as ground truth for defect prediction models. We present its implementation and provide examples of its application.
△ Less
Submitted 23 November, 2021;
originally announced November 2021.
-
Detecting Security Fixes in Open-Source Repositories using Static Code Analyzers
Authors:
Therese Fehrer,
RocĂo Cabrera Lozoya,
Antonino Sabetta,
Dario Di Nucci,
Damian A. Tamburri
Abstract:
The sources of reliable, code-level information about vulnerabilities that affect open-source software (OSS) are scarce, which hinders a broad adoption of advanced tools that provide code-level detection and assessment of vulnerable OSS dependencies.
In this paper, we study the extent to which the output of off-the-shelf static code analyzers can be used as a source of features to represent comm…
▽ More
The sources of reliable, code-level information about vulnerabilities that affect open-source software (OSS) are scarce, which hinders a broad adoption of advanced tools that provide code-level detection and assessment of vulnerable OSS dependencies.
In this paper, we study the extent to which the output of off-the-shelf static code analyzers can be used as a source of features to represent commits in Machine Learning (ML) applications. In particular, we investigate how such features can be used to construct embeddings and train ML models to automatically identify source code commits that contain vulnerability fixes.
We analyze such embeddings for security-relevant and non-security-relevant commits, and we show that, although in isolation they are not different in a statistically significant manner, it is possible to use them to construct a ML pipeline that achieves results comparable with the state of the art.
We also found that the combination of our method with commit2vec represents a tangible improvement over the state of the art in the automatic identification of commits that fix vulnerabilities: the ML models we construct and commit2vec are complementary, the former being more generally applicable, albeit not as accurate.
△ Less
Submitted 7 May, 2021;
originally announced May 2021.
-
Automated Mapping of Vulnerability Advisories onto their Fix Commits in Open Source Repositories
Authors:
Daan Hommersom,
Antonino Sabetta,
Bonaventura Coppola,
Dario Di Nucci,
Damian A. Tamburri
Abstract:
The lack of comprehensive sources of accurate vulnerability data represents a critical obstacle to studying and understanding software vulnerabilities (and their corrections). In this paper, we present an approach that combines heuristics stemming from practical experience and machine-learning (ML) - specifically, natural language processing (NLP) - to address this problem. Our method consists of…
▽ More
The lack of comprehensive sources of accurate vulnerability data represents a critical obstacle to studying and understanding software vulnerabilities (and their corrections). In this paper, we present an approach that combines heuristics stemming from practical experience and machine-learning (ML) - specifically, natural language processing (NLP) - to address this problem. Our method consists of three phases. First, an advisory record containing key information about a vulnerability is extracted from an advisory (expressed in natural language). Second, using heuristics, a subset of candidate fix commits is obtained from the source code repository of the affected project by filtering out commits that are known to be irrelevant for the task at hand. Finally, for each such candidate commit, our method builds a numerical feature vector reflecting the characteristics of the commit that are relevant to predicting its match with the advisory at hand. The feature vectors are then exploited for building a final ranked list of candidate fixing commits. The score attributed by the ML model to each feature is kept visible to the users, allowing them to interpret the predictions.
We evaluated our approach using a prototype implementation named FixFinder on a manually curated data set that comprises 2,391 known fix commits corresponding to 1,248 public vulnerability advisories. When considering the top-10 commits in the ranked results, our implementation could successfully identify at least one fix commit for up to 84.03% of the vulnerabilities (with a fix commit on the first position for 65.06% of the vulnerabilities). In conclusion, our method reduces considerably the effort needed to search OSS repositories for the commits that fix known vulnerabilities.
△ Less
Submitted 10 May, 2023; v1 submitted 24 March, 2021;
originally announced March 2021.
-
Automated Test-Case Generation for Solidity Smart Contracts: the AGSolT Approach and its Evaluation
Authors:
Stefan Driessen,
Dario Di Nucci,
Geert Monsieur,
Damian A. Tamburri,
Willem-Jan van den Heuvel
Abstract:
Blockchain and smart contract technology are novel approaches to data and code management that facilitate trusted computing by allowing for development in a distributed and decentralized manner. Testing smart contracts comes with its own set of challenges which have not yet been fully identified and explored. Although existing tools can identify and discover known vulnerabilities and their interac…
▽ More
Blockchain and smart contract technology are novel approaches to data and code management that facilitate trusted computing by allowing for development in a distributed and decentralized manner. Testing smart contracts comes with its own set of challenges which have not yet been fully identified and explored. Although existing tools can identify and discover known vulnerabilities and their interactions on the Ethereum blockchain through random search or symbolic execution, these tools generally do not produce test suites suitable for human oracles. In this paper, we present AGSOLT (Automated Generator of Solidity Test Suites). We demonstrate its efficiency by implementing two search algorithms to automatically generate test suites for stand-alone Solidity smart contracts, taking into account some of the blockchain-specific challenges. To test AGSOLT, we compared a random search algorithm and a genetic algorithm on a set of 36 real-world smart contracts. We found that AGSOLT is capable of achieving high branch coverage with both approaches and even discovered some errors in some of the most popular Solidity smart contracts on Github.
△ Less
Submitted 15 April, 2022; v1 submitted 17 February, 2021;
originally announced February 2021.
-
Prioritising Server Side Reachability via Inter-process Concolic Testing
Authors:
Maarten Vandercammen,
Laurent Christophe,
Dario Di Nucci,
Wolfgang De Meuter,
Coen De Roover
Abstract:
Context: Most approaches to automated white-box testing consider the client side and the server side of a web application in isolation from each other. Such testers lack a whole-program perspective on the web application under test.
Inquiry: We hypothesise that an additional whole-program perspective would enable the tester to discover which server side errors can be triggered by an actual end u…
▽ More
Context: Most approaches to automated white-box testing consider the client side and the server side of a web application in isolation from each other. Such testers lack a whole-program perspective on the web application under test.
Inquiry: We hypothesise that an additional whole-program perspective would enable the tester to discover which server side errors can be triggered by an actual end user accessing the application through the client, and which ones can only be triggered in hypothetical scenarios.
Approach: In this paper, we explore the idea of employing such a whole-program perspective in testing. To this end, we develop , a novel concolic tester which operates on full-stack JavaScript web applications, where both the client and the server side are JavaScript processes communicating via asynchronous messages -- as enabled by the WebSocket or Socket.IO-libraries.
Knowledge: We find that the whole-program perspective enables discerning high-priority errors, which are reachable from a particular client, from low-priority errors, which are not accessible through the tested client. Another benefit of the perspective is that it allows the automated tester to construct practical, step-by-step scenarios for triggering server side errors from the end user's perspective.
Grounding: We apply on a collection of web applications to evaluate how effective testing is in distinguishing between high- and low-priority errors. The results show that correctly classifies the majority of server errors.
Importance: This paper demonstrates the feasibility of testing as a novel approach for automatically testing web applications. Classifying errors as being of high or low importance aids developers in prioritising bugs that might be encountered by users, and postponing the diagnosis of bugs that are less easily reached.
△ Less
Submitted 4 March, 2021; v1 submitted 30 October, 2020;
originally announced October 2020.
-
DeepIaC: Deep Learning-Based Linguistic Anti-pattern Detection in IaC
Authors:
Nemania Borovits,
Indika Kumara,
Parvathy Krishnan,
Stefano Dalla Palma,
Dario Di Nucci,
Fabio Palomba,
Damian A. Tamburri,
Willem-Jan van den Heuvel
Abstract:
Linguistic anti-patterns are recurring poor practices concerning inconsistencies among the naming, documentation, and implementation of an entity. They impede readability, understandability, and maintainability of source code. This paper attempts to detect linguistic anti-patterns in infrastructure as code (IaC) scripts used to provision and manage computing environments. In particular, we conside…
▽ More
Linguistic anti-patterns are recurring poor practices concerning inconsistencies among the naming, documentation, and implementation of an entity. They impede readability, understandability, and maintainability of source code. This paper attempts to detect linguistic anti-patterns in infrastructure as code (IaC) scripts used to provision and manage computing environments. In particular, we consider inconsistencies between the logic/body of IaC code units and their names. To this end, we propose a novel automated approach that employs word embeddings and deep learning techniques. We build and use the abstract syntax tree of IaC code units to create their code embedments. Our experiments with a dataset systematically extracted from open source repositories show that our approach yields an accuracy between0.785and0.915in detecting inconsistencies
△ Less
Submitted 22 September, 2020;
originally announced September 2020.
-
Towards a Catalogue of Software Quality Metrics for Infrastructure Code
Authors:
Stefano Dalla Palma,
Dario Di Nucci,
Fabio Palomba,
Damian A. Tamburri
Abstract:
Infrastructure-as-code (IaC) is a practice to implement continuous deployment by allowing management and provisioning of infrastructure through the definition of machine-readable files and automation around them, rather than physical hardware configuration or interactive configuration tools. On the one hand, although IaC represents an ever-increasing widely adopted practice nowadays, still little…
▽ More
Infrastructure-as-code (IaC) is a practice to implement continuous deployment by allowing management and provisioning of infrastructure through the definition of machine-readable files and automation around them, rather than physical hardware configuration or interactive configuration tools. On the one hand, although IaC represents an ever-increasing widely adopted practice nowadays, still little is known concerning how to best maintain, speedily evolve, and continuously improve the code behind the IaC practice in a measurable fashion. On the other hand, source code measurements are often computed and analyzed to evaluate the different quality aspects of the software developed. However, unlike general-purpose programming languages (GPLs), IaC scripts use domain-specific languages, and metrics used for GPLs may not be applicable for IaC scripts. This article proposes a catalogue consisting of 46 metrics to identify IaC properties focusing on Ansible, one of the most popular IaC language to date, and shows how they can be used to analyze IaC scripts.
△ Less
Submitted 7 July, 2020; v1 submitted 27 May, 2020;
originally announced May 2020.