Search | arXiv e-print repository

A Systematic Approach to Predict the Impact of Cybersecurity Vulnerabilities Using LLMs

Authors: Anders Mølmen Høst, Pierre Lison, Leon Moonen

Abstract: Vulnerability databases, such as the National Vulnerability Database (NVD), offer detailed descriptions of Common Vulnerabilities and Exposures (CVEs), but often lack information on their real-world impact, such as the tactics, techniques, and procedures (TTPs) that adversaries may use to exploit the vulnerability. However, manually linking CVEs to their corresponding TTPs is a challenging and tim… ▽ More Vulnerability databases, such as the National Vulnerability Database (NVD), offer detailed descriptions of Common Vulnerabilities and Exposures (CVEs), but often lack information on their real-world impact, such as the tactics, techniques, and procedures (TTPs) that adversaries may use to exploit the vulnerability. However, manually linking CVEs to their corresponding TTPs is a challenging and time-consuming task, and the high volume of new vulnerabilities published annually makes automated support desirable. This paper introduces TRIAGE, a two-pronged automated approach that uses Large Language Models (LLMs) to map CVEs to relevant techniques from the ATT&CK knowledge base. We first prompt an LLM with instructions based on MITRE's CVE Mapping Methodology to predict an initial list of techniques. This list is then combined with the results from a second LLM-based module that uses in-context learning to map a CVE to relevant techniques. This hybrid approach strategically combines rule-based reasoning with data-driven inference. Our evaluation reveals that in-context learning outperforms the individual mapping methods, and the hybrid approach improves recall of exploitation techniques. We also find that GPT-4o-mini performs better than Llama3.3-70B on this task. Overall, our results show that LLMs can be used to automatically predict the impact of cybersecurity vulnerabilities and TRIAGE makes the process of mapping CVEs to ATT&CK more efficient. Keywords: vulnerability impact, CVE, ATT&CK techniques, large language models, automated mapping. △ Less

Submitted 25 August, 2025; originally announced August 2025.

arXiv:2507.19909 [pdf, ps, other]

The Impact of Fine-tuning Large Language Models on Automated Program Repair

Authors: Roman Macháček, Anastasiia Grishina, Max Hort, Leon Moonen

Abstract: Automated Program Repair (APR) uses various tools and techniques to help developers achieve functional and error-free code faster. In recent years, Large Language Models (LLMs) have gained popularity as components in APR tool chains because of their performance and flexibility. However, training such models requires a significant amount of resources. Fine-tuning techniques have been developed to a… ▽ More Automated Program Repair (APR) uses various tools and techniques to help developers achieve functional and error-free code faster. In recent years, Large Language Models (LLMs) have gained popularity as components in APR tool chains because of their performance and flexibility. However, training such models requires a significant amount of resources. Fine-tuning techniques have been developed to adapt pre-trained LLMs to specific tasks, such as APR, and enhance their performance at far lower computational costs than training from scratch. In this study, we empirically investigate the impact of various fine-tuning techniques on the performance of LLMs used for APR. Our experiments provide insights into the performance of a selection of state-of-the-art LLMs pre-trained on code. The evaluation is done on three popular APR benchmarks (i.e., QuixBugs, Defects4J and HumanEval-Java) and considers six different LLMs with varying parameter sizes (resp. CodeGen, CodeT5, StarCoder, DeepSeekCoder, Bloom, and CodeLlama-2). We consider three training regimens: no fine-tuning, full fine-tuning, and parameter-efficient fine-tuning (PEFT) using LoRA and IA3. We observe that full fine-tuning techniques decrease the benchmarking performance of various models due to different data distributions and overfitting. By using parameter-efficient fine-tuning methods, we restrict models in the amount of trainable parameters and achieve better results. Keywords: large language models, automated program repair, parameter-efficient fine-tuning, AI4Code, AI4SE, ML4SE. △ Less

Submitted 26 July, 2025; originally announced July 2025.

Comments: Accepted for publication in the research track of the 41th International Conference on Software Maintenance and Evolution (ICSME 2025)

arXiv:2505.02931 [pdf, other]

The Art of Repair: Optimizing Iterative Program Repair with Instruction-Tuned Models

Authors: Fernando Vallecillos Ruiz, Max Hort, Leon Moonen

Abstract: Automatic program repair (APR) aims to reduce the manual efforts required to identify and fix errors in source code. Before the rise of LLM-based agents, a common strategy was to increase the number of generated patches, sometimes to the thousands, to achieve better repair results on benchmarks. More recently, self-iterative capabilities enabled LLMs to refine patches over multiple rounds guided b… ▽ More Automatic program repair (APR) aims to reduce the manual efforts required to identify and fix errors in source code. Before the rise of LLM-based agents, a common strategy was to increase the number of generated patches, sometimes to the thousands, to achieve better repair results on benchmarks. More recently, self-iterative capabilities enabled LLMs to refine patches over multiple rounds guided by feedback. However, literature often focuses on many iterations and disregards different numbers of outputs. We investigate an APR pipeline that balances these two approaches, the generation of multiple outputs and multiple rounds of iteration, while imposing a limit of 10 total patches per bug. We apply three SOTA instruction-tuned LLMs - DeepSeekCoder-Instruct, Codellama-Instruct, Llama3.1-Instruct - to the APR task. We further fine-tune each model on an APR dataset with three sizes (1K, 30K, 65K) and two techniques (Full Fine-Tuning and LoRA), allowing us to assess their repair capabilities on two APR benchmarks: HumanEval-Java and Defects4J. Our results show that by using only a fraction (<1%) of the fine-tuning dataset, we can achieve improvements of up to 78% in the number of plausible patches generated, challenging prior studies that reported limited gains using Full Fine-Tuning. However, we find that exceeding certain thresholds leads to diminishing outcomes, likely due to overfitting. Moreover, we show that base models greatly benefit from creating patches in an iterative fashion rather than generating them all at once. In addition, the benefit of iterative strategies becomes more pronounced in complex benchmarks. Even fine-tuned models, while benefiting less from iterations, still gain advantages, particularly on complex benchmarks. The research underscores the need for balanced APR strategies that combine multi-output generation and iterative refinement. △ Less

Submitted 5 May, 2025; originally announced May 2025.

Comments: Accepted for publication in the research track of the 29th International Conference on Evaluation and Assessment in Software Engineering (EASE), 17-20 June 2025, Istanbul, Türkiye

arXiv:2503.23466 [pdf, other]

Codehacks: A Dataset of Adversarial Tests for Competitive Programming Problems Obtained from Codeforces

Authors: Max Hort, Leon Moonen

Abstract: Software is used in critical applications in our day-to-day life and it is important to ensure its correctness. One popular approach to assess correctness is to evaluate software on tests. If a test fails, it indicates a fault in the software under test; if all tests pass correctly, one may assume that the software is correct. However, the reliability of these results depends on the test suite con… ▽ More Software is used in critical applications in our day-to-day life and it is important to ensure its correctness. One popular approach to assess correctness is to evaluate software on tests. If a test fails, it indicates a fault in the software under test; if all tests pass correctly, one may assume that the software is correct. However, the reliability of these results depends on the test suite considered, and there is a risk of false negatives (i.e. software that passes all available tests but contains bugs because some cases are not tested). Therefore, it is important to consider error-inducing test cases when evaluating software. To support data-driven creation of such a test-suite, which is especially of interest for testing software synthesized from large language models, we curate a dataset (Codehacks) of programming problems together with corresponding error-inducing test cases (i.e., "hacks"). This dataset is collected from the wild, in particular, from the Codeforces online judge platform. The dataset comprises 288,617 hacks for 5,578 programming problems, each with a natural language description, as well as the source code for 2,196 submitted solutions to these problems that can be broken with their corresponding hacks. Keywords: competitive programming, language model, dataset △ Less

Submitted 30 March, 2025; originally announced March 2025.

Comments: Accepted for publication at the 18th IEEE International Conference on Software Testing, Verification and Validation (ICST 2025)

arXiv:2503.23448 [pdf, other]

Semantic-Preserving Transformations as Mutation Operators: A Study on Their Effectiveness in Defect Detection

Authors: Max Hort, Linas Vidziunas, Leon Moonen

Abstract: Recent advances in defect detection use language models. Existing works enhanced the training data to improve the models' robustness when applied to semantically identical code (i.e., predictions should be the same). However, the use of semantically identical code has not been considered for improving the tools during their application - a concept closely related to metamorphic testing. The goal… ▽ More Recent advances in defect detection use language models. Existing works enhanced the training data to improve the models' robustness when applied to semantically identical code (i.e., predictions should be the same). However, the use of semantically identical code has not been considered for improving the tools during their application - a concept closely related to metamorphic testing. The goal of our study is to determine whether we can use semantic-preserving transformations, analogue to mutation operators, to improve the performance of defect detection tools in the testing stage. We first collect existing publications which implemented semantic-preserving transformations and share their implementation, such that we can reuse them. We empirically study the effectiveness of three different ensemble strategies for enhancing defect detection tools. We apply the collected transformations on the Devign dataset, considering vulnerabilities as a type of defect, and two fine-tuned large language models for defect detection (VulBERTa, PLBART). We found 28 publications with 94 different transformations. We choose to implement 39 transformations from four of the publications, but a manual check revealed that 23 out 39 transformations change code semantics. Using the 16 remaining, correct transformations and three ensemble strategies, we were not able to increase the accuracy of the defect detection models. Our results show that reusing shared semantic-preserving transformation is difficult, sometimes even causing wrongful changes to the semantics. Keywords: defect detection, language model, semantic-preserving transformation, ensemble △ Less

Submitted 30 March, 2025; originally announced March 2025.

Comments: Accepted for publication in Mutation 2025 at the 18th IEEE International Conference on Software Testing, Verification and Validation (ICST 2025)

arXiv:2503.07693 [pdf, other]

doi 10.1145/3719351

Fully Autonomous Programming using Iterative Multi-Agent Debugging with Large Language Models

Authors: Anastasiia Grishina, Vadim Liventsev, Aki Härmä, Leon Moonen

Abstract: Program synthesis with Large Language Models (LLMs) suffers from a "near-miss syndrome": the generated code closely resembles a correct solution but fails unit tests due to minor errors. We address this with a multi-agent framework called Synthesize, Execute, Instruct, Debug, and Repair (SEIDR). Effectively applying SEIDR to instruction-tuned LLMs requires determining (a) optimal prompts for LLMs,… ▽ More Program synthesis with Large Language Models (LLMs) suffers from a "near-miss syndrome": the generated code closely resembles a correct solution but fails unit tests due to minor errors. We address this with a multi-agent framework called Synthesize, Execute, Instruct, Debug, and Repair (SEIDR). Effectively applying SEIDR to instruction-tuned LLMs requires determining (a) optimal prompts for LLMs, (b) what ranking algorithm selects the best programs in debugging rounds, and (c) balancing the repair of unsuccessful programs with the generation of new ones. We empirically explore these trade-offs by comparing replace-focused, repair-focused, and hybrid debug strategies. We also evaluate lexicase and tournament selection to rank candidates in each generation. On Program Synthesis Benchmark 2 (PSB2), our framework outperforms both conventional use of OpenAI Codex without a repair phase and traditional genetic programming approaches. SEIDR outperforms the use of an LLM alone, solving 18 problems in C++ and 20 in Python on PSB2 at least once across experiments. To assess generalizability, we employ GPT-3.5 and Llama 3 on the PSB2 and HumanEval-X benchmarks. Although SEIDR with these models does not surpass current state-of-the-art methods on the Python benchmarks, the results on HumanEval-C++ are promising. SEIDR with Llama 3-8B achieves an average pass@100 of 84.2%. Across all SEIDR runs, 163 of 164 problems are solved at least once with GPT-3.5 in HumanEval-C++, and 162 of 164 with the smaller Llama 3-8B. We conclude that SEIDR effectively overcomes the near-miss syndrome in program synthesis with LLMs. △ Less

Submitted 10 March, 2025; originally announced March 2025.

Comments: Accepted for publication in ACM Trans. Evol. Learn. Optim., February 2025. arXiv admin note: text overlap with arXiv:2304.10423

arXiv:2409.02474 [pdf, other]

doi 10.1145/3674805.3686684

A Comparative Study on Large Language Models for Log Parsing

Authors: Merve Astekin, Max Hort, Leon Moonen

Abstract: Background: Log messages provide valuable information about the status of software systems. This information is provided in an unstructured fashion and automated approaches are applied to extract relevant parameters. To ease this process, log parsing can be applied, which transforms log messages into structured log templates. Recent advances in language models have led to several studies that appl… ▽ More Background: Log messages provide valuable information about the status of software systems. This information is provided in an unstructured fashion and automated approaches are applied to extract relevant parameters. To ease this process, log parsing can be applied, which transforms log messages into structured log templates. Recent advances in language models have led to several studies that apply ChatGPT to the task of log parsing with promising results. However, the performance of other state-of-the-art large language models (LLMs) on the log parsing task remains unclear. Aims: In this study, we investigate the current capability of state-of-the-art LLMs to perform log parsing. Method: We select six recent LLMs, including both paid proprietary (GPT-3.5, Claude 2.1) and four free-to-use open models, and compare their performance on system logs obtained from a selection of mature open-source projects. We design two different prompting approaches and apply the LLMs on 1, 354 log templates across 16 different projects. We evaluate their effectiveness, in the number of correctly identified templates, and the syntactic similarity between the generated templates and the ground truth. Results: We found that free-to-use models are able to compete with paid models, with CodeLlama extracting 10% more log templates correctly than GPT-3.5. Moreover, we provide qualitative insights into the usability of language models (e.g., how easy it is to use their responses). Conclusions: Our results reveal that some of the smaller, free-to-use LLMs can considerably assist log parsing compared to their paid proprietary competitors, especially code-specialized models. △ Less

Submitted 4 September, 2024; originally announced September 2024.

Comments: Accepted for publication in the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM '24)

arXiv:2408.01134 [pdf, other]

The Impact of Program Reduction on Automated Program Repair

Authors: Linas Vidziunas, David Binkley, Leon Moonen

Abstract: Correcting bugs using modern Automated Program Repair (APR) can be both time-consuming and resource-expensive. We describe a program repair approach that aims to improve the scalability of modern APR tools. The approach leverages program reduction in the form of program slicing to eliminate code irrelevant to fixing the bug, which improves the APR tool's overall performance. We investigate slicing… ▽ More Correcting bugs using modern Automated Program Repair (APR) can be both time-consuming and resource-expensive. We describe a program repair approach that aims to improve the scalability of modern APR tools. The approach leverages program reduction in the form of program slicing to eliminate code irrelevant to fixing the bug, which improves the APR tool's overall performance. We investigate slicing's impact on all three phases of the repair process: fault localization, patch generation, and patch validation. Our empirical exploration finds that the proposed approach, on average, enhances the repair ability of the TBar APR tool, but we also discovered a few cases where it was less successful. Specifically, on examples from the widely used Defects4J dataset, we obtain a substantial reduction in median repair time, which falls from 80 minutes to just under 18 minutes. We conclude that program reduction can improve the performance of APR without degrading repair quality, but this improvement is not universal. A replication package is available via Zenodo at https://doi.org/10.5281/zenodo.13074333. Keywords: automated program repair, dynamic program slicing, fault localization, test-suite reduction, hybrid techniques. △ Less

Submitted 2 August, 2024; originally announced August 2024.

Comments: Accepted for publication as full research paper in the 40th IEEE International Conference on Software Maintenance and Evolution (ICSME 2024)

arXiv:2401.07994 [pdf, other]

A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models

Authors: Fernando Vallecillos Ruiz, Anastasiia Grishina, Max Hort, Leon Moonen

Abstract: Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back using neural machine translation with language models. We investigate whether this correction capability of Large Language Models (LLMs) extends to Automatic Program Repair (APR). Current generative models for APR are pre-trained on source code and fine-tuned for repair. This pape… ▽ More Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back using neural machine translation with language models. We investigate whether this correction capability of Large Language Models (LLMs) extends to Automatic Program Repair (APR). Current generative models for APR are pre-trained on source code and fine-tuned for repair. This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back. We hypothesize that RTT with LLMs restores the most commonly seen patterns in code during pre-training, i.e., performs a regression toward the mean, which removes bugs as they are a form of noise w.r.t. the more frequent, natural, bug-free code in the training data. To test this hypothesis, we employ eight recent LLMs pre-trained on code, including the latest GPT versions, and four common program repair benchmarks in Java. We find that RTT with English as an intermediate language repaired 101 of 164 bugs with GPT-4 on the HumanEval-Java dataset. Moreover, 46 of these are unique bugs that are not repaired by other LLMs fine-tuned for APR. Our findings highlight the viability of round-trip translation with LLMs as a technique for automated program repair and its potential for research in software engineering. Keywords: automated program repair, large language model, machine translation △ Less

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2307.02443 [pdf, other]

An Exploratory Literature Study on Sharing and Energy Use of Language Models for Source Code

Authors: Max Hort, Anastasiia Grishina, Leon Moonen

Abstract: Large language models trained on source code can support a variety of software development tasks, such as code recommendation and program repair. Large amounts of data for training such models benefit the models' performance. However, the size of the data and models results in long training times and high energy consumption. While publishing source code allows for replicability, users need to repe… ▽ More Large language models trained on source code can support a variety of software development tasks, such as code recommendation and program repair. Large amounts of data for training such models benefit the models' performance. However, the size of the data and models results in long training times and high energy consumption. While publishing source code allows for replicability, users need to repeat the expensive training process if models are not shared. The main goal of the study is to investigate if publications that trained language models for software engineering (SE) tasks share source code and trained artifacts. The second goal is to analyze the transparency on training energy usage. We perform a snowballing-based literature search to find publications on language models for source code, and analyze their reusability from a sustainability standpoint. From 494 unique publications, we identified 293 relevant publications that use language models to address code-related tasks. Among them, 27% (79 out of 293) make artifacts available for reuse. This can be in the form of tools or IDE plugins designed for specific tasks or task-agnostic models that can be fine-tuned for a variety of downstream tasks. Moreover, we collect insights on the hardware used for model training, as well as training time, which together determine the energy consumption of the development process. We find that there are deficiencies in the sharing of information and artifacts for current studies on source code models for software engineering tasks, with 40% of the surveyed papers not sharing source code or trained artifacts. We recommend the sharing of source code as well as trained artifacts, to enable sustainable reproducibility. Moreover, comprehensive information on training times and hardware configurations should be shared for transparency on a model's carbon footprint. △ Less

Submitted 5 July, 2023; originally announced July 2023.

Comments: Accepted for publication in the 17th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2023)

arXiv:2305.04940 [pdf, other]

doi 10.1145/3611643.3616304

The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification

Authors: Anastasiia Grishina, Max Hort, Leon Moonen

Abstract: The use of modern Natural Language Processing (NLP) techniques has shown to be beneficial for software engineering tasks, such as vulnerability detection and type inference. However, training deep NLP models requires significant computational resources. This paper explores techniques that aim at achieving the best usage of resources and available information in these models. We propose a generic… ▽ More The use of modern Natural Language Processing (NLP) techniques has shown to be beneficial for software engineering tasks, such as vulnerability detection and type inference. However, training deep NLP models requires significant computational resources. This paper explores techniques that aim at achieving the best usage of resources and available information in these models. We propose a generic approach, EarlyBIRD, to build composite representations of code from the early layers of a pre-trained transformer model. We empirically investigate the viability of this approach on the CodeBERT model by comparing the performance of 12 strategies for creating composite representations with the standard practice of only using the last encoder layer. Our evaluation on four datasets shows that several early layer combinations yield better performance on defect detection, and some combinations improve multi-class classification. More specifically, we obtain a +2 average improvement of detection accuracy on Devign with only 3 out of 12 layers of CodeBERT and a 3.3x speed-up of fine-tuning. These findings show that early layers can be used to obtain better results using the same resources, as well as to reduce resource usage during fine-tuning and inference. △ Less

Submitted 11 September, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: The content in this pre-print is the same as in the CRC accepted for publication in the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2023)

arXiv:2305.00382 [pdf, other]

Constructing a Knowledge Graph from Textual Descriptions of Software Vulnerabilities in the National Vulnerability Database

Authors: Anders Mølmen Høst, Pierre Lison, Leon Moonen

Abstract: Knowledge graphs have shown promise for several cybersecurity tasks, such as vulnerability assessment and threat analysis. In this work, we present a new method for constructing a vulnerability knowledge graph from information in the National Vulnerability Database (NVD). Our approach combines named entity recognition (NER), relation extraction (RE), and entity prediction using a combination of ne… ▽ More Knowledge graphs have shown promise for several cybersecurity tasks, such as vulnerability assessment and threat analysis. In this work, we present a new method for constructing a vulnerability knowledge graph from information in the National Vulnerability Database (NVD). Our approach combines named entity recognition (NER), relation extraction (RE), and entity prediction using a combination of neural models, heuristic rules, and knowledge graph embeddings. We demonstrate how our method helps to fix missing entities in knowledge graphs used for cybersecurity and evaluate the performance. △ Less

Submitted 15 May, 2023; v1 submitted 30 April, 2023; originally announced May 2023.

Comments: Accepted for publication in the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), Tórshavn, Faroe Islands, May 22nd-24th, 2023. [v2]: added funding acknowledgments

arXiv:2304.10423 [pdf, other]

doi 10.1145/3583131.3590481

Fully Autonomous Programming with Large Language Models

Authors: Vadim Liventsev, Anastasiia Grishina, Aki Härmä, Leon Moonen

Abstract: Current approaches to program synthesis with Large Language Models (LLMs) exhibit a "near miss syndrome": they tend to generate programs that semantically resemble the correct answer (as measured by text similarity metrics or human evaluation), but achieve a low or even zero accuracy as measured by unit tests due to small imperfections, such as the wrong input or output format. This calls for an a… ▽ More Current approaches to program synthesis with Large Language Models (LLMs) exhibit a "near miss syndrome": they tend to generate programs that semantically resemble the correct answer (as measured by text similarity metrics or human evaluation), but achieve a low or even zero accuracy as measured by unit tests due to small imperfections, such as the wrong input or output format. This calls for an approach known as Synthesize, Execute, Debug (SED), whereby a draft of the solution is generated first, followed by a program repair phase addressing the failed tests. To effectively apply this approach to instruction-driven LLMs, one needs to determine which prompts perform best as instructions for LLMs, as well as strike a balance between repairing unsuccessful programs and replacing them with newly generated ones. We explore these trade-offs empirically, comparing replace-focused, repair-focused, and hybrid debug strategies, as well as different template-based and model-based prompt-generation techniques. We use OpenAI Codex as the LLM and Program Synthesis Benchmark 2 as a database of problem descriptions and tests for evaluation. The resulting framework outperforms both conventional usage of Codex without the repair phase and traditional genetic programming approaches. △ Less

Submitted 20 April, 2023; originally announced April 2023.

Comments: Accepted for publication in the Genetic and Evolutionary Computation Conference (GECCO 2023)

arXiv:2303.07283 [pdf, other]

CHESS: A Framework for Evaluation of Self-adaptive Systems based on Chaos Engineering

Authors: Sehrish Malik, Moeen Ali Naqvi, Leon Moonen

Abstract: There is an increasing need to assess the correct behavior of self-adaptive and self-healing systems due to their adoption in critical and highly dynamic environments. However, there is a lack of systematic evaluation methods for self-adaptive and self-healing systems. We proposed CHESS, a novel approach to address this gap by evaluating self-adaptive and self-healing systems through fault injecti… ▽ More There is an increasing need to assess the correct behavior of self-adaptive and self-healing systems due to their adoption in critical and highly dynamic environments. However, there is a lack of systematic evaluation methods for self-adaptive and self-healing systems. We proposed CHESS, a novel approach to address this gap by evaluating self-adaptive and self-healing systems through fault injection based on chaos engineering (CE) [ arXiv:2208.13227 ]. The artifact presented in this paper provides an extensive overview of the use of CHESS through two microservice-based case studies: a smart office case study and an existing demo application called Yelb. It comes with a managing system service, a self-monitoring service, as well as five fault injection scenarios covering infrastructure faults and functional faults. Each of these components can be easily extended or replaced to adopt the CHESS approach to a new case study, help explore its promises and limitations, and identify directions for future research. Keywords: self-healing, resilience, chaos engineering, evaluation, artifact △ Less

Submitted 13 March, 2023; originally announced March 2023.

Comments: Accepted for publication in the 18nd Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS 2023)

arXiv:2211.03911 [pdf, other]

Towards Extending the Range of Bugs That Automated Program Repair Can Handle

Authors: Omar I. Al-Bataineh, Leon Moonen

Abstract: Modern automated program repair (APR) is well-tuned to finding and repairing bugs that introduce observable erroneous behavior to a program. However, a significant class of bugs does not lead to such observable behavior (e.g., liveness/termination bugs, non-functional bugs, and information flow bugs). Such bugs can generally not be handled with current APR approaches, so, as a community, we need t… ▽ More Modern automated program repair (APR) is well-tuned to finding and repairing bugs that introduce observable erroneous behavior to a program. However, a significant class of bugs does not lead to such observable behavior (e.g., liveness/termination bugs, non-functional bugs, and information flow bugs). Such bugs can generally not be handled with current APR approaches, so, as a community, we need to develop complementary techniques. To stimulate the systematic study of alternative APR approaches and hybrid APR combinations, we devise a novel bug classification system that enables methodical analysis of their bug detection power and bug repair capabilities. To demonstrate the benefits, we analyze the repair of termination bugs in sequential and concurrent programs. The study shows that integrating dynamic APR with formal analysis techniques, such as termination provers and software model checkers, reduces complexity and improves the overall reliability of these repairs. △ Less

Submitted 7 November, 2022; originally announced November 2022.

Comments: Accepted for publication in the 22nd IEEE International Conference on Software Quality, Reliability and Security (QRS 2022)

arXiv:2208.13244 [pdf, other]

Assessing the Impact of Execution Environment on Observation-Based Slicing

Authors: David Binkley, Leon Moonen

Abstract: Program slicing reduces a program to a smaller version that retains a chosen computation, referred to as a slicing criterion. One recent multi-lingual slicing approach, observation-based slicing (ORBS), speculatively deletes parts of the program and then executes the code. If the behavior of the slicing criteria is unchanged, the speculative deletion is made permanent. While this makes ORBS lang… ▽ More Program slicing reduces a program to a smaller version that retains a chosen computation, referred to as a slicing criterion. One recent multi-lingual slicing approach, observation-based slicing (ORBS), speculatively deletes parts of the program and then executes the code. If the behavior of the slicing criteria is unchanged, the speculative deletion is made permanent. While this makes ORBS language agnostic, it can lead to the production of some non-intuitive slices. One particular challenge is when the execution environment plays a role. For example, ORBS will delete the line "a = 0" if the memory location assigned to a contains zero before executing the statement, since deletion will not affect the value of a and thus the slicing criterion. Consequently, slices can differ between execution environments due to factors such as initialization and call stack reuse. The technique considered, nVORBS, attempts to ameliorate this problem by validating a candidate slice in n different execution environments. We conduct an empirical study to collect initial insights into how often the execution environment leads to slice differences. Specifically, we compare and contrast the slices produced by seven different instantiations of nVORBS. Looking forward, the technique can be seen as a variation on metamorphic testing, and thus suggests how ideas from metamorphic testing might be used to improve dynamic program analysis. △ Less

Submitted 28 August, 2022; originally announced August 2022.

arXiv:2208.13227 [pdf, other]

doi 10.1109/ACSOS55765.2022.00018

On Evaluating Self-Adaptive and Self-Healing Systems using Chaos Engineering

Authors: Moeen Ali Naqvi, Sehrish Malik, Merve Astekin, Leon Moonen

Abstract: With the growing adoption of self-adaptive systems in various domains, there is an increasing need for strategies to assess their correct behavior. In particular self-healing systems, which aim to provide resilience and fault-tolerance, often deal with unanticipated failures in critical and highly dynamic environments. Their reactive and complex behavior makes it challenging to assess if these sys… ▽ More With the growing adoption of self-adaptive systems in various domains, there is an increasing need for strategies to assess their correct behavior. In particular self-healing systems, which aim to provide resilience and fault-tolerance, often deal with unanticipated failures in critical and highly dynamic environments. Their reactive and complex behavior makes it challenging to assess if these systems execute according to the desired goals. Recently, several studies have expressed concern about the lack of systematic evaluation methods for self-healing behavior. In this paper, we propose CHESS, an approach for the systematic evaluation of self-adaptive and self-healing systems that builds on chaos engineering. Chaos engineering is a methodology for subjecting a system to unexpected conditions and scenarios. It has shown great promise in helping developers build resilient microservice architectures and cyber-physical systems. CHESS turns this idea around by using chaos engineering to evaluate how well a self-healing system can withstand such perturbations. We investigate the viability of this approach through an exploratory study on a self-healing smart office environment. The study helps us explore the promises and limitations of the approach, as well as identify directions where additional work is needed. We conclude with a summary of lessons learned. △ Less

Submitted 28 August, 2022; originally announced August 2022.

Comments: 10 pages

arXiv:2202.02679 [pdf, other]

doi 10.1016/j.infsof.2022.106844

Featherweight Assisted Vulnerability Discovery

Authors: David Binkley, Leon Moonen, Sibren Isaacman

Abstract: Predicting vulnerable source code helps to focus attention on those parts of the code that need to be examined with more scrutiny. Recent work proposed the use of function names as semantic cues that can be learned by a deep neural network (DNN) to aid in the hunt for vulnerability of functions. Combining identifier splitting, which splits each function name into its constituent words, with a no… ▽ More Predicting vulnerable source code helps to focus attention on those parts of the code that need to be examined with more scrutiny. Recent work proposed the use of function names as semantic cues that can be learned by a deep neural network (DNN) to aid in the hunt for vulnerability of functions. Combining identifier splitting, which splits each function name into its constituent words, with a novel frequency-based algorithm, we explore the extent to which the words that make up a function's name can predict potentially vulnerable functions. In contrast to *lightweight* predictions by a DNN that considers only function names, avoiding the use of a DNN provides *featherweight* predictions. The underlying idea is that function names that contain certain "dangerous" words are more likely to accompany vulnerable functions. Of course, this assumes that the frequency-based algorithm can be properly tuned to focus on truly dangerous words. Because it is more transparent than a DNN, the frequency-based algorithm enables us to investigate the inner workings of the DNN. If successful, this investigation into what the DNN does and does not learn will help us train more effective future models. We empirically evaluate our approach on a heterogeneous dataset containing over 73000 functions labeled vulnerable, and over 950000 functions labeled benign. Our analysis shows that words alone account for a significant portion of the DNN's classification ability. We also find that words are of greatest value in the datasets with a more homogeneous vocabulary. Thus, when working within the scope of a given project, where the vocabulary is unavoidably homogeneous, our approach provides a cheaper, potentially complementary, technique to aid in the hunt for source-code vulnerabilities. Finally, this approach has the advantage that it is viable with orders of magnitude less training data. △ Less

Submitted 5 February, 2022; originally announced February 2022.

Comments: 17 pages, 6 figures, 6 tables

Journal ref: Information and Software Technology, 2022

arXiv:2111.05713 [pdf, ps, other]

Towards More Reliable Automated Program Repair by Integrating Static Analysis Techniques

Authors: Omar I. Al-Bataineh, Anastasiia Grishina, Leon Moonen

Abstract: A long-standing open challenge for automated program repair is the overfitting problem, which is caused by having insufficient or incomplete specifications to validate whether a generated patch is correct or not. Most available repair systems rely on weak specifications (i.e., specifications that are synthesized from test cases) which limits the quality of generated repairs. To strengthen specific… ▽ More A long-standing open challenge for automated program repair is the overfitting problem, which is caused by having insufficient or incomplete specifications to validate whether a generated patch is correct or not. Most available repair systems rely on weak specifications (i.e., specifications that are synthesized from test cases) which limits the quality of generated repairs. To strengthen specifications and improve the quality of repairs, we propose to closer integrate static bug detection techniques with automated program repair. The integration combines automated program repair with static analysis techniques in such a way that bug detection patterns can be synthesized into specifications that the repair system can use. We explore the feasibility of such integration using two types of bugs: arithmetic bugs, such as integer overflow, and logical bugs, such as termination bugs. As part of our analysis, we make several observations that help to improve patch generation for these classes of bugs. Moreover, these observations assist with narrowing down the candidate patch search space, and inferring an effective search order. △ Less

Submitted 10 November, 2021; originally announced November 2021.

Comments: Accepted at the 21st IEEE International Conference on Software Quality, Reliability, and Security (QRS 2021)

arXiv:2107.08760 [pdf, other]

doi 10.1145/3475960.3475985

CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software

Authors: Guru Prasad Bhandari, Amara Naseer, Leon Moonen

Abstract: Data-driven research on the automated discovery and repair of security vulnerabilities in source code requires comprehensive datasets of real-life vulnerable code and their fixes. To assist in such research, we propose a method to automatically collect and curate a comprehensive vulnerability dataset from Common Vulnerabilities and Exposures (CVE) records in the public National Vulnerability Datab… ▽ More Data-driven research on the automated discovery and repair of security vulnerabilities in source code requires comprehensive datasets of real-life vulnerable code and their fixes. To assist in such research, we propose a method to automatically collect and curate a comprehensive vulnerability dataset from Common Vulnerabilities and Exposures (CVE) records in the public National Vulnerability Database (NVD). We implement our approach in a fully automated dataset collection tool and share an initial release of the resulting vulnerability dataset named CVEfixes. The CVEfixes collection tool automatically fetches all available CVE records from the NVD, gathers the vulnerable code and corresponding fixes from associated open-source repositories, and organizes the collected information in a relational database. Moreover, the dataset is enriched with meta-data such as programming language, and detailed code and security metrics at five levels of abstraction. The collection can easily be repeated to keep up-to-date with newly discovered or patched vulnerabilities. The initial release of CVEfixes spans all published CVEs up to 9 June 2021, covering 5365 CVE records for 1754 open-source projects that were addressed in a total of 5495 vulnerability fixing commits. CVEfixes supports various types of data-driven software security research, such as vulnerability prediction, vulnerability classification, vulnerability severity prediction, analysis of vulnerability-related code changes, and automated vulnerability repair. △ Less

Submitted 19 July, 2021; originally announced July 2021.

Comments: Accepted for publication in Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE '21), August 19-20, 2021, Athens, Greece

arXiv:2101.02534 [pdf, other]

Adaptive Immunity for Software: Towards Autonomous Self-healing Systems

Authors: Moeen Ali Naqvi, Merve Astekin, Sehrish Malik, Leon Moonen

Abstract: Testing and code reviews are known techniques to improve the quality and robustness of software. Unfortunately, the complexity of modern software systems makes it impossible to anticipate all possible problems that can occur at runtime, which limits what issues can be found using testing and reviews. Thus, it is of interest to consider autonomous self-healing software systems, which can automatica… ▽ More Testing and code reviews are known techniques to improve the quality and robustness of software. Unfortunately, the complexity of modern software systems makes it impossible to anticipate all possible problems that can occur at runtime, which limits what issues can be found using testing and reviews. Thus, it is of interest to consider autonomous self-healing software systems, which can automatically detect, diagnose, and contain unanticipated problems at runtime. Most research in this area has adopted a model-driven approach, where actual behavior is checked against a model specifying the intended behavior, and a controller takes action when the system behaves outside of the specification. However, it is not easy to develop these specifications, nor to keep them up-to-date as the system evolves. We pose that, with the recent advances in machine learning, such models may be learned by observing the system. Moreover, we argue that artificial immune systems (AISs) are particularly well-suited for building self-healing systems, because of their anomaly detection and diagnosis capabilities. We present the state-of-the-art in self-healing systems and in AISs, surveying some of the research directions that have been considered up to now. To help advance the state-of-the-art, we develop a research agenda for building self-healing software systems using AISs, identifying required foundations, and promising research directions. △ Less

Submitted 7 January, 2021; originally announced January 2021.

Comments: 5 pages, 2 figures

arXiv:2009.03257 [pdf, other]

doi 10.1145/3239235.3239248

Improving Problem Identification via Automated Log Clustering using Dimensionality Reduction

Authors: Carl Martin Rosenberg, Leon Moonen

Abstract: Goal: We consider the problem of automatically grouping logs of runs that failed for the same underlying reasons, so that they can be treated more effectively, and investigate the following questions: (1) Does an approach developed to identify problems in system logs generalize to identifying problems in continuous deployment logs? (2) How does dimensionality reduction affect the quality of automa… ▽ More Goal: We consider the problem of automatically grouping logs of runs that failed for the same underlying reasons, so that they can be treated more effectively, and investigate the following questions: (1) Does an approach developed to identify problems in system logs generalize to identifying problems in continuous deployment logs? (2) How does dimensionality reduction affect the quality of automated log clustering? (3) How does the criterion used for merging clusters in the clustering algorithm affect clustering quality? Method: We replicate and extend earlier work on clustering system log files to assess its generalization to continuous deployment logs. We consider the optional inclusion of one of these dimensionality reduction techniques: Principal Component Analysis (PCA), Latent Semantic Indexing (LSI), and Non-negative Matrix Factorization (NMF). Moreover, we consider three alternative cluster merge criteria (Single Linkage, Average Linkage, and Weighted Linkage), in addition to the Complete Linkage criterion used in earlier work. We empirically evaluate the 16 resulting configurations on continuous deployment logs provided by our industrial collaborator. Results: Our study shows that (1) identifying problems in continuous deployment logs via clustering is feasible, (2) including NMF significantly improves overall accuracy and robustness, and (3) Complete Linkage performs best of all merge criteria analyzed. Conclusions: We conclude that problem identification via automated log clustering is improved by including dimensionality reduction, as it decreases the pipeline's sensitivity to parameter choice, thereby increasing its robustness for handling different inputs. △ Less

Submitted 7 September, 2020; originally announced September 2020.

Journal ref: Published in ESEM'18, Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, October 2018, Article: 16, pp. 1-10,

arXiv:2008.06948 [pdf, other]

doi 10.1145/3382494.3410684

Spectrum-Based Log Diagnosis

Authors: Carl Martin Rosenberg, Leon Moonen

Abstract: We present and evaluate Spectrum-Based Log Diagnosis (SBLD), a method to help developers quickly diagnose problems found in complex integration and deployment runs. Inspired by Spectrum-Based Fault Localization, SBLD leverages the differences in event occurrences between logs for failing and passing runs, to highlight events that are stronger associated with failing runs. Using data provided by… ▽ More We present and evaluate Spectrum-Based Log Diagnosis (SBLD), a method to help developers quickly diagnose problems found in complex integration and deployment runs. Inspired by Spectrum-Based Fault Localization, SBLD leverages the differences in event occurrences between logs for failing and passing runs, to highlight events that are stronger associated with failing runs. Using data provided by our industrial partner, we empirically investigate the following questions: (i) How well does SBLD reduce the effort needed to identify all failure-relevant events in the log for a failing run? (ii) How is the performance of SBLD affected by available data? (iii) How does SBLD compare to searching for simple textual patterns that often occur in failure-relevant events? We answer (i) and (ii) using summary statistics and heatmap visualizations, and for (iii) we compare three configurations of SBLD (with resp. minimum, median and maximum data) against a textual search using Wilcoxon signed-rank tests and the Vargha-Delaney measure of stochastic superiority. Our evaluation shows that (i) SBLD achieves a significant effort reduction for the dataset used, (ii) SBLD benefits from additional logs for passing runs in general, and it benefits from additional logs for failing runs when there is a proportional amount of logs for passing runs in the data. Finally, (iii) SBLD and textual search are roughly equally effective at effort-reduction, while textual search has a slightly better recall. We investigate the cause, and discuss how it is due to the characteristics of a specific part of our data. We conclude that SBLD shows promise as a method for diagnosing failing runs, that its performance is positively affected by additional data, but that it does not outperform textual search on the dataset considered. Future work includes investigating SBLD's generalizability on additional datasets. △ Less

Submitted 16 August, 2020; originally announced August 2020.

Comments: Published in ESEM'20: ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), October 8-9, 2020, Bari, Italy. ACM, 12 pages

arXiv:0707.2291 [pdf]

An Integrated Crosscutting Concern Migration Strategy and its Application to JHotDraw

Authors: Marius Marin, Leon Moonen, Arie van Deursen

Abstract: In this paper we propose a systematic strategy for migrating crosscutting concerns in existing object-oriented systems to aspect-based solutions. The proposed strategy consists of four steps: mining, exploration, documentation and refactoring of crosscutting concerns. We discuss in detail a new approach to aspect refactoring that is fully integrated with our strategy, and apply the whole strateg… ▽ More In this paper we propose a systematic strategy for migrating crosscutting concerns in existing object-oriented systems to aspect-based solutions. The proposed strategy consists of four steps: mining, exploration, documentation and refactoring of crosscutting concerns. We discuss in detail a new approach to aspect refactoring that is fully integrated with our strategy, and apply the whole strategy to an object-oriented system, namely the JHotDraw framework. The result of this migration is made available as an open-source project, which is the largest aspect refactoring available to date. We report on our experiences with conducting this case study and reflect on the success and challenges of the migration process, as well as on the feasibility of automatic aspect refactoring. △ Less

Submitted 22 July, 2007; v1 submitted 16 July, 2007; originally announced July 2007.

Comments: 10+ 4 pages

Report number: TUD-SERG-2007-019 ACM Class: D.2

arXiv:cs/0609147 [pdf]

Identifying Crosscutting Concerns Using Fan-in Analysis

Authors: Marius Marin, Arie van Deursen, Leon Moonen

Abstract: Aspect mining is a reverse engineering process that aims at finding crosscutting concerns in existing systems. This paper proposes an aspect mining approach based on determining methods that are called from many different places, and hence have a high fan-in, which can be seen as a symptom of crosscutting functionality. The approach is semi-automatic, and consists of three steps: metric calculat… ▽ More Aspect mining is a reverse engineering process that aims at finding crosscutting concerns in existing systems. This paper proposes an aspect mining approach based on determining methods that are called from many different places, and hence have a high fan-in, which can be seen as a symptom of crosscutting functionality. The approach is semi-automatic, and consists of three steps: metric calculation, method filtering, and call site analysis. Carrying out these steps is an interactive process supported by an Eclipse plug-in called FINT. Fan-in analysis has been applied to three open source Java systems, totaling around 200,000 lines of code. The most interesting concerns identified are discussed in detail, which includes several concerns not previously discussed in the aspect-oriented literature. The results show that a significant number of crosscutting concerns can be recognized using fan-in analysis, and each of the three steps can be supported by tools. △ Less

Submitted 19 February, 2007; v1 submitted 26 September, 2006; originally announced September 2006.

Comments: 34+4 pages; Extended version [Marin et al. 2004a]

Report number: TUD-SERG-2006-013 ACM Class: D.2.3; D.2.7; D.2.8

Journal ref: ACM Transactions on Software Engineering and Methodology, 2007

arXiv:cs/0607063 [pdf]

Prioritizing Software Inspection Results using Static Profiling

Authors: Cathal Boogerd, Leon Moonen

Abstract: Static software checking tools are useful as an additional automated software inspection step that can easily be integrated in the development cycle and assist in creating secure, reliable and high quality code. However, an often quoted disadvantage of these tools is that they generate an overly large number of warnings, including many false positives due to the approximate analysis techniques.… ▽ More Static software checking tools are useful as an additional automated software inspection step that can easily be integrated in the development cycle and assist in creating secure, reliable and high quality code. However, an often quoted disadvantage of these tools is that they generate an overly large number of warnings, including many false positives due to the approximate analysis techniques. This information overload effectively limits their usefulness. In this paper we present ELAN, a technique that helps the user prioritize the information generated by a software inspection tool, based on a demand-driven computation of the likelihood that execution reaches the locations for which warnings are reported. This analysis is orthogonal to other prioritization techniques known from literature, such as severity levels and statistical analysis to reduce false positives. We evaluate feasibility of our technique using a number of case studies and assess the quality of our predictions by comparing them to actual values obtained by dynamic profiling. △ Less

Submitted 12 July, 2006; originally announced July 2006.

Comments: 14 pages

Report number: TUD-SERG-2006-001

arXiv:cs/0607006 [pdf]

Applying and Combining Three Different Aspect Mining Techniques

Authors: Mariano Ceccato, Marius Marin, Kim Mens, Leon Moonen, Paolo Tonella, Tom Tourwe

Abstract: Understanding a software system at source-code level requires understanding the different concerns that it addresses, which in turn requires a way to identify these concerns in the source code. Whereas some concerns are explicitly represented by program entities (like classes, methods and variables) and thus are easy to identify, crosscutting concerns are not captured by a single program entity… ▽ More Understanding a software system at source-code level requires understanding the different concerns that it addresses, which in turn requires a way to identify these concerns in the source code. Whereas some concerns are explicitly represented by program entities (like classes, methods and variables) and thus are easy to identify, crosscutting concerns are not captured by a single program entity but are scattered over many program entities and are tangled with the other concerns. Because of their crosscutting nature, such crosscutting concerns are difficult to identify, and reduce the understandability of the system as a whole. In this paper, we report on a combined experiment in which we try to identify crosscutting concerns in the JHotDraw framework automatically. We first apply three independently developed aspect mining techniques to JHotDraw and evaluate and compare their results. Based on this analysis, we present three interesting combinations of these three techniques, and show how these combinations provide a more complete coverage of the detected concerns as compared to the original techniques individually. Our results are a first step towards improving the understandability of a system that contains crosscutting concerns, and can be used as a basis for refactoring the identified crosscutting concerns into aspects. △ Less

Submitted 2 July, 2006; originally announced July 2006.

Comments: 28 pages

Report number: TUD-SERG-2006-002

arXiv:cs/0606113 [pdf]

doi 10.1109/WCRE.2006.6

A common framework for aspect mining based on crosscutting concern sorts

Authors: Marius Marin, Leon Moonen, Arie van Deursen

Abstract: The increasing number of aspect mining techniques proposed in literature calls for a methodological way of comparing and combining them in order to assess, and improve on, their quality. This paper addresses this situation by proposing a common framework based on crosscutting concern sorts which allows for consistent assessment, comparison and combination of aspect mining techniques. The framewo… ▽ More The increasing number of aspect mining techniques proposed in literature calls for a methodological way of comparing and combining them in order to assess, and improve on, their quality. This paper addresses this situation by proposing a common framework based on crosscutting concern sorts which allows for consistent assessment, comparison and combination of aspect mining techniques. The framework identifies a set of requirements that ensure homogeneity in formulating the mining goals, presenting the results and assessing their quality. We demonstrate feasibility of the approach by retrofitting an existing aspect mining technique to the framework, and by using it to design and implement two new mining techniques. We apply the three techniques to a known aspect mining benchmark and show how they can be consistently assessed and combined to increase the quality of the results. The techniques and combinations are implemented in FINT, our publicly available free aspect mining tool. △ Less

Submitted 27 June, 2006; originally announced June 2006.

Comments: 14 pages

Report number: TUD-SERG-2006-009

Journal ref: Proceedings Working Conference on Reverse Engineering (WCRE), IEEE Computer Society, 2006, pages 29-38

arXiv:cs/0503015 [pdf, ps, other]

A Systematic Aspect-Oriented Refactoring and Testing Strategy, and its Application to JHotDraw

Authors: Arie van Deursen, Marius Marin, Leon Moonen

Abstract: Aspect oriented programming aims at achieving better modularization for a system's crosscutting concerns in order to improve its key quality attributes, such as evolvability and reusability. Consequently, the adoption of aspect-oriented techniques in existing (legacy) software systems is of interest to remediate software aging. The refactoring of existing systems to employ aspect-orientation wil… ▽ More Aspect oriented programming aims at achieving better modularization for a system's crosscutting concerns in order to improve its key quality attributes, such as evolvability and reusability. Consequently, the adoption of aspect-oriented techniques in existing (legacy) software systems is of interest to remediate software aging. The refactoring of existing systems to employ aspect-orientation will be considerably eased by a systematic approach that will ensure a safe and consistent migration. In this paper, we propose a refactoring and testing strategy that supports such an approach and consider issues of behavior conservation and (incremental) integration of the aspect-oriented solution with the original system. The strategy is applied to the JHotDraw open source project and illustrated on a group of selected concerns. Finally, we abstract from the case study and present a number of generic refactorings which contribute to an incremental aspect-oriented refactoring process and associate particular types of crosscutting concerns to the model and features of the employed aspect language. The contributions of this paper are both in the area of supporting migration towards aspect-oriented solutions and supporting the development of aspect languages that are better suited for such migrations. △ Less

Submitted 5 March, 2005; originally announced March 2005.

Comments: 25 pages

ACM Class: D.2.7; D.2.5; D.1.5

Showing 1–29 of 29 results for author: Moonen, L