Evaluating Source Code Quality with Large Language Models: a comparative study
Authors:
Igor Regis da Silva Simões,
Elaine Venson
Abstract:
Code quality is an attribute composed of various metrics, such as complexity, readability, testability, interoperability, reusability, and the use of good or bad practices, among others. Static code analysis tools aim to measure a set of attributes to assess code quality. However, some quality attributes, readability being one example, can only be measured by humans in code review activities. Given their natural language processing capability, we hypothesize that a Large Language Model (LLM) could evaluate the quality of code, including attributes that currently cannot be automated. This paper describes and analyzes the results obtained using LLMs as a static analysis tool to evaluate the overall quality of code. We compared the LLM's results with those obtained from SonarQube and its Maintainability metric for two Open Source Software (OSS) Java projects, one with Maintainability Rating A and the other with Rating B. A total of 1,641 classes were analyzed, comparing the results of two model versions: GPT 3.5 Turbo and GPT 4o. We demonstrated that GPT 3.5 Turbo is able to evaluate code quality, showing a correlation with Sonar's metrics. However, what the LLM measures differs from what SonarQube measures in specific aspects. GPT 4o did not produce the same results: it diverged from both the previous model and Sonar by assigning high ratings to code that had been assessed as lower quality. This study demonstrates the potential of LLMs in evaluating code quality. However, further research is necessary to investigate limitations such as LLM cost and output variability, and to explore quality characteristics not measured by traditional static analysis tools.
Submitted 22 September, 2024; v1 submitted 7 August, 2024;
originally announced August 2024.
Bridging Theory to Practice in Software Testing Teaching through Team-based Learning (TBL) and Open Source Software (OSS) Contribution
Authors:
Elaine Venson,
Reem Alfayez
Abstract:
Curricular recommendations for undergraduate Software Engineering courses underscore the importance of moving beyond the traditional lecture format to actively involve students in time-limited, iterative development practices. This paper presents a teaching approach for a software testing course that integrates theory and practical experience through both Team-based Learning (TBL) and active contributions to OSS projects. The paper reports on our experience implementing this pedagogical approach over four consecutive semesters of a Software Testing course within an undergraduate Software Engineering program. The experience encompassed both online and in-person classes and involved more than 300 students in total. Students' perceptions of the course are analyzed and compared with previous, related studies. Our results align with the existing literature on software engineering teaching, confirming the effectiveness of combining TBL with OSS contributions. Additionally, our survey sheds light on the challenges that students encounter during their first contribution to OSS projects, highlighting the need for targeted solutions. Overall, the experience demonstrates that the proposed pedagogical structure can effectively facilitate the transition from theoretical knowledge to real-world practice in the domain of Software Testing.
Submitted 16 April, 2024;
originally announced April 2024.