-
The Impact of Mutability on Cyclomatic Complexity in Java
Authors:
Marat Bagaev,
Alisa Khabibrakhmanova,
Georgy Sabaev,
Yegor Bugayenko
Abstract:
In Java, some object attributes are mutable, while others are immutable (with the "final" modifier attached to them). Objects that have at least one mutable attribute may be referred to as "mutable" objects. We suspect that mutable objects have higher McCabe's Cyclomatic Complexity (CC) than immutable ones. To validate this intuition, we analysed 862,446 Java files from 1,000 open GitHub repositories. Our results demonstrated that immutable objects are almost three times less complex than mutable ones. It can therefore be assumed that using more immutable classes could reduce the overall complexity and improve the maintainability of the code base.
Submitted 14 October, 2024;
originally announced October 2024.
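The distinction drawn in the abstract above can be illustrated with a minimal Java sketch. The class names `ImmutablePoint` and `MutablePoint` are hypothetical and are not taken from the paper or its dataset; the sketch only shows what "all attributes final" versus "at least one non-final attribute" looks like in code.

```java
// Hypothetical classes for illustration only; not from the paper's dataset.

// Immutable: every attribute is final, so the state is fixed at construction.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Returns a new object instead of changing this one.
    ImmutablePoint movedBy(int dx, int dy) {
        return new ImmutablePoint(this.x + dx, this.y + dy);
    }

    int x() { return this.x; }
    int y() { return this.y; }
}

// Mutable under the abstract's definition: one attribute (x) is not final.
final class MutablePoint {
    private int x;
    private final int y;

    MutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Changes the state of this object in place.
    void moveBy(int dx) {
        this.x += dx;
    }

    int x() { return this.x; }
    int y() { return this.y; }
}

final class MutabilityDemo {
    public static void main(String[] args) {
        ImmutablePoint p = new ImmutablePoint(1, 2);
        ImmutablePoint q = p.movedBy(3, 4);
        System.out.println(p.x() + " " + q.x());

        MutablePoint m = new MutablePoint(1, 2);
        m.moveBy(3);
        System.out.println(m.x());
    }
}
```

Note that `MutablePoint` counts as mutable even though its `y` attribute is final: one non-final attribute is enough.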
-
Evaluating the Dependency Between Cyclomatic Complexity and Response For Class
Authors:
Maxim Stavtsev,
Yegor Bugayenko
Abstract:
In object-oriented programming, it is reasonable to hypothesize that smaller classes with fewer methods are less complex. Should this hypothesis hold true, it would be advisable for programmers to design classes with fewer methods, as complexity significantly contributes to poor maintainability. To test this assumption, we analyzed 862,517 Java classes from 1,000 open GitHub repositories. Our findings indicate a strong Pearson correlation of 0.79 between the cumulative McCabe's Cyclomatic Complexity (CC) of all class methods and the number of methods, a metric known as Response for Class (RFC).
Submitted 8 October, 2024;
originally announced October 2024.
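For readers unfamiliar with the statistic reported in the abstract above, the following is a minimal sketch of how a Pearson correlation coefficient between two per-class measurements (for example, number of methods and cumulative CC) can be computed. The class name `Pearson` and the sample values are assumptions made for illustration; they are not the authors' tooling or data.

```java
// Hypothetical helper for illustration; not the authors' implementation.
final class Pearson {
    private Pearson() { }

    // Pearson correlation coefficient of two equal-length samples.
    // Returns NaN when either sample has zero variance.
    static double correlation(double[] xs, double[] ys) {
        if (xs.length != ys.length || xs.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        int n = xs.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += xs[i];
            meanY += ys[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0.0;   // sum of products of deviations
        double varX = 0.0;  // sum of squared deviations of xs
        double varY = 0.0;  // sum of squared deviations of ys
        for (int i = 0; i < n; i++) {
            double dx = xs[i] - meanX;
            double dy = ys[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // Made-up numbers: methods per class and cumulative CC per class.
        double[] methods = {3, 5, 8, 12, 20};
        double[] cumulativeCc = {4, 9, 10, 25, 31};
        System.out.println(correlation(methods, cumulativeCc));
    }
}
```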
-
Embracing Objects Over Statics: An Analysis of Method Preferences in Open Source Java Frameworks
Authors:
Vladimir Zakharov,
Yegor Bugayenko
Abstract:
In today's software development landscape, the extent to which Java applications utilize the object-oriented programming paradigm remains a subject of interest. Because some studies point to the considerable overhead associated with object orientation, one might logically assume that modern Java applications would lean towards a procedural style to boost performance, favoring static over instance method calls. To validate this assumption, this study scrutinizes the runtime behavior of 28 open-source Java frameworks using the YourKit profiler. Contrary to expectations, our findings reveal a predominant use of instance methods and constructors over static methods. This suggests that developers still favor an object-oriented approach, despite its potential drawbacks.
Submitted 7 October, 2024;
originally announced October 2024.
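The contrast between static and instance method calls discussed in the abstract above can be sketched as follows. Both class names, `TextUtils` and `UpperText`, are hypothetical and chosen only for illustration; the sketch shows the two call styles, not anything measured in the study.

```java
import java.util.Locale;

// Procedural style: behaviour exposed as a static method on a utility class.
final class TextUtils {
    private TextUtils() { }

    static String trimmedUpper(String text) {
        return text.trim().toUpperCase(Locale.ROOT);
    }
}

// Object-oriented style: the same behaviour behind a constructor
// and an instance method.
final class UpperText {
    private final String origin;

    UpperText(String origin) {
        this.origin = origin;
    }

    String value() {
        return this.origin.trim().toUpperCase(Locale.ROOT);
    }
}

final class CallStyleDemo {
    public static void main(String[] args) {
        // One static call, no object created by the caller.
        System.out.println(TextUtils.trimmedUpper("  hello "));
        // One constructor call plus one instance method call.
        System.out.println(new UpperText("  hello ").value());
    }
}
```

A profiler that counts invocations would record the first line of `main` as a static call and the second as a constructor call followed by an instance call, which is the kind of distinction the study reports on.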
-
Java Classes with "-Er" and "-Utils" Suffixes Have Higher Complexity
Authors:
Anna Sukhova,
Alexey Akhundov,
Efim Verzakov,
Yegor Bugayenko
Abstract:
In object-oriented programming languages, a belief exists that classes with -Er/-Or and -Utils suffixes are "code smells" because they take over a lot of functional responsibility, turning out to be bulky and complicated, and therefore making the code more difficult to maintain. To validate this intuition, we analyzed the complexity and cohesion of 13,861 Java classes from 212 unique open-source GitHub repositories. We found that the average values of the Cyclomatic Complexity and Cognitive Complexity metrics are at least 2.5 times higher when these suffixes are present.
Submitted 26 March, 2024;
originally announced March 2024.
-
Programmers Prefer Individually Assigned Tasks vs. Shared Responsibility
Authors:
Adela Krylova,
Roman Makarov,
Sergei Pasynkov,
Yegor Bugayenko
Abstract:
In traditional management, tasks are typically assigned to individuals, with each worker taking full responsibility for the success or failure of a task. In contrast, modern Agile, Lean, and eXtreme Programming practices advocate for shared responsibility, where an entire group is accountable for the outcome of a project or task. Despite numerous studies in other domains, the preferences of programmers have not been thoroughly analyzed. To address this gap, we conducted a survey featuring seven situational questions and collected the opinions of 120 software development practitioners. Our findings reveal that programmers prefer tasks to be assigned to them on an individual basis and appreciate taking personal responsibility for failures, as well as receiving individual rewards for successes. Understanding these preferences is crucial for project managers aiming to optimize team dynamics and ensure the successful completion of software projects.
Submitted 22 March, 2024;
originally announced March 2024.
-
Developers' Perception: Fixed Bugs Often Overlooked as Quality Contributions
Authors:
Vitaly Alifanov,
Kamil Almetov,
Ivan Kornienko,
Arsen Mutalapov,
Yegor Bugayenko
Abstract:
High-quality software products rely on both well-written source code and timely detection and thorough reporting of bugs. However, some programmers view bug reports as negative assessments of their work, leading them to withhold reporting bugs, thereby detrimentally impacting projects. Through a survey of 102 programmers, we discovered that only a third of them perceive the quantity of bugs found and rectified in a repository as indicative of higher quality. This finding substantiates the notion that programmers often misinterpret the significance of testing and bug reporting.
Submitted 16 March, 2024;
originally announced March 2024.
-
CAM: A Collection of Snapshots of GitHub Java Repositories Together with Metrics
Authors:
Yegor Bugayenko
Abstract:
Even though numerous researchers require stable datasets along with source code and basic metrics calculated on them, neither GitHub nor any other code hosting platform provides such a resource. Consequently, each researcher must download their own data, compute the necessary metrics, and then publish the dataset somewhere to ensure it remains accessible indefinitely. Our CAM (which stands for "Classes and Metrics") project addresses this need. It is open-source software capable of cloning Java repositories from GitHub, filtering out unnecessary files, parsing Java classes, and computing metrics such as Cyclomatic Complexity, Halstead Effort and Volume, C&K metrics, Maintainability Metrics, LCOM5 and HND, as well as some Git-based Metrics. At least once a year, we execute the entire script, a process which requires a minimum of ten days on a very powerful server, to generate a new dataset. Subsequently, we publish it on Amazon S3, thereby ensuring its availability as a reference for researchers. The latest archive of 2.2 GB, which we published on the 2nd of March, 2024, includes 532K Java classes with 48 metrics for each class.
Submitted 13 March, 2024;
originally announced March 2024.
-
Heap vs. Stack: Analyzing Memory Allocations in C and C++ Open Source Software
Authors:
Roman Korostinskiy,
Eugene Darashkevich,
Roman Rusyaev,
Yegor Bugayenko
Abstract:
In C++, objects can be allocated in static memory, on the stack, or on the heap, the latter being significantly more performance-costly than the former options. We hypothesized that programmers, particularly those involved in widely-used open-source projects, would be conscious of these performance costs and consequently avoid heap allocations. To test this hypothesis, we compiled and executed 797 automated tests across 13 C and 10 C++ open GitHub projects, measuring their heap allocations with Valgrind and stack allocations using DynamoRIO instrumentation. Our findings showed a wide variation in heap allocations, ranging from 0 to 99% with an average of 9.26%. We also found that C++ programs use the heap less frequently than C programs. Contrary to our initial intuition, this suggests that heap allocations are actively employed in both C and C++ programs. Determining the prevalence of objects in these allocations remains a topic for future research.
Submitted 7 October, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
An experience in automatically extracting CAPAs from code repositories
Authors:
Yegor Bugayenko,
Imre Delgado,
Firas Jolha,
Zamira Kholmatova,
Artem Kruglov,
Witold Pedrycz,
Giancarlo Succi,
Xavier Vasquez
Abstract:
TOM (which stands for Theoretically Objective Measurements of Software Development Projects) is a set of services that help developers or teams identify anomalies within their software development process and provide a list of preventive or corrective actions (aka CAPAs) that positively impact the process, thereby improving the quality of the final product and its development process. To get help from TOM, it is enough to add our bot (@0capa) to the list of collaborators in your repository; the bot then automatically obtains different metrics from the repository and suggests actions to take so that the identified anomalies are not repeated in your future updates. This paper presents the underlying research on this idea.
Submitted 19 December, 2022;
originally announced December 2022.
-
On the Origins of Objects by Means of Careful Selection
Authors:
Yegor Bugayenko
Abstract:
We introduce a taxonomy of objects for the EO programming language. This taxonomy is designed with a few principles in mind: non-redundancy, simplicity, and so on. The taxonomy is supposed to be used as a navigation map by EO programmers. It may also be helpful as a guideline for designers of other object-oriented languages or libraries for them.
Submitted 10 April, 2024; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Reducing Programs to Objects
Authors:
Yegor Bugayenko
Abstract:
C++, Java, C#, Python, Ruby, and JavaScript are the most powerful object-oriented programming languages, if language power is defined as the number of features available to a programmer. EO, on the other hand, is an object-oriented programming language with a reduced set of features: it has nothing but objects and mechanisms of their composition and decoration. We are trying to answer the following research question: "Which known features are possible to implement using only objects?"
Submitted 27 October, 2023; v1 submitted 17 December, 2021;
originally announced December 2021.
-
EOLANG and $\varphi$-calculus
Authors:
Yegor Bugayenko
Abstract:
Object-oriented programming (OOP) is one of the most popular paradigms used for building software systems. However, despite its industrial and academic popularity, OOP is still missing a formal apparatus similar to $λ$-calculus, which functional programming is based on. There have been a number of attempts to formalize OOP, but none of them has managed to cover all the features available in modern OO programming languages, such as C++ or Java. We have made yet another attempt and created $\varphi$-calculus. We also created EOLANG (also called EO), an experimental programming language based on $\varphi$-calculus.
Submitted 1 March, 2024; v1 submitted 26 November, 2021;
originally announced November 2021.