-
A Large Scale Survey of Motivation in Software Development and Analysis of its Validity
Authors:
Idan Amit,
Dror G. Feitelson
Abstract:
Context: Motivation is known to improve performance. In software development in particular, there has been considerable interest in the motivation of contributors to open source. Objective: We identify 11 motivators from the literature (enjoying programming, ownership of code, learning, self use, etc.), and evaluate their relative effect on motivation. Since motivation is an internal subjective fe…
▽ More
Context: Motivation is known to improve performance. In software development in particular, there has been considerable interest in the motivation of contributors to open source. Objective: We identify 11 motivators from the literature (enjoying programming, ownership of code, learning, self use, etc.), and evaluate their relative effect on motivation. Since motivation is an internal subjective feeling, we also analyze the validity of the answers. Method: We conducted a survey with 66 questions on motivation which was completed by 521 developers. Most of the questions used an 11 point scale. We evaluated the validity of the answers validity by comparing related questions, comparing to actual behavior on GitHub, and comparison with the same developer in a follow up survey. Results: Validity problems include moderate correlations between answers to related questions, as well as self promotion and mistakes in the answers. Despite these problems, predictive analysis, investigating how diverse motivators influence the probability of high motivation, provided valuable insights. The correlations between the different motivators are low, implying their independence. High values in all 11 motivators predict increased probability of high motivation. In addition, improvement analysis shows that an increase in most motivators predicts an increase in general motivation.
△ Less
Submitted 12 April, 2024;
originally announced April 2024.
-
Reproducing, Extending, and Analyzing Naming Experiments
Authors:
Rachel Alpern,
Ido Lazer,
Issar Tzachor,
Hanit Hakim,
Sapir Weissbuch,
Dror G. Feitelson
Abstract:
Naming is very important in software development, as names are often the only vehicle of meaning about what the code is intended to do. A recent study on how developers choose names collected the names given by different developers for the same objects. This enabled a study of these names' diversity and structure, and the construction of a model of how names are created. We reproduce different par…
▽ More
Naming is very important in software development, as names are often the only vehicle of meaning about what the code is intended to do. A recent study on how developers choose names collected the names given by different developers for the same objects. This enabled a study of these names' diversity and structure, and the construction of a model of how names are created. We reproduce different parts of this study in three independent experiments. Importantly, we employ methodological variations rather than striving of an exact replication. When the same results are obtained this then boosts our confidence in their validity by demonstrating that they do not depend on the methodology.
Our results indeed corroborate those of the original study in terms of the diversity of names, the low probability of two developers choosing the same name, and the finding that experienced developers tend to use slightly longer names than inexperienced students. We explain name diversity by performing a new analysis of the names, classifying the concepts represented in them as universal (agreed upon), alternative (reflecting divergent views on a topic), or optional (reflecting divergent opinions on whether to include this concept at all). This classification enables new research directions concerning the considerations involved in naming decisions. We also show that explicitly using the model proposed in the original study to guide naming leads to the creation of better names, whereas the simpler approach of just asking participants to use longer and more detailed names does not.
△ Less
Submitted 15 February, 2024;
originally announced February 2024.
-
The Paradox of Function Header Comments
Authors:
Arthur Oxenhorn,
Almog Mor,
Uri Stern,
Dror G. Feitelson
Abstract:
Perhaps the most widely used form of code documentation is function header comments. We performed a large-scale survey of 367 developers to catalog their expectations from such documentation and to chronicle actual practice. Paradoxically, we found that developers appreciate the value of header comments and estimate that they are worth the investment in time, but nevertheless they tend not to writ…
▽ More
Perhaps the most widely used form of code documentation is function header comments. We performed a large-scale survey of 367 developers to catalog their expectations from such documentation and to chronicle actual practice. Paradoxically, we found that developers appreciate the value of header comments and estimate that they are worth the investment in time, but nevertheless they tend not to write such documentation in their own code. Reasons for not writing header comments vary from the belief that code should be self-documenting to concern that documentation will not be kept up-to-date. A possible outcome of this situation is that developers may evade requirements to write documentation by using templates to generate worthless comments that do not provide any real information. We define a simple metric for information-less documentation based on its similarity to the function signature. Applying this to 21,140 files in GitHub Python projects shows that most functions are undocumented, but when header comments are written they typically do contain additional information beyond the function signature.
△ Less
Submitted 15 January, 2024;
originally announced January 2024.
-
When Are Names Similar Or the Same? Introducing the Code Names Matcher Library
Authors:
Moshe Munk,
Dror G. Feitelson
Abstract:
Program code contains functions, variables, and data structures that are represented by names. To promote human understanding, these names should describe the role and use of the code elements they represent. But the names given by developers show high variability, reflecting the tastes of each developer, with different words used for the same meaning or the same words used for different meanings.…
▽ More
Program code contains functions, variables, and data structures that are represented by names. To promote human understanding, these names should describe the role and use of the code elements they represent. But the names given by developers show high variability, reflecting the tastes of each developer, with different words used for the same meaning or the same words used for different meanings. This makes comparing names hard. A precise comparison should be based on matching identical words, but also take into account possible variations on the words (including spelling and typing errors), reordering of the words, matching between synonyms, and so on. To facilitate this we developed a library of comparison functions specifically targeted to comparing names in code. The different functions calculate the similarity between names in different ways, so a researcher can choose the one appropriate for his specific needs. All of them share an attempt to reflect human perceptions of similarity, at the possible expense of lexical matching.
△ Less
Submitted 7 September, 2022;
originally announced September 2022.
-
How Developers Extract Functions: An Experiment
Authors:
Alexey Braver,
Dror G. Feitelson
Abstract:
Creating functions is at the center of writing computer programs. But there has been little empirical research on how this is done and what are the considerations that developers use. We design an experiment in which we can compare the decisions made by multiple developers under exactly the same conditions. The experiment is based on taking existing production code, "flattening" it into a single m…
▽ More
Creating functions is at the center of writing computer programs. But there has been little empirical research on how this is done and what are the considerations that developers use. We design an experiment in which we can compare the decisions made by multiple developers under exactly the same conditions. The experiment is based on taking existing production code, "flattening" it into a single monolithic function, and then charging developers with the task of refactoring it to achieve a better design by extracting functions. The results indicate that developers tend to extract functions based on structural cues, such as 'if' or 'try' blocks. And while there are significant correlations between the refactorings performed by different developers, there are also significant differences in the magnitude of refactoring done. For example, the number of functions that were extracted was between 3 and 10, and the amount of code extracted into functions ranged from 37% to 95%.
△ Less
Submitted 2 September, 2022;
originally announced September 2022.
-
"We do not appreciate being experimented on": Developer and Researcher Views on the Ethics of Experiments on Open-Source Projects
Authors:
Dror G. Feitelson
Abstract:
A tenet of open source software development is to accept contributions from users-developers (typically after appropriate vetting). But should this also include interventions done as part of research on open source development? Following an incident in which buggy code was submitted to the Linux kernel to see whether it would be caught, we conduct a survey among open source developers and empirica…
▽ More
A tenet of open source software development is to accept contributions from users-developers (typically after appropriate vetting). But should this also include interventions done as part of research on open source development? Following an incident in which buggy code was submitted to the Linux kernel to see whether it would be caught, we conduct a survey among open source developers and empirical software engineering researchers to see what behaviors they think are acceptable. This covers two main issues: the use of publicly accessible information, and conducting active experimentation. The survey had 224 respondents. The results indicate that open-source developers are largely open to research, provided it is done transparently. In other words, many would agree to experiments on open-source projects if the subjects were notified and provided informed consent, and in special cases also if only the project leaders agree. While researchers generally hold similar opinions, they sometimes fail to appreciate certain nuances that are important to developers. Examples include observing license restrictions on publishing open-source code and safeguarding the code. Conversely, researchers seem to be more concerned than developers about privacy issues. Based on these results, it is recommended that open source repositories and projects address use for research in their access guidelines, and that researchers take care to ask permission also when not formally required to do so. We note too that the open source community wants to be heard, so professional societies and IRBs should consult with them when formulating ethics codes.
△ Less
Submitted 2 June, 2023; v1 submitted 25 December, 2021;
originally announced December 2021.
-
Use and Perceptions of Multi-Monitor Workstations: A Natural Experiment
Authors:
Guy Amir,
Ayala Prusak,
Tal Reiss,
Nir Zabari,
Dror G. Feitelson
Abstract:
Using multiple monitors is commonly thought to improve productivity, but this is hard to check experimentally. We use a survey, taken by 101 practitioners of which 80% have coded professionally for at least 2 years, to assess subjective perspectives based on experience. To improve validity, we compare situations in which developers naturally use different setups -- the difference between working a…
▽ More
Using multiple monitors is commonly thought to improve productivity, but this is hard to check experimentally. We use a survey, taken by 101 practitioners of which 80% have coded professionally for at least 2 years, to assess subjective perspectives based on experience. To improve validity, we compare situations in which developers naturally use different setups -- the difference between working at home or at the office, and how things changed when developers were forced to work from home due to the Covid-19 pandemic. The results indicate that using multiple monitors is indeed perceived as beneficial and desirable. 19% of the respondents reported adding a monitor to their home setup in response to the Covid-19 situation. At the same time, the single most influential factor cited as affecting productivity was not the physical setup but interactions with co-workers -- both reduced productivity due to lack of connections available at work, and improved productivity due to reduced interruptions from co-workers. A central implication of our work is that empirical research on software development should be conducted in settings similar to those actually used by practitioners, and in particular using workstations configured with multiple monitors.
△ Less
Submitted 12 March, 2021;
originally announced March 2021.
-
Does Code Structure Affect Comprehension? On Using and Naming Intermediate Variables
Authors:
Roee Cates,
Nadav Yunik,
Dror G. Feitelson
Abstract:
Intermediate variables can be used to break complex expressions into more manageable smaller expressions, which may be easier to understand. But it is unclear when and whether this actually helps. We conducted an experiment in which subjects read 6 mathematical functions and were supposed to give them meaningful names. 113 subjects participated, of which 58% had 3 or more years of programming work…
▽ More
Intermediate variables can be used to break complex expressions into more manageable smaller expressions, which may be easier to understand. But it is unclear when and whether this actually helps. We conducted an experiment in which subjects read 6 mathematical functions and were supposed to give them meaningful names. 113 subjects participated, of which 58% had 3 or more years of programming work experience. Each function had 3 versions: using a compound expression, using intermediate variables with meaningless names, or using intermediate variables with meaningful names. The results were that in only one case there was a significant difference between the two extreme versions, in favor of the one with intermediate variables with meaningful names. This case was the function that was the hardest to understand to begin with. In two additional cases using intermediate variables with meaningless names appears to have caused a slight decrease in understanding. In all other cases the code structure did not make much of a difference. As it is hard to anticipate what others will find difficult to understand, the conclusion is that using intermediate variables is generally desirable. However, this recommendation hinges on giving them good names.
△ Less
Submitted 19 March, 2021;
originally announced March 2021.
-
Considerations and Pitfalls in Controlled Experiments on Code Comprehension
Authors:
Dror G. Feitelson
Abstract:
Understanding program code is a complicated endeavor. As such, myriad different factors can influence the outcome. Investigations of program comprehension, and in particular those using controlled experiments, have to take these factors into account. In order to promote the development and use of sound experimental methodology, we discuss potential problems with regard to the experimental subjects…
▽ More
Understanding program code is a complicated endeavor. As such, myriad different factors can influence the outcome. Investigations of program comprehension, and in particular those using controlled experiments, have to take these factors into account. In order to promote the development and use of sound experimental methodology, we discuss potential problems with regard to the experimental subjects, the code they work on, the tasks they are asked to perform, and the metrics for their performance.
△ Less
Submitted 15 March, 2021;
originally announced March 2021.
-
Using Non-Verbal Expressions as a Tool in Naming Research
Authors:
Omer Regev,
Michael Soloveitchik,
Dror G. Feitelson
Abstract:
Variable and function names are extremely important for program comprehension. It is therefore also important to study how developers select names. But controlled experiments on naming are hindered by the need to describe to experimental subjects what it is they need to name. Words appearing in these descriptions may then find their way into the names, leading to a bias in the results. We suggest…
▽ More
Variable and function names are extremely important for program comprehension. It is therefore also important to study how developers select names. But controlled experiments on naming are hindered by the need to describe to experimental subjects what it is they need to name. Words appearing in these descriptions may then find their way into the names, leading to a bias in the results. We suggest that this problem can be alleviated by using emojis or other small graphics in lieu of key words in the descriptions. A replication of previous work on naming, this time including such emojis and graphics, indeed led to a more diverse and less biased choice of words in the names than when using English descriptions.
△ Less
Submitted 15 March, 2021;
originally announced March 2021.
-
How Developers Choose Names
Authors:
Dror G. Feitelson,
Ayelet Mizrahi,
Nofar Noy,
Aviad Ben Shabat,
Or Eliyahu,
Roy Sheffer
Abstract:
The names of variables and functions serve as implicit documentation and are instrumental for program comprehension. But choosing good meaningful names is hard. We perform a sequence of experiments in which a total of 334 subjects are required to choose names in given programming scenarios. The first experiment shows that the probability that two developers would select the same name is low: in th…
▽ More
The names of variables and functions serve as implicit documentation and are instrumental for program comprehension. But choosing good meaningful names is hard. We perform a sequence of experiments in which a total of 334 subjects are required to choose names in given programming scenarios. The first experiment shows that the probability that two developers would select the same name is low: in the 47 instances in our experiments the median probability was only 6.9%. At the same time, given that a specific name is chosen, it is usually understood by the majority of developers. Analysis of the names given in the experiment suggests a model where naming is a (not necessarily cognizant or serial) three-step process: (1) selecting the concepts to include in the name, (2) choosing the words to represent each concept, and (3) constructing a name using these words. A followup experiment, using the same experimental setup, then checked whether using this model explicitly can improve the quality of names. The results were that names selected by subjects using the model were judged by two independent judges to be superior to names chosen in the original experiment by a ratio of two-to-one. Using the model appears to encourage the use of more concepts and longer names.
△ Less
Submitted 12 March, 2021;
originally announced March 2021.
-
Follow Your Nose -- Which Code Smells are Worth Chasing?
Authors:
Idan Amit,
Nili Ben Ezra,
Dror G. Feitelson
Abstract:
The common use case of code smells assumes causality: Identify a smell, remove it, and by doing so improve the code. We empirically investigate their fitness to this use. We present a list of properties that code smells should have if they indeed cause lower quality. We evaluated the smells in 31,687 Java files from 677 GitHub repositories, all the repositories with 200+ commits in 2019. We measur…
▽ More
The common use case of code smells assumes causality: Identify a smell, remove it, and by doing so improve the code. We empirically investigate their fitness to this use. We present a list of properties that code smells should have if they indeed cause lower quality. We evaluated the smells in 31,687 Java files from 677 GitHub repositories, all the repositories with 200+ commits in 2019. We measured the influence of smells on four metrics for quality, productivity, and bug detection efficiency. Out of 151 code smells computed by the CheckStyle smell detector, less than 20% were found to be potentially causal, and only a handful are rather robust. The strongest smells deal with simplicity, defensive programming, and abstraction. Files without the potentially causal smells are 50% more likely to be of high quality. Unfortunately, most smells are not removed, and developers tend to remove the easy ones and not the effective ones.
△ Less
Submitted 15 January, 2024; v1 submitted 2 March, 2021;
originally announced March 2021.
-
The Corrective Commit Probability Code Quality Metric
Authors:
Idan Amit,
Dror G. Feitelson
Abstract:
We present a code quality metric, Corrective Commit Probability (CCP), measuring the probability that a commit reflects corrective maintenance. We show that this metric agrees with developers' concept of quality, informative, and stable. Corrective commits are identified by applying a linguistic model to the commit messages. Corrective commits are identified by applying a linguistic model to the c…
▽ More
We present a code quality metric, Corrective Commit Probability (CCP), measuring the probability that a commit reflects corrective maintenance. We show that this metric agrees with developers' concept of quality, informative, and stable. Corrective commits are identified by applying a linguistic model to the commit messages. Corrective commits are identified by applying a linguistic model to the commit messages. We compute the CCP of all large active GitHub projects (7,557 projects with at least 200 commits in 2019). This leads to the creation of a quality scale, suggesting that the bottom 10% of quality projects spend at least 6 times more effort on fixing bugs than the top 10%. Analysis of project attributes shows that lower CCP (higher quality) is associated with smaller files, lower coupling, use of languages like JavaScript and C# as opposed to PHP and C++, fewer developers, lower developer churn, better onboarding, and better productivity. Among other things these results support the "Quality is Free" claim, and suggest that achieving higher quality need not require higher expenses.
△ Less
Submitted 21 July, 2020;
originally announced July 2020.
-
Using Students as Experimental Subjects in Software Engineering Research -- A Review and Discussion of the Evidence
Authors:
Dror G. Feitelson
Abstract:
Should students be used as experimental subjects in software engineering? Given that students are in many cases readily available and cheap it is no surprise that the vast majority of controlled experiments in software engineering use them. But they can be argued to constitute a convenience sample that may not represent the target population (typically "real" developers), especially in terms of ex…
▽ More
Should students be used as experimental subjects in software engineering? Given that students are in many cases readily available and cheap it is no surprise that the vast majority of controlled experiments in software engineering use them. But they can be argued to constitute a convenience sample that may not represent the target population (typically "real" developers), especially in terms of experience and proficiency. This causes many researchers (and reviewers) to have reservations about the external validity of student-based experiments, and claim that students should not be used. Based on an extensive review of published works that have compared students to professionals, we find that picking on "students" is counterproductive for two main reasons. First, classifying experimental subjects by their status is merely a proxy for more important and meaningful classifications, such as classifying them according to their abilities, and effort should be invested in defining and using these more meaningful classifications. Second, in many cases using students is perfectly reasonable, and student subjects can be used to obtain reliable results and further the research goals. In particular, this appears to be the case when the study involves basic programming and comprehension skills, when tools or methodologies that do not require an extensive learning curve are being compared, and in the initial formative stages of large industrial research initiatives -- in other words, in many of the cases that are suitable for controlled experiments of limited scope.
△ Less
Submitted 28 December, 2015;
originally announced December 2015.
-
No justified complaints: On fair sharing of multiple resources
Authors:
Danny Dolev,
Dror G. Feitelson,
Joseph Y. Halpern,
Raz Kupferman,
Nati Linial
Abstract:
Fair allocation has been studied intensively in both economics and computer science, and fair sharing of resources has aroused renewed interest with the advent of virtualization and cloud computing. Prior work has typically focused on mechanisms for fair sharing of a single resource. We provide a new definition for the simultaneous fair allocation of multiple continuously-divisible resources…
▽ More
Fair allocation has been studied intensively in both economics and computer science, and fair sharing of resources has aroused renewed interest with the advent of virtualization and cloud computing. Prior work has typically focused on mechanisms for fair sharing of a single resource. We provide a new definition for the simultaneous fair allocation of multiple continuously-divisible resources. Roughly speaking, we define fairness as the situation where every user either gets all the resources he wishes for, or else gets at least his entitlement on some bottleneck resource, and therefore cannot complain about not getting more. This definition has the same desirable properties as the recently suggested dominant resource fairness, and also handles the case of multiple bottlenecks. We then prove that a fair allocation according to this definition is guaranteed to exist for any combination of user requests and entitlements (where a user's relative use of the different resources is fixed). The proof, which uses tools from the theory of ordinary differential equations, is constructive and provides a method to compute the allocations numerically.
△ Less
Submitted 14 June, 2011;
originally announced June 2011.