Search | arXiv e-print repository

Learning from Mistakes: Understanding Ad-hoc Logs through Analyzing Accidental Commits

Authors: Yi-Hung Chou, Yiyang Min, April Yi Wang, James A. Jones

Abstract: Developers often insert temporary "print" or "log" instructions into their code to help them better understand runtime behavior, usually when the code is not behaving as they expected. Despite the fact that such monitoring instructions, or "ad-hoc logs," are so commonly used by developers, there is almost no existing literature that studies developers' practices in how they use them. This paucity of knowledge of the use of these ephemeral logs may be largely due to the fact that they typically only exist in the developers' local environments and are removed before they commit their code to their revision control system. In this work, we overcome this challenge by observing that developers occasionally mistakenly forget to remove such instructions before committing, and then they remove them shortly later. Additionally, we further study such developer logging practices by watching and analyzing live-streamed coding videos. Through these empirical approaches, we study where, how, and why developers use ad-hoc logs to better understand their code and its execution. We collect 27 GB of accidental commits that removed 548,880 ad-hoc logs in JavaScript from GitHub Archive repositories to provide the first large-scale dataset and empirical studies on ad-hoc logging practices. Our results reveal several illuminating findings, including a particular propensity for developers to use ad-hoc logs in asynchronous and callback functions. Our findings provide both empirical evidence and a valuable dataset for researchers and tool developers seeking to enhance ad-hoc logging practices, and potentially deepen our understanding of developers' practices towards understanding of software's runtime behaviors.

Submitted 17 April, 2025; v1 submitted 16 January, 2025; originally announced January 2025.

Comments: Accepted at MSR 2025

arXiv:2411.09846 [pdf, other]

Leveraging Propagated Infection to Crossfire Mutants

Authors: Hang Du, Vijay Krishna Palepu, James A. Jones

Abstract: Mutation testing was proposed to identify weaknesses in test suites by repeatedly generating artificially faulty versions of the software (mutants) and determining if the test suite is sufficient to detect them (kill them). When the tests are insufficient, each surviving mutant provides an opportunity to improve the test suite. We conducted a study and found that many such surviving mutants (up to 84% for the subjects of our study) are detectable by simply augmenting existing tests with additional assertions, or assertion amplification. Moreover, we find that many of these mutants are detectable by multiple existing tests, giving developers options for how to detect them. To help with these challenges, we created a technique that performs memory-state analysis to identify candidate assertions that developers can use to detect the surviving mutants. Additionally, we build upon prior research that identifies "crossfiring" opportunities -- tests that coincidentally kill multiple mutants. To this end, we developed a theoretical model that describes the varying granularities at which crossfiring can occur in the existing test suite, which provide opportunities and options for how to kill surviving mutants. We operationalize this model in an accompanying technique that optimizes the assertion amplification of the existing tests to crossfire multiple mutants with fewer added assertions, optionally concentrated within fewer tests. Our experiments show that we can kill all surviving mutants that are detectable with existing test data with only 1.1% of the identified assertion candidates, while increasing by a factor of 6, on average, the number of mutants killed by amplified tests, compared with tests that do not crossfire.

Submitted 14 November, 2024; originally announced November 2024.

Comments: Accepted at ICSE '25

ACM Class: D.2.5

arXiv:1908.08196 [pdf, other]

Unveiling Elite Developers' Activities in Open Source Projects

Authors: Zhendong Wang, Yang Feng, Yi Wang, James A. Jones, David Redmiles

Abstract: Open-source developers, particularly the elite developers, maintain a diverse portfolio of contributing activities. They not only commit source code but also spend a significant amount of effort on other communicative, organizational, and supportive activities. However, almost all prior research focuses on a limited number of specific activities and fails to analyze elite developers' activities in a comprehensive way. To bridge this gap, we conduct an empirical study with fine-grained event data from 20 large open-source projects hosted on GitHub. Thus, we investigate elite developers' contributing activities and their impacts on project outcomes. Our analyses reveal three key findings: (1) they participate in a variety of activities, while technical contributions (e.g., coding) account for only a small proportion; (2) they tend to put more effort into supportive and communicative activities and less effort into coding as the project grows; (3) their participation in non-technical activities is negatively associated with the project's outcomes in terms of productivity and software quality. These results provide a panoramic view of elite developers' activities and can inform an individual's decision making about effort allocation, thus leading to finer project outcomes. The results also provide implications for supporting these elite developers.

Submitted 13 November, 2019; v1 submitted 22 August, 2019; originally announced August 2019.

Showing 1–3 of 3 results for author: Jones, J A