Search | arXiv e-print repository

Towards Identifying Code Proficiency through the Analysis of Python Textbooks

Authors: Ruksit Rojpaisarnkit, Gregorio Robles, Raula Gaikovina Kula, Dong Wang, Chaiyong Ragkhitwetsagul, Jesus M. Gonzalez-Barahona, Kenichi Matsumoto

Abstract: Python, one of the most prevalent programming languages today, is widely utilized in various domains, including web development, data science, machine learning, and DevOps. Recent scholarly efforts have proposed a methodology to assess Python competence levels, similar to how proficiency in natural languages is evaluated. This method involves assigning levels of competence to Python constructs, for instance, placing simple 'print' statements at the most basic level and abstract base classes at the most advanced. The aim is to gauge the level of proficiency a developer must have to understand a piece of source code. This is particularly crucial for software maintenance and evolution tasks, such as debugging or adding new features. For example, in a code review process, this method could determine the competence level required for reviewers. However, categorizing Python constructs by proficiency levels poses significant challenges. Prior attempts, which relied heavily on expert opinions and developer surveys, have led to considerable discrepancies. In response, this paper presents a new approach to identifying Python competency levels through the systematic analysis of introductory Python programming textbooks. By comparing the sequence in which Python constructs are introduced in these textbooks with the current state of the art, we have uncovered notable discrepancies in the order of introduction of Python constructs. Our study underscores a misalignment in the sequences, demonstrating that pinpointing proficiency levels is not trivial. Insights from the study serve as pivotal steps toward reinforcing the idea that textbooks serve as a valuable source for evaluating developers' proficiency, particularly in terms of their ability to undertake maintenance and evolution tasks.

Submitted 5 August, 2024; originally announced August 2024.

Comments: 12 pages, 7 figures, 6 tables, ICSME2024

ACM Class: D.2.0; D.2.7

arXiv:2405.01565 [pdf, other]

The Role of Code Proficiency in the Era of Generative AI

Authors: Gregorio Robles, Christoph Treude, Jesus M. Gonzalez-Barahona, Raula Gaikovina Kula

Abstract: At the current pace of technological advancements, Generative AI models, including both Large Language Models and Large Multi-modal Models, are becoming integral to the developer workspace. However, challenges emerge due to the 'black box' nature of many of these models, where the processes behind their outputs are not transparent. This position paper advocates for a 'white box' approach to these generative models, emphasizing the necessity of transparency and understanding in AI-generated code to match the proficiency levels of human developers and better enable software maintenance and evolution. We outline a research agenda aimed at investigating the alignment between AI-generated code and developer skills, highlighting the importance of responsibility, security, legal compliance, creativity, and social value in software development. The proposed research questions explore the potential of white-box methodologies to ensure that software remains an inspectable, adaptable, and trustworthy asset in the face of rapid AI integration, setting a course for research that could shape the role of code proficiency into 2030 and beyond.

Submitted 8 April, 2024; originally announced May 2024.

Comments: submitted to Software Engineering 2030

arXiv:2404.09789 [pdf, ps, other]

doi 10.1145/3643796.3648457

Software development in the age of LLMs and XR

Authors: Jesus M. Gonzalez-Barahona

Abstract: Let's imagine that in a few years generative AI has changed software development dramatically, taking charge of most of the programming tasks. Let's also assume that extended reality devices became ubiquitous, being the preferred interface for interacting with computers. This paper proposes how this situation would impact IDEs, by exploring how the development process would be affected, and analyzing which tools would be needed for supporting developers.

Submitted 15 April, 2024; originally announced April 2024.

Journal ref: Proceedings of the First IDE Workshop (IDE'24), April 20, 2024, Lisbon, Portugal. ACM, New York, NY, USA

arXiv:2308.11258 [pdf, other]

doi 10.1007/s10664-023-10377-w

The Software Heritage License Dataset (2022 Edition)

Authors: Jesús M. González-Barahona, Sergio Montes-Leon, Gregorio Robles, Stefano Zacchiroli

Abstract: Context: When software is released publicly, it is common to include with it either the full text of the license or licenses under which it is published, or a detailed reference to them. Therefore public licenses, including FOSS (free, open source software) licenses, are usually publicly available in source code repositories. Objective: To compile a dataset containing as many documents as possible that contain the text of software licenses, or references to the license terms. Once compiled, characterize the dataset so that it can be used for further research, or practical purposes related to license analysis. Method: Retrieve from Software Heritage, the largest publicly available archive of FOSS source code, all versions of all files whose names are commonly used to convey licensing terms. All retrieved documents will be characterized in various ways, using automated and manual analyses. Results: The dataset consists of 6.9 million unique license files. Additional metadata about shipped license files is also provided, making the dataset ready to use in various contexts, including: file length measures, MIME type, SPDX license (detected using ScanCode), and oldest appearance. The results of a manual analysis of 8102 documents are also included, providing a ground truth for further analysis. The dataset is released as open data as an archive file containing all deduplicated license files, plus several portable CSV files with metadata, referencing files via cryptographic checksums. Conclusions: Thanks to the extensive coverage of Software Heritage, the dataset presented in this paper covers a very large fraction of all software licenses for public code. We have assembled a large body of software licenses, characterized it quantitatively and qualitatively, and validated that it is mostly composed of licensing information and includes almost all known license texts. The dataset can be used to conduct empirical studies on open source licensing, training of automated license classifiers, natural language processing (NLP) analyses of legal texts, as well as historical and phylogenetic studies on FOSS licensing. It can also be used in practice to improve tools detecting licenses in source code.

Submitted 22 August, 2023; originally announced August 2023.

Journal ref: Empirical Software Engineering, In press

arXiv:2203.15990 [pdf, other]

pycefr: Python Competency Level through Code Analysis

Authors: Gregorio Robles, Raula Gaikovina Kula, Chaiyong Ragkhitwetsagul, Tattiya Sakulniwat, Kenichi Matsumoto, Jesus M. Gonzalez-Barahona

Abstract: Python is known to be a versatile language, well suited both for beginners and advanced users. Some elements of the language are easier to understand than others: some are found in any kind of code, while some others are used only by experienced programmers. The use of these elements leads to different ways to code, depending on the experience with the language and the knowledge of its elements, the general programming competence and programming skills, etc. In this paper, we present pycefr, a tool that detects the use of the different elements of the Python language, effectively measuring the level of Python proficiency required to comprehend and deal with a fragment of Python code. Following the well-known Common European Framework of Reference for Languages (CEFR), widely used for natural languages, pycefr categorizes Python code in six levels, depending on the proficiency required to create and understand it. We also discuss different use cases for pycefr: identifying code snippets that can be understood by developers with a certain proficiency, labeling code examples in online resources such as Stack Overflow and GitHub to suit them to a certain level of competency, helping in the onboarding process of new developers in Open Source Software projects, etc. A video shows availability and usage of the tool: https://tinyurl.com/ypdt3fwe.

Submitted 29 March, 2022; originally announced March 2022.

Comments: Accepted at International Conference on Program Comprehension, 2022
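
The idea in the pycefr abstract, detecting language elements and reporting the proficiency needed to read a fragment, can be sketched with Python's standard `ast` module. This is a minimal illustration and not pycefr's implementation: the `NODE_LEVELS` table and the `required_level` function are hypothetical, and the level assigned to each construct is an assumption made only for this example.

```python
import ast

# The six CEFR levels, from most basic to most advanced.
CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]

# Hypothetical construct-to-level table (an assumption for illustration,
# not the assignments used by pycefr).
NODE_LEVELS = {
    ast.Assign: "A1",
    ast.If: "A1",
    ast.For: "A2",
    ast.FunctionDef: "B1",
    ast.ListComp: "B2",
    ast.Lambda: "B2",
    ast.ClassDef: "C1",
    ast.AsyncFunctionDef: "C2",
}


def required_level(source: str) -> str:
    """Return the highest level among the mapped constructs found in source.

    Code containing no mapped construct is reported at the lowest level.
    """
    highest = 0
    for node in ast.walk(ast.parse(source)):
        level = NODE_LEVELS.get(type(node))
        if level is not None:
            highest = max(highest, CEFR_ORDER.index(level))
    return CEFR_ORDER[highest]
```

Under this hypothetical table, `required_level("squares = [n * n for n in range(3)]")` reports the list comprehension's level rather than the assignment's, because the fragment is rated by its most demanding construct.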

arXiv:2203.09898 [pdf, other]

Development Effort Estimation in Free/Open Source Software from Activity in Version Control Systems

Authors: Gregorio Robles, Andrea Capiluppi, Jesus M. Gonzalez-Barahona, Bjorn Lundell, Jonas Gamalielsson

Abstract: Effort estimation models are a fundamental tool in software management, and used as a forecast for resources, constraints and costs associated to software development. For Free/Open Source Software (FOSS) projects, effort estimation is especially complex: professional developers work alongside occasional, volunteer developers, so the overall effort (in person-months) becomes non-trivial to determine. The objective of this work is to develop a simple effort estimation model for FOSS projects, based on the historic data of developers' effort. The model is fed with direct developer feedback to ensure its accuracy. After extracting the personal development profiles of several thousands of developers from 6 large FOSS projects, we asked them to fill in a questionnaire to determine if they should be considered as full-time developers in the project that they work in. Their feedback was used to fine-tune the value of an effort threshold, above which developers can be considered as full-time. With the help of the over 1,000 questionnaires received, we were able to determine, for every project in our sample, the threshold of commits that separates full-time from non-full-time developers. We finally offer guidelines and a tool to apply our model to FOSS projects that use a version control system.

Submitted 18 March, 2022; originally announced March 2022.
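
A commit threshold of the kind described in the abstract above could be applied roughly as follows. This is a sketch under stated assumptions, not the authors' tool: the function name `estimate_effort` is hypothetical, developers at or above the threshold are counted as one full person-period each, and the proportional credit given to developers below the threshold is an assumption that the abstract does not state.

```python
def estimate_effort(commits_by_dev: dict[str, int], threshold: int) -> float:
    """Estimate effort (in person-periods) for one period of activity.

    commits_by_dev maps a developer identifier to that developer's commit
    count in the period. threshold is the per-project commit count that
    separates full-time from non-full-time developers.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    effort = 0.0
    for commits in commits_by_dev.values():
        if commits >= threshold:
            # Treated as full-time: one full person-period.
            effort += 1.0
        else:
            # Assumption: occasional contributors get proportional credit.
            effort += commits / threshold
    return effort
```

With a threshold of 20 commits, `estimate_effort({"a": 40, "b": 5, "c": 10}, 20)` adds 1.0 for the full-time developer and 0.25 and 0.5 for the other two.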

arXiv:2109.13768 [pdf, other]

To VR or not to VR: Is virtual reality suitable to understand software development metrics?

Authors: David Moreno-Lumbreras, Gregorio Robles, Daniel Izquierdo-Cortázar, Jesus M. Gonzalez-Barahona

Abstract: Background/Context: Currently, the usual interface for visualizing data is based on 2-D screens. Recently, devices capable of visualizing data while immersed in VR scenes are becoming common. However, it has not been studied in detail to what extent these devices are suitable for interacting with data visualizations in the specific case of data about software development. Objective/Aim: In this registered report, we propose to answer the following question: "Is comprehension of software development processes, via the visualization of their metrics, better when presented in VR scenes than in 2D screens?" In particular, we will study if answers obtained after interacting with visualizations presented as VR scenes are more or less correct than those obtained from traditional screens, and if it takes more or less time to produce those answers. Method: We will run an experiment with volunteer subjects from several backgrounds. We will have two setups: an on-screen application, and a VR scene. Both will be designed to be as equivalent as possible in terms of the information they provide. For the former, we use a commercial-grade set of Kibana-based interactive dashboards that stakeholders currently use to get insights. For the latter, we use a set of visualizations similar to those in the on-screen case, prepared to provide the same set of data using the museum metaphor in a VR room. The field of analysis will be related to modern code review, in particular pull request activity. The subjects will try to answer some questions in both setups (some will work first in VR, some on-screen), which will be presented to them in random order. To draw results, we will compare and statistically analyze both the correctness of their answers, and the time spent until they are produced.

Submitted 28 September, 2021; originally announced September 2021.

Comments: ESEM Registered Reports track

arXiv:2107.10634 [pdf, other]

Factors determining maximum energy consumption of Bitcoin miners

Authors: Jesus M. Gonzalez-Barahona

Abstract: Background: During the last years, there has been a lot of discussion and estimations on the energy consumption of Bitcoin miners. However, most of the studies are focused on estimating energy consumption, not on exploring the factors that determine it. Goal: To explore the factors that determine maximum energy consumption of Bitcoin miners. In particular, analyze the limits of energy consumption, and to what extent variations of the factors could produce its reduction. Method: Estimate the overall profit of all Bitcoin miners during a certain period of time, and the costs (including energy) that they face during that time, because of the mining activity. The underlying assumption is that miners will only consume energy to mine Bitcoin if they have the expectation of profit, and at the same time they are competitive with respect to each other. Therefore, they will operate as a group in the point where profits balance expenditures. Results: We show a basic equation that determines energy consumption based on some specific factors: minting, transaction fees, exchange rate, energy price, and amortization cost. We also define the Amortization Factor, which can be computed for mining devices based on their cost and energy consumption, and helps to understand how the cost of equipment influences total energy consumption. Conclusions: The factors driving energy consumption are identified, and from them, some ways in which Bitcoin energy consumption could be reduced are discussed. Some of these ways do not reduce the most important properties of Bitcoin, such as the chances of control of the aggregated hashpower, or the fundamentals of the proof of work mechanism. In general, the methods presented can help to predict energy consumption in different scenarios, based on factors that can be calculated from available data, or assumed in scenarios.

Submitted 19 July, 2021; originally announced July 2021.

Comments: 24 pages, request for comments
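
The break-even reasoning in the abstract above (miners as a group operate where income balances expenditure) can be turned into a small calculation. This is a sketch of that reasoning only, not the paper's equation: the function name, parameter names, and units are hypothetical, and amortization is treated here as a single cost for the period rather than through the paper's Amortization Factor, whose definition the abstract does not give.

```python
def max_energy_kwh(
    minted_btc: float,
    fees_btc: float,
    usd_per_btc: float,
    usd_per_kwh: float,
    amortization_usd: float,
) -> float:
    """Upper bound on energy use for a period under a break-even assumption.

    Income for the period is (minting + transaction fees) converted at the
    exchange rate. At break-even, income equals energy cost plus
    amortization cost, so the energy budget is whatever income remains
    after amortization, divided by the energy price.
    """
    if usd_per_kwh <= 0:
        raise ValueError("usd_per_kwh must be positive")
    income_usd = (minted_btc + fees_btc) * usd_per_btc
    energy_budget_usd = max(income_usd - amortization_usd, 0.0)
    return energy_budget_usd / usd_per_kwh
```

The inputs are illustrative: any consistent set of units works, and the result scales linearly with the exchange rate and inversely with the energy price, which matches the factors the abstract lists.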

arXiv:1901.04217 [pdf, other]

On the Diversity of Software Package Popularity Metrics: An Empirical Study of npm

Authors: Ahmed Zerouali, Tom Mens, Gregorio Robles, Jesus M. Gonzalez-Barahona

Abstract: Software systems often leverage open source software libraries to reuse functionalities. Such libraries are readily available through software package managers like npm for JavaScript. Due to the huge number of packages available in such package distributions, developers often decide to rely on or contribute to a software package based on its popularity. Moreover, it is a common practice for researchers to depend on popularity metrics for data sampling and choosing the right candidates for their studies. However, the meaning of popularity is relative and can be defined and measured in a diversity of ways, which might produce different outcomes even when considered for the same studies. In this paper, we show evidence of how much the meaning of popularity varies in software engineering research. Moreover, we empirically analyse the relationship between different software popularity measures. As a case study, for a large dataset of 175k npm packages, we computed and extracted 9 different popularity metrics from three open source tracking systems: libraries.io, npmjs.com and GitHub. We found that popularity can indeed be measured with different unrelated metrics; each metric can be defined within a specific context. This indicates a need for a generic framework that would use a portfolio of popularity metrics drawing from different concepts.

Submitted 14 January, 2019; originally announced January 2019.

Comments: ERA Track paper at 26th IEEE International Conference on Software Evolution, Analysis and Reengineering (SANER 2019, Hangzhou, China)

Journal ref: IEEE International Conference on Software Evolution, Analysis and Reengineering, 2019, ISBN 978-1-7281-0591-8

Showing 1–9 of 9 results for author: Gonzalez-Barahona, J M