
GHIssuemarket: A Sandbox Environment for SWE-Agents Economic Experimentation

Authors: Mohamed A. Fouad, Marcelo de Almeida Maia

Abstract: Software engineering agents (SWE-agents), as key innovations in intelligent software engineering, are poised in the industry's end-of-programming debate to transcend from assistance to primary roles. We argue the importance of SWE-agents' economic viability to their transcendence -- defined as their capacity to maintain efficient operations in constrained environments -- and propose its exploration via software engineering economics experimentation. We introduce the GHIssuemarket sandbox, a controlled virtual environment for SWE-agents' economic experimentation, simulating the environment of an envisioned peer-to-peer multiagent system for GitHub issues outsourcing auctions. In this controlled setting, autonomous SWE-agents auction and bid on GitHub issues, leveraging real-time communication, a built-in retrieval-augmented generation (RAG) interface for effective decision-making, and instant cryptocurrency micropayments. We open-source our software artifacts, discuss our sandbox engineering decisions, and advocate for SWE-agents' economic exploration -- an emerging field we intend to pursue under the term intelligent software engineering economics (ISEE).

Submitted 17 December, 2024; v1 submitted 16 December, 2024; originally announced December 2024.

Comments: 2 figures

arXiv:2107.09512 [pdf, other]

doi 10.1145/3474624.3474716

On the Interplay of Smells Large Class, Complex Class and Duplicate Code

Authors: Elder Vicente de Paulo Sobrinho, Marcelo de Almeida Maia

Abstract: Bad smells have been defined to describe potential problems in code, possibly pointing out refactoring opportunities. Several empirical studies have highlighted that smells have a negative impact on comprehension and maintainability. Consequently, several approaches have been proposed to detect and restructure them. However, studies on the inter-relationship of occurrence of different types of smells in source code are still lacking, especially those focused on the quantification of this inter-relationship. In this work, we aim to understand and quantify the possible inter-relation of the smells Large Class (LC), Complex Class (CC) and Duplicate Code (DC). In particular, we investigate patterns of LC and CC regarding the presence or absence of duplicate code. We conduct a quantitative study on five open source projects, and also a qualitative analysis to measure and understand the association of specific smells. As one of the main results, we highlight that there are "occurrence patterns" among these smells, for example: either in Complex Class or in the co-occurrence of Large Class and Complex Class, clones tend to be more prevalent in highly complex classes than in less complex classes. The patterns found could be used to improve the performance of detection tools or even help in refactoring tasks.

Submitted 20 July, 2021; originally announced July 2021.

Comments: 10 pages

Journal ref: Brazilian Symposium on Software Engineering (SBES '21), September 27-October 1, 2021, Joinville, Brazil

arXiv:2103.09423 [pdf, ps, other]

Towards a question answering assistant for software development using a transformer-based language model

Authors: Liliane do Nascimento Vale, Marcelo de Almeida Maia

Abstract: Question answering platforms, such as Stack Overflow, have substantially impacted how developers search for solutions to their programming problems. The crowd knowledge content available from such platforms has also been used to leverage software development tools. Recent advances in Natural Language Processing, specifically in more powerful language models, have demonstrated the ability to enhance text understanding and generation. In this context, we aim to investigate the factors that can influence the application of such models to understanding source-code-related data and producing more interactive and intelligent assistants for software development. In this preliminary study, we particularly investigate whether a how-to question filter and the level of context in the question may impact the results of a transformer-based question answering model. We suggest that fine-tuning models with a corpus based on how-to questions can positively impact the model, and that more contextualized questions also induce more objective answers.

Submitted 16 March, 2021; originally announced March 2021.

arXiv:1903.09174 [pdf, other]

Bootstrapping Cookbooks for APIs from Crowd Knowledge on Stack Overflow

Authors: Lucas B. L. Souza, Eduardo C. Campos, Fernanda Madeiral, Klérisson Paixão, Adriano M. Rocha, Marcelo de Almeida Maia

Abstract: Well-established libraries typically have API documentation. However, they frequently lack examples and explanations, possibly making their effective reuse difficult. Stack Overflow is a question-and-answer website oriented to issues related to software development. Despite the increasing adoption of Stack Overflow, the information related to a particular topic (e.g., an API) is spread across the website. Thus, Stack Overflow still lacks organization of the crowd knowledge available on it. Our goal is to address the problem of poor-quality documentation for APIs by providing an alternative artifact to document them based on the crowd knowledge available on Stack Overflow, called a crowd cookbook. A cookbook is a recipe-oriented book, and we refer to our cookbook as a crowd cookbook since it contains content generated by a crowd. The cookbooks are meant to be used through an exploration process, i.e., browsing. In this paper, we present a semi-automatic approach that organizes the crowd knowledge available on Stack Overflow to build cookbooks for APIs. We have generated cookbooks for three APIs widely used by the software development community: SWT, LINQ and QT. We have also defined desired properties that crowd cookbooks must meet, and we conducted an evaluation of the cookbooks against these properties with human subjects. The results showed that the cookbooks built using our approach, in general, meet those properties. As a highlight, most of the recipes were considered appropriate to be in the cookbooks and have self-contained information. We concluded that our approach is capable of producing adequate cookbooks automatically, which can be as useful as manually produced cookbooks. This opens an opportunity for API designers to enrich existing cookbooks with the different points of view from the crowd, or even to generate initial versions of new cookbooks.

Submitted 21 March, 2019; originally announced March 2019.

Comments: Accepted at Information and Software Technology - Journal - Elsevier. 16 pages

arXiv:1903.07662 [pdf, other]

Recommending Comprehensive Solutions for Programming Tasks by Mining Crowd Knowledge

Authors: Rodrigo F. G. Silva, Chanchal K. Roy, Mohammad Masudur Rahman, Kevin A. Schneider, Klerisson Paixao, Marcelo de Almeida Maia

Abstract: Developers often search for relevant code examples on the web for their programming tasks. Unfortunately, they face two major problems. First, the search is impaired due to a lexical gap between their query (task description) and the information associated with the solution. Second, the retrieved solution may not be comprehensive, i.e., the code segment might miss a succinct explanation. These problems make the developers browse dozens of documents in order to synthesize an appropriate solution. To address these two problems, we propose CROKAGE (Crowd Knowledge Answer Generator), a tool that takes the description of a programming task (the query) and provides a comprehensive solution for the task. Our solutions contain not only relevant code examples but also their succinct explanations. Our proposed approach expands the task description with relevant API classes from Stack Overflow Q&A threads and then mitigates the lexical gap problems. Furthermore, we perform natural language processing on the top-quality answers and then return such programming solutions containing code examples and code explanations, unlike earlier studies. We evaluate our approach using 48 programming queries and show that it outperforms six baselines, including the state of the art, by a statistically significant margin. Furthermore, our evaluation with 29 developers using 24 tasks (queries) confirms the superiority of CROKAGE over the state-of-the-art tool in terms of relevance of the suggested code examples, benefit of the code explanations and the overall solution quality (code + explanation).

Submitted 20 March, 2019; v1 submitted 18 March, 2019; originally announced March 2019.

Comments: Accepted at ICPC, 12 pages, 2019

arXiv:1703.09602 [pdf, other]

On the Interplay between Non-Functional Requirements and Builds on Continuous Integration

Authors: Klérisson V. R. Paixão, Crícia Z. Felício, Fernanda M. Delfim, Marcelo de A. Maia

Abstract: Continuous Integration (CI) implies that a whole developer team works together on the mainline of a software project. CI systems automate the builds of software. Sometimes a developer checks in code which breaks the build. A broken build might not be a problem by itself, but it has the potential to disrupt co-workers, hence it affects the performance of the team. In this study, we investigate the interplay between non-functional requirements (NFRs) and build statuses from 1,283 software projects. We found significant differences among NFR-related build statuses. Thus, tools can be proposed to improve CI with a focus on new ways to prevent failures in CI, especially for efficiency- and usability-related builds. Also, the time required to put a broken build back on track indicates a bimodal distribution along all NFRs, with higher peaks within a day and lower peaks in six weeks. Our results suggest that a more planned schedule for maintainability for Ruby, and for functionality and reliability for Java, would decrease delays related to broken builds.

Submitted 29 March, 2017; v1 submitted 28 March, 2017; originally announced March 2017.

Comments: 4 pages, accepted in MSR 2017 Mining Challenge Track

Showing 1–6 of 6 results for author: Maia, M d A