Showing 1–18 of 18 results for author: Robbes, R

Searching in archive cs.
  1. arXiv:2503.08532

    cs.SE

    Bogus Bugs, Duplicates, and Revealing Comments: Data Quality Issues in NPR

    Authors: Julian Aron Prenner, Romain Robbes

    Abstract: The performance of a machine learning system is not only determined by the model but also, to a substantial degree, by the data it is trained on. With the increasing use of machine learning, issues related to data quality have become a concern also in automated program repair research. In this position paper, we report some of the data-related issues we have come across when working with several l…

    Submitted 11 March, 2025; originally announced March 2025.

  2. arXiv:2503.04301

    cs.SE

    Simple Fault Localization using Execution Traces

    Authors: Julian Aron Prenner, Romain Robbes

    Abstract: Traditional spectrum-based fault localization (SBFL) exploits differences in a program's coverage spectrum when run on passing and failing test cases. However, such runs can provide a wealth of additional information beyond mere coverage. Working with thousands of execution traces of short programs submitted to competitive programming contests and leveraging machine learning and additional runtime…

    Submitted 6 March, 2025; originally announced March 2025.

  3. arXiv:2503.04241

    cs.SE cs.LG

    ThrowBench: Benchmarking LLMs by Predicting Runtime Exceptions

    Authors: Julian Aron Prenner, Romain Robbes

    Abstract: Modern Large Language Models (LLMs) have shown astounding capabilities of code understanding and synthesis. In order to assess such capabilities, several benchmarks have been devised (e.g., HumanEval). However, most benchmarks focus on code synthesis from natural language instructions. Hence, such benchmarks do not test for other forms of code understanding. Moreover, there have been concerns abou…

    Submitted 6 March, 2025; originally announced March 2025.

  4. arXiv:2503.04214

    cs.SE

    Extracting Fix Ingredients using Language Models

    Authors: Julian Aron Prenner, Romain Robbes

    Abstract: Deep learning and language models are increasingly dominating automated program repair research. While previous generate-and-validate approaches were able to find and use fix ingredients on a file or even project level, neural language models are limited to the code that fits their input window. In this work we investigate how important identifier ingredients are in neural program repair and prese…

    Submitted 6 March, 2025; originally announced March 2025.

  5. Impermanent Identifiers: Enhanced Source Code Comprehension and Refactoring

    Authors: Eduardo Martins Guerra, Andre A. S. Ivo, Fernando O. Pereira, Romain Robbes, Andrea Janes, Fabio Fagundes Silveira

    Abstract: In response to the prevailing challenges in contemporary software development, this article introduces an innovative approach to code augmentation centered around Impermanent Identifiers. The primary goal is to enhance the software development experience by introducing dynamic identifiers that adapt to changing contexts, facilitating more efficient interactions between developers and source code,…

    Submitted 14 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: to be published in The Journal of Systems & Software

  6. What the Fix? A Study of ASATs Rule Documentation

    Authors: Corentin Latappy, Thomas Degueule, Jean-Rémy Falleri, Romain Robbes, Xavier Blanc, Cédric Teyton

    Abstract: Automatic Static Analysis Tools (ASATs) are widely used by software developers to diffuse and enforce coding practices. Yet, we know little about the documentation of ASATs, despite it being critical to learn about the coding practices in the first place. We shed light on this through several contributions. First, we analyze the documentation of more than 100 rules of 16 ASATs for multiple program…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 32nd IEEE/ACM International Conference on Program Comprehension (ICPC 2024), Apr 2024, Lisboa, Portugal

  7. arXiv:2312.05092

    cs.SE cs.LG

    INSPECT: Intrinsic and Systematic Probing Evaluation for Code Transformers

    Authors: Anjan Karmakar, Romain Robbes

    Abstract: Pre-trained models of source code have recently been successfully applied to a wide variety of Software Engineering tasks; they have also seen some practical adoption in practice, e.g. for code completion. Yet, we still know very little about what these pre-trained models learn about source code. In this article, we use probing--simple diagnostic tasks that do not further train the models--to disc…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Accepted to IEEE Transactions on Software Engineering. Extension of our previous paper "What do pre-trained code models know about code?" (ASE 2021, arXiv:2108.11308). 21 pages

  8. arXiv:2312.04986

    cs.SE cs.AI

    Out of Context: How important is Local Context in Neural Program Repair?

    Authors: Julian Aron Prenner, Romain Robbes

    Abstract: Deep learning source code models have been applied very successfully to the problem of automated program repair. One of the standing issues is the small input window of current models which often cannot fully fit the context code required for a bug fix (e.g., method or class declarations of a project). Instead, input is often restricted to the local context, that is, the lines below and above the…

    Submitted 8 December, 2023; originally announced December 2023.

  9. arXiv:2304.01102

    cs.SE cs.LG

    RunBugRun -- An Executable Dataset for Automated Program Repair

    Authors: Julian Aron Prenner, Romain Robbes

    Abstract: Recently, we can notice a transition to data-driven techniques in Automated Program Repair (APR), in particular towards deep neural networks. This entails training on hundreds of thousands or even millions of non-executable code fragments. We would like to bring more attention to an aspect of code often neglected in Neural Program Repair (NPR), namely its execution. Code execution has several sign…

    Submitted 3 April, 2023; originally announced April 2023.

  10. arXiv:2212.09132

    cs.SE cs.LG

    JEMMA: An Extensible Java Dataset for ML4Code Applications

    Authors: Anjan Karmakar, Miltiadis Allamanis, Romain Robbes

    Abstract: Machine Learning for Source Code (ML4Code) is an active research field in which extensive experimentation is needed to discover how to best use source code's richly structured information. With this in mind, we introduce JEMMA, an Extensible Java Dataset for ML4Code Applications, which is a large-scale, diverse, and high-quality dataset targeted at ML4Code. Our goal with JEMMA is to lower the barr…

    Submitted 18 December, 2022; originally announced December 2022.

  11. arXiv:2212.02684

    cs.SE cs.LG

    Codex Hacks HackerRank: Memorization Issues and a Framework for Code Synthesis Evaluation

    Authors: Anjan Karmakar, Julian Aron Prenner, Marco D'Ambros, Romain Robbes

    Abstract: The Codex model has demonstrated extraordinary competence in synthesizing code from natural language problem descriptions. However, in order to reveal unknown failure modes and hidden biases, such large-scale models must be systematically subjected to multiple and diverse evaluation studies. In this work, we evaluate the code synthesis capabilities of the Codex model based on a set of 115 Python…

    Submitted 5 December, 2022; originally announced December 2022.

  12. arXiv:2111.03922

    cs.SE

    Automatic Program Repair with OpenAI's Codex: Evaluating QuixBugs

    Authors: Julian Aron Prenner, Romain Robbes

    Abstract: OpenAI's Codex, a GPT-3 like model trained on a large code corpus, has made headlines in and outside of academia. Given a short user-provided description, it is capable of synthesizing code snippets that are syntactically and semantically valid in most cases. In this work, we want to investigate whether Codex is able to localize and fix bugs, a task of central interest in the field of automated pr…

    Submitted 6 November, 2021; originally announced November 2021.

  13. arXiv:2108.11308

    cs.SE cs.LG

    What do pre-trained code models know about code?

    Authors: Anjan Karmakar, Romain Robbes

    Abstract: Pre-trained models of code built on the transformer architecture have performed well on software engineering (SE) tasks such as predictive code generation, code summarization, among others. However, whether the vector representations from these pre-trained models comprehensively encode characteristics of source code well enough to be applicable to a broad spectrum of downstream tasks remains an op…

    Submitted 25 August, 2021; originally announced August 2021.

  14. arXiv:2106.15209

    cs.SE

    Making the most of small Software Engineering datasets with modern machine learning

    Authors: Julian Aron Prenner, Romain Robbes

    Abstract: This paper provides a starting point for Software Engineering (SE) researchers and practitioners faced with the problem of training machine learning models on small datasets. Due to the high costs associated with labeling data, in Software Engineering, there exist many small (< 1 000 samples) and medium-sized (< 100 000 samples) datasets. While deep learning has set the state of the art in many mac…

    Submitted 29 June, 2021; originally announced June 2021.

  15. arXiv:2103.01722

    cs.SE

    Mining Software Repositories with a Collaborative Heuristic Repository

    Authors: Hlib Babii, Julian Aron Prenner, Laurin Stricker, Anjan Karmakar, Andrea Janes, Romain Robbes

    Abstract: Many software engineering studies or tasks rely on categorizing software engineering artifacts. In practice, this is done either by defining simple but often imprecise heuristics, or by manual labelling of the artifacts. Unfortunately, errors in these categorizations impact the tasks that rely on them. To improve the precision of these categorizations, we propose to gather heuristics in a collabor…

    Submitted 2 March, 2021; originally announced March 2021.

    Comments: 5 pages; to appear in Proceedings of ICSE NIER 2021

  16. arXiv:2010.03525

    cs.SE cs.GL

    Empirical Standards for Software Engineering Research

    Authors: Paul Ralph, Nauman bin Ali, Sebastian Baltes, Domenico Bianculli, Jessica Diaz, Yvonne Dittrich, Neil Ernst, Michael Felderer, Robert Feldt, Antonio Filieri, Breno Bernard Nicolau de França, Carlo Alberto Furia, Greg Gay, Nicolas Gold, Daniel Graziotin, Pinjia He, Rashina Hoda, Natalia Juristo, Barbara Kitchenham, Valentina Lenarduzzi, Jorge Martínez, Jorge Melegati, Daniel Mendez, Tim Menzies, Jefferson Molleri, et al. (18 additional authors not shown)

    Abstract: Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously revised to reflect evolving consensus around resear…

    Submitted 4 March, 2021; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: For the complete standards, supplements and other resources, see https://github.com/acmsigsoft/EmpiricalStandards

  17. Big Code != Big Vocabulary: Open-Vocabulary Models for Source Code

    Authors: Rafael-Michael Karampatsis, Hlib Babii, Romain Robbes, Charles Sutton, Andrea Janes

    Abstract: Statistical language modeling techniques have successfully been applied to large source code corpora, yielding a variety of new software development tools, such as tools for code suggestion, improving readability, and API migration. A major issue with these techniques is that code introduces new vocabulary at a far higher rate than natural language, as new identifier names proliferate. Both large…

    Submitted 17 March, 2020; originally announced March 2020.

    Comments: 13 pages; to appear in Proceedings of ICSE 2020

  18. arXiv:1904.01873

    cs.CL cs.SE

    Modeling Vocabulary for Big Code Machine Learning

    Authors: Hlib Babii, Andrea Janes, Romain Robbes

    Abstract: When building machine learning models that operate on source code, several decisions have to be made to model source-code vocabulary. These decisions can have a large impact: some can lead to not being able to train models at all, others significantly affect performance, particularly for Neural Language Models. Yet, these decisions are not often fully described. This paper lists important modeling…

    Submitted 3 April, 2019; originally announced April 2019.

    Comments: 12 pages, 1 figure