Showing 1–12 of 12 results for author: Cogo, F R

  1. arXiv:2505.10640

    cs.SE cs.AI cs.LG

    The Hitchhikers Guide to Production-ready Trustworthy Foundation Model powered Software (FMware)

    Authors: Kirill Vasilevski, Benjamin Rombaut, Gopi Krishnan Rajbahadur, Gustavo A. Oliva, Keheliya Gallaba, Filipe R. Cogo, Jiahuei Lin, Dayi Lin, Haoxiang Zhang, Boyuan Chen, Kishanthan Thangarajah, Ahmed E. Hassan, Zhen Ming Jiang

    Abstract: Foundation Models (FMs) such as Large Language Models (LLMs) are reshaping the software industry by enabling FMware, systems that integrate these FMs as core components. In this KDD 2025 tutorial, we present a comprehensive exploration of FMware that combines a curated catalogue of challenges with real-world production concerns. We first discuss the state of research and practice in building FMwar…

    Submitted 2 June, 2025; v1 submitted 15 May, 2025; originally announced May 2025.

  2. arXiv:2502.17378

    cs.SE cs.LG

    Continuous Integration Practices in Machine Learning Projects: The Practitioners' Perspective

    Authors: João Helis Bernardo, Daniel Alencar da Costa, Filipe Roseiro Cogo, Sérgio Queiróz de Medeiros, Uirá Kulesza

    Abstract: Continuous Integration (CI) is a cornerstone of modern software development. However, while widely adopted in traditional software projects, applying CI practices to Machine Learning (ML) projects presents distinctive characteristics. For example, our previous work revealed that ML projects often experience longer build durations and lower test coverage rates compared to their non-ML counterparts.…

    Submitted 24 February, 2025; originally announced February 2025.

  3. arXiv:2411.01063

    cs.SE cs.AI

    InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation

    Authors: Marcos Macedo, Yuan Tian, Pengyu Nie, Filipe R. Cogo, Bram Adams

    Abstract: Code translation aims to convert a program from one programming language (PL) to another. This long-standing software engineering task is crucial for modernizing legacy systems, ensuring cross-platform compatibility, enhancing performance, and more. However, automating this process remains challenging due to many syntactic and semantic differences between PLs. Recent studies show that even advance…

    Submitted 4 November, 2024; v1 submitted 1 November, 2024; originally announced November 2024.

  4. arXiv:2409.10472

    cs.SE

    Towards Semantic Versioning of Open Pre-trained Language Model Releases on Hugging Face

    Authors: Adekunle Ajibode, Abdul Ali Bangash, Filipe Roseiro Cogo, Bram Adams, Ahmed E. Hassan

    Abstract: The proliferation of open Pre-trained Language Models (PTLMs) on model registry platforms like Hugging Face (HF) presents both opportunities and challenges for companies building products around them. Similar to traditional software dependencies, PTLMs continue to evolve after a release. However, the current state of release practices of PTLMs on model registry platforms are plagued by a variety o…

    Submitted 19 February, 2025; v1 submitted 16 September, 2024; originally announced September 2024.

  5. On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards

    Authors: Zhimin Zhao, Abdul Ali Bangash, Filipe Roseiro Côgo, Bram Adams, Ahmed E. Hassan

    Abstract: Foundation models (FM), such as large language models (LLMs), which are large-scale machine learning (ML) models, have demonstrated remarkable adaptability in various downstream software engineering (SE) tasks, such as code completion, code understanding, and software development. As a result, FM leaderboards have become essential tools for SE teams to compare and select the best third-party FMs f…

    Submitted 28 January, 2025; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Awesome Foundation Model Leaderboard List: https://github.com/SAILResearch/awesome-foundation-model-leaderboards; Foundation Model Leaderboard Search Toolkit: https://huggingface.co/spaces/zhiminy/awesome-foundation-model-leaderboard-search

  6. Exploring the Impact of the Output Format on the Evaluation of Large Language Models for Code Translation

    Authors: Marcos Macedo, Yuan Tian, Filipe R. Cogo, Bram Adams

    Abstract: Code translation between programming languages is a long-existing and critical task in software engineering, facilitating the modernization of legacy systems, ensuring cross-platform compatibility, and enhancing software performance. With the recent advances in large language models (LLMs) and their applications to code translation, there is an increasing need for comprehensive evaluation of these…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted into 2024 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering (Forge)

  7. arXiv:2403.09012

    cs.SE

    Leveraging the Crowd for Dependency Management: An Empirical Study on the Dependabot Compatibility Score

    Authors: Benjamin Rombaut, Filipe R. Cogo, Ahmed E. Hassan

    Abstract: Dependabot, a popular dependency management tool, includes a compatibility score feature that helps client packages assess the risk of accepting a dependency update by leveraging knowledge from "the crowd". For each dependency update, Dependabot calculates this compatibility score as the proportion of successful updates performed by other client packages that use the same provider package as a dep…

    Submitted 13 March, 2024; originally announced March 2024.

  8. arXiv:2402.15943

    cs.SE cs.AI

    Rethinking Software Engineering in the Foundation Model Era: A Curated Catalogue of Challenges in the Development of Trustworthy FMware

    Authors: Ahmed E. Hassan, Dayi Lin, Gopi Krishnan Rajbahadur, Keheliya Gallaba, Filipe R. Cogo, Boyuan Chen, Haoxiang Zhang, Kishanthan Thangarajah, Gustavo Ansaldi Oliva, Jiahuei Lin, Wali Mohammad Abdullah, Zhen Ming Jiang

    Abstract: Foundation models (FMs), such as Large Language Models (LLMs), have revolutionized software development by enabling new use cases and business models. We refer to software built using FMs as FMware. The unique properties of FMware (e.g., prompts, agents, and the need for orchestration), coupled with the intrinsic limitations of FMs (e.g., hallucination) lead to a completely new set of software eng…

    Submitted 3 March, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  9. I depended on you and you broke me: An empirical study of manifesting breaking changes in client packages

    Authors: Daniel Venturini, Filipe Roseiro Cogo, Ivanilton Polato, Marco A Gerosa, Igor Scaliante Wiese

    Abstract: Complex software systems have a network of dependencies. Developers often configure package managers (e.g., npm) to automatically update dependencies with each publication of new releases containing bug fixes and new features. When a dependency release introduces backward-incompatible changes, commonly known as breaking changes, dependent packages may not build anymore. This may indirectly impact…

    Submitted 11 January, 2023; originally announced January 2023.

    Journal ref: ACM Transactions on Software Engineering and Methodology (TOSEM 2023)

  10. Towards Build Verifiability for Java-based Systems

    Authors: Jiawen Xiong, Yong Shi, Boyuan Chen, Filipe R. Cogo, Zhen Ming Jiang

    Abstract: Build verifiability refers to the property that the build of a software system can be verified by independent third parties and it is crucial for the trustworthiness of a software system. Various efforts towards build verifiability have been made to C/C++-based systems, yet the techniques for Java-based systems are not systematic and are often specific to a particular build tool (e.g., Maven). In…

    Submitted 11 February, 2022; originally announced February 2022.

  11. arXiv:2202.04431

    cs.SE cs.PL

    Assessing the alignment between the information needs of developers and the documentation of programming languages: A case study on Rust

    Authors: Filipe R. Cogo, Xin Xia, Ahmed E. Hassan

    Abstract: Programming language documentation refers to the set of technical documents that provide application developers with a description of the high-level concepts of a language. Such documentation is essential to support application developers in the effective use of a programming language. One of the challenges faced by documenters (i.e., personnel that produce documentation) is to ensure that documen…

    Submitted 8 February, 2022; originally announced February 2022.

    Journal ref: ACM Transactions on Software Engineering and Methodology (2022)

  12. An Empirical Study of Yanked Releases in the Rust Package Registry

    Authors: Hao Li, Filipe R. Cogo, Cor-Paul Bezemer

    Abstract: Cargo, the software packaging manager of Rust, provides a yank mechanism to support release-level deprecation, which can prevent packages from depending on yanked releases. Most prior studies focused on code-level (i.e., deprecated APIs) and package-level deprecation (i.e., deprecated packages). However, few studies have focused on release-level deprecation. In this study, we investigate how often…

    Submitted 27 January, 2022; originally announced January 2022.

    Comments: 13 pages, 7 figures