-
AI and Agile Software Development: From Frustration to Success -- XP2025 Workshop Summary
Authors:
Tomas Herda,
Victoria Pichler,
Zheying Zhang,
Pekka Abrahamsson,
Geir K. Hanssen
Abstract:
The full-day workshop on AI and Agile at XP 2025 convened a diverse group of researchers and industry practitioners to address the practical challenges and opportunities of integrating Artificial Intelligence into Agile software development. Through interactive sessions, participants identified shared frustrations related to integrating AI into Agile Software Development practices, including chall…
▽ More
The full-day workshop on AI and Agile at XP 2025 convened a diverse group of researchers and industry practitioners to address the practical challenges and opportunities of integrating Artificial Intelligence into Agile software development. Through interactive sessions, participants identified shared frustrations related to integrating AI into Agile Software Development practices, including challenges with tooling, governance, data quality, and critical skill gaps. These challenges were systematically prioritized and analyzed to uncover root causes. The workshop culminated in the collaborative development of a research roadmap that pinpoints actionable directions for future work, including both immediate solutions and ambitious long-term goals. The key outcome is a structured agenda designed to foster joint industry-academic efforts to move from identified frustrations to successful implementation.
△ Less
Submitted 3 July, 2025; v1 submitted 25 June, 2025;
originally announced June 2025.
-
TimeLess: A Vision for the Next Generation of Software Development
Authors:
Zeeshan Rasheed,
Malik Abdul Sami,
Jussi Rasku,
Kai-Kristian Kemell,
Zheying Zhang,
Janne Harjamaki,
Shahbaz Siddeeq,
Sami Lahti,
Tomas Herda,
Mikko Nurminen,
Niklas Lavesson,
Jose Siqueira de Cerqueira,
Toufique Hasan,
Ayman Khan,
Mahade Hasan,
Mika Saari,
Petri Rantanen,
Jari Soini,
Pekka Abrahamsson
Abstract:
Present-day software development faces three major challenges: complexity, time consumption, and high costs. Developing large software systems often requires battalions of teams and considerable time for meetings, which end without any action, resulting in unproductive cycles, delayed progress, and increased cost. What if, instead of large meetings with no immediate results, the software product i…
▽ More
Present-day software development faces three major challenges: complexity, time consumption, and high costs. Developing large software systems often requires battalions of teams and considerable time for meetings, which end without any action, resulting in unproductive cycles, delayed progress, and increased cost. What if, instead of large meetings with no immediate results, the software product is completed by the end of the meeting? In response, we present a vision for a system called TimeLess, designed to reshape the software development process by enabling immediate action during meetings. The goal is to shift meetings from planning discussions to productive, action-oriented sessions. This approach minimizes the time and effort required for development, allowing teams to focus on critical decision-making while AI agents execute development tasks based on the meeting discussions. We will employ multiple AI agents that work collaboratively to capture human discussions and execute development tasks in real time. This represents a step toward next-generation software development environments, where human expertise drives strategy and AI accelerates task execution.
△ Less
Submitted 13 November, 2024;
originally announced November 2024.
-
Prioritizing Software Requirements Using Large Language Models
Authors:
Malik Abdul Sami,
Zeeshan Rasheed,
Muhammad Waseem,
Zheying Zhang,
Tomas Herda,
Pekka Abrahamsson
Abstract:
Large Language Models (LLMs) are revolutionizing Software Engineering (SE) by introducing innovative methods for tasks such as collecting requirements, designing software, generating code, and creating test cases, among others. This article focuses on requirements engineering, typically seen as the initial phase of software development that involves multiple system stakeholders. Despite its key ro…
▽ More
Large Language Models (LLMs) are revolutionizing Software Engineering (SE) by introducing innovative methods for tasks such as collecting requirements, designing software, generating code, and creating test cases, among others. This article focuses on requirements engineering, typically seen as the initial phase of software development that involves multiple system stakeholders. Despite its key role, the challenge of identifying requirements and satisfying all stakeholders within time and budget constraints remains significant. To address the challenges in requirements engineering, this study introduces a web-based software tool utilizing AI agents and prompt engineering to automate task prioritization and apply diverse prioritization techniques, aimed at enhancing project management within the agile framework. This approach seeks to transform the prioritization of agile requirements, tackling the substantial challenge of meeting stakeholder needs within set time and budget limits. Furthermore, the source code of our developed prototype is available on GitHub, allowing for further experimentation and prioritization of requirements, facilitating research and practical application.
△ Less
Submitted 5 April, 2024;
originally announced May 2024.
-
Exploring Human-AI Collaboration in Agile: Customised LLM Meeting Assistants
Authors:
Beatriz Cabrero-Daniel,
Tomas Herda,
Victoria Pichler,
Martin Eder
Abstract:
This action research study focuses on the integration of "AI assistants" in two Agile software development meetings: the Daily Scrum and a feature refinement, a planning meeting that is part of an in-house Scaled Agile framework. We discuss the critical drivers of success, and establish a link between the use of AI and team collaboration dynamics. We conclude with a list of lessons learnt during t…
▽ More
This action research study focuses on the integration of "AI assistants" in two Agile software development meetings: the Daily Scrum and a feature refinement, a planning meeting that is part of an in-house Scaled Agile framework. We discuss the critical drivers of success, and establish a link between the use of AI and team collaboration dynamics. We conclude with a list of lessons learnt during the interventions in an industrial context, and provide a assessment checklist for companies and teams to reflect on their readiness level. This paper is thus a road-map to facilitate the integration of AI tools in Agile setups.
△ Less
Submitted 23 April, 2024;
originally announced April 2024.
-
Generating Test Scenarios from NL Requirements using Retrieval-Augmented LLMs: An Industrial Study
Authors:
Chetan Arora,
Tomas Herda,
Verena Homm
Abstract:
Test scenarios are specific instances of test cases that describe actions to validate a particular software functionality. By outlining the conditions under which the software operates and the expected outcomes, test scenarios ensure that the software functionality is tested in an integrated manner. Test scenarios are crucial for systematically testing an application under various conditions, incl…
▽ More
Test scenarios are specific instances of test cases that describe actions to validate a particular software functionality. By outlining the conditions under which the software operates and the expected outcomes, test scenarios ensure that the software functionality is tested in an integrated manner. Test scenarios are crucial for systematically testing an application under various conditions, including edge cases, to identify potential issues and guarantee overall performance and reliability. Specifying test scenarios is tedious and requires a deep understanding of software functionality and the underlying domain. It further demands substantial effort and investment from already time- and budget-constrained requirements engineers and testing teams. This paper presents an automated approach (RAGTAG) for test scenario generation using Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs). RAG allows the integration of specific domain knowledge with LLMs' generation capabilities. We evaluate RAGTAG on two industrial projects from Austrian Post with bilingual requirements in German and English. Our results from an interview survey conducted with four experts on five dimensions -- relevance, coverage, correctness, coherence and feasibility, affirm the potential of RAGTAG in automating test scenario generation. Specifically, our results indicate that, despite the difficult task of analyzing bilingual requirements, RAGTAG is able to produce scenarios that are well-aligned with the underlying requirements and provide coverage of different aspects of the intended functionality. The generated scenarios are easily understandable to experts and feasible for testing in the project environment. The overall correctness is deemed satisfactory; however, gaps in capturing exact action sequences and domain nuances remain, underscoring the need for domain expertise when applying LLMs.
△ Less
Submitted 19 April, 2024;
originally announced April 2024.
-
LLM-based agents for automating the enhancement of user story quality: An early report
Authors:
Zheying Zhang,
Maruf Rayhan,
Tomas Herda,
Manuel Goisauf,
Pekka Abrahamsson
Abstract:
In agile software development, maintaining high-quality user stories is crucial, but also challenging. This study explores the use of large language models to automatically improve the user story quality in Austrian Post Group IT agile teams. We developed a reference model for an Autonomous LLM-based Agent System and implemented it at the company. The quality of user stories in the study and the e…
▽ More
In agile software development, maintaining high-quality user stories is crucial, but also challenging. This study explores the use of large language models to automatically improve the user story quality in Austrian Post Group IT agile teams. We developed a reference model for an Autonomous LLM-based Agent System and implemented it at the company. The quality of user stories in the study and the effectiveness of these agents for user story quality improvement was assessed by 11 participants across six agile teams. Our findings demonstrate the potential of LLMs in improving user story quality, contributing to the research on AI role in agile development, and providing a practical example of the transformative impact of AI in an industry setting.
△ Less
Submitted 14 March, 2024;
originally announced March 2024.
-
Generative Artificial Intelligence for Software Engineering -- A Research Agenda
Authors:
Anh Nguyen-Duc,
Beatriz Cabrero-Daniel,
Adam Przybylek,
Chetan Arora,
Dron Khanna,
Tomas Herda,
Usman Rafiq,
Jorge Melegati,
Eduardo Guerra,
Kai-Kristian Kemell,
Mika Saari,
Zheying Zhang,
Huy Le,
Tho Quan,
Pekka Abrahamsson
Abstract:
Generative Artificial Intelligence (GenAI) tools have become increasingly prevalent in software development, offering assistance to various managerial and technical project activities. Notable examples of these tools include OpenAIs ChatGPT, GitHub Copilot, and Amazon CodeWhisperer. Although many recent publications have explored and evaluated the application of GenAI, a comprehensive understandin…
▽ More
Generative Artificial Intelligence (GenAI) tools have become increasingly prevalent in software development, offering assistance to various managerial and technical project activities. Notable examples of these tools include OpenAIs ChatGPT, GitHub Copilot, and Amazon CodeWhisperer. Although many recent publications have explored and evaluated the application of GenAI, a comprehensive understanding of the current development, applications, limitations, and open challenges remains unclear to many. Particularly, we do not have an overall picture of the current state of GenAI technology in practical software engineering usage scenarios. We conducted a literature review and focus groups for a duration of five months to develop a research agenda on GenAI for Software Engineering. We identified 78 open Research Questions (RQs) in 11 areas of Software Engineering. Our results show that it is possible to explore the adoption of GenAI in partial automation and support decision-making in all software development activities. While the current literature is skewed toward software implementation, quality assurance and software maintenance, other areas, such as requirements engineering, software design, and software engineering education, would need further research attention. Common considerations when implementing GenAI include industry-level assessment, dependability and accuracy, data accessibility, transparency, and sustainability aspects associated with the technology. GenAI is bringing significant changes to the field of software engineering. Nevertheless, the state of research on the topic still remains immature. We believe that this research agenda holds significance and practical value for informing both researchers and practitioners about current applications and guiding future research.
△ Less
Submitted 28 October, 2023;
originally announced October 2023.