-
Mapping Trustworthiness in Large Language Models: A Bibliometric Analysis Bridging Theory to Practice
Authors:
José Siqueira de Cerqueira,
Kai-Kristian Kemell,
Rebekah Rousi,
Nannan Xi,
Juho Hamari,
Pekka Abrahamsson
Abstract:
The rapid proliferation of Large Language Models (LLMs) has raised significant trustworthiness and ethical concerns. Despite the widespread adoption of LLMs across domains, there is still no clear consensus on how to define and operationalise trustworthiness. This study aims to bridge the gap between theoretical discussion and practical implementation by analysing research trends, definitions of trustworthiness, and practical techniques. We conducted a bibliometric mapping analysis of 2,006 publications from Web of Science (2019-2025) using Bibliometrix, and manually reviewed 68 papers. We found a shift from traditional AI ethics discussion to LLM trustworthiness frameworks. We identified 18 different definitions of trust/trustworthiness, with transparency, explainability and reliability emerging as the most common dimensions. We identified 20 strategies to enhance LLM trustworthiness, with fine-tuning and retrieval-augmented generation (RAG) being the most prominent. Most of the strategies are developer-driven and applied during the post-training phase. Several authors propose fragmented terminologies rather than unified frameworks, leading to the risk of "ethics washing," where ethical discourse is adopted without a genuine regulatory commitment. Our findings highlight persistent gaps between theoretical taxonomies and practical implementation and the crucial role of the developer in operationalising trust, and they call for standardised frameworks and stronger regulatory measures to enable trustworthy and ethical deployment of LLMs.
Submitted 4 May, 2025; v1 submitted 27 February, 2025;
originally announced March 2025.
-
GPT versus Humans: Uncovering Ethical Concerns in Conversational Generative AI-empowered Multi-Robot Systems
Authors:
Rebekah Rousi,
Niko Makitalo,
Hooman Samani,
Kai-Kristian Kemell,
Jose Siqueira de Cerqueira,
Ville Vakkuri,
Tommi Mikkonen,
Pekka Abrahamsson
Abstract:
The emergence of generative artificial intelligence (GAI) and large language models (LLMs) such as ChatGPT has enabled the realization of long-harbored desires in software and robotic development. The technology, however, has brought with it novel ethical challenges. These challenges are compounded by the application of LLMs in other machine learning systems, such as multi-robot systems. The objective of the study was to examine novel ethical issues arising from the application of LLMs in multi-robot systems. Unfolding ethical issues in GPT agent behavior (deliberation of ethical concerns) were observed, and GPT output was compared with that of human experts. The article also advances a model for ethical development of multi-robot systems. A qualitative workshop-based method was employed in three workshops for the collection of ethical concerns: two human expert workshops (N=16 participants) and one GPT-agent-based workshop (N=7 agents; two teams of 6 agents plus one judge). Thematic analysis was used to analyze the qualitative data. The results reveal differences between the human-produced and GPT-based ethical concerns. Human experts placed greater emphasis on new themes related to deviance, data privacy, bias and unethical corporate conduct. GPT agents emphasized concerns present in existing AI ethics guidelines. The study contributes to a growing body of knowledge in context-specific AI ethics and GPT application. It demonstrates the gap between human expert thinking and LLM output, while emphasizing new ethical concerns emerging in novel technology.
Submitted 21 November, 2024;
originally announced November 2024.
-
The EU AI Act is a good start but falls short
Authors:
Chalisa Veesommai Sillberg,
Jose Siqueira De Cerqueira,
Pekka Sillberg,
Kai-Kristian Kemell,
Pekka Abrahamsson
Abstract:
The EU AI Act was created to ensure ethical and safe Artificial Intelligence (AI) development and deployment across the EU. This study aims to identify key challenges and strategies to help enterprises focus their resources effectively. To achieve this aim, we conducted a Multivocal Literature Review (MLR) to explore the sentiments of both industry and academia. Of 130 articles, 56 met the criteria. Our key findings concern three challenges: liability, discrimination, and tool adequacy. Additionally, industry and academia expressed some negative sentiments regarding regulatory interpretations, specific requirements, and transparency issues. We also identified three essential themes for enterprises: risk-based regulatory compliance, ethical frameworks and principles in technology development, and policies and systems for regulatory risk management. These results identify the key challenges and strategies and provide less commonly discussed themes, enabling enterprises to align with the requirements and minimize their distance from the EU market.
Submitted 9 December, 2024; v1 submitted 13 November, 2024;
originally announced November 2024.
-
TimeLess: A Vision for the Next Generation of Software Development
Authors:
Zeeshan Rasheed,
Malik Abdul Sami,
Jussi Rasku,
Kai-Kristian Kemell,
Zheying Zhang,
Janne Harjamaki,
Shahbaz Siddeeq,
Sami Lahti,
Tomas Herda,
Mikko Nurminen,
Niklas Lavesson,
Jose Siqueira de Cerqueira,
Toufique Hasan,
Ayman Khan,
Mahade Hasan,
Mika Saari,
Petri Rantanen,
Jari Soini,
Pekka Abrahamsson
Abstract:
Present-day software development faces three major challenges: complexity, time consumption, and high costs. Developing large software systems often requires battalions of teams and considerable time in meetings that end without any action, resulting in unproductive cycles, delayed progress, and increased cost. What if, instead of large meetings with no immediate results, the software product is completed by the end of the meeting? In response, we present a vision for a system called TimeLess, designed to reshape the software development process by enabling immediate action during meetings. The goal is to shift meetings from planning discussions to productive, action-oriented sessions. This approach minimizes the time and effort required for development, allowing teams to focus on critical decision-making while AI agents execute development tasks based on the meeting discussions. We will employ multiple AI agents that work collaboratively to capture human discussions and execute development tasks in real time. This represents a step toward next-generation software development environments, where human expertise drives strategy and AI accelerates task execution.
Submitted 13 November, 2024;
originally announced November 2024.