-
Understanding Mental Models of Generative Conversational Search and The Effect of Interface Transparency
Authors:
Chadha Degachi,
Samuel Kernan Freire,
Evangelos Niforatos,
Gerd Kortuem
Abstract:
The experience and adoption of conversational search are tied to the accuracy and completeness of users' mental models -- their internal frameworks for understanding and predicting system behaviour. Thus, understanding these models can reveal areas for design interventions. Transparency is one such intervention which can improve system interpretability and enable mental model alignment. While past research has explored mental models of search engines, those of generative conversational search remain underexplored, even as the popularity of these systems soars. To address this, we conducted a study with 16 participants, who performed 4 search tasks using 4 conversational interfaces of varying transparency levels. Our analysis revealed that most user mental models were too abstract to support users in explaining individual search instances. These results suggest that 1) mental models may pose a barrier to appropriate trust in conversational search, and 2) hybrid web-conversational search is a promising novel direction for future search interface design.
Submitted 4 June, 2025;
originally announced June 2025.
-
Factory Operators' Perspectives on Cognitive Assistants for Knowledge Sharing: Challenges, Risks, and Impact on Work
Authors:
Samuel Kernan Freire,
Tianhao He,
Chaofan Wang,
Evangelos Niforatos,
Alessandro Bozzon
Abstract:
In the shift towards human-centered manufacturing, our two-year longitudinal study investigates the real-world impact of deploying Cognitive Assistants (CAs) in factories. The CAs were designed to facilitate knowledge sharing among factory operators. Our investigation focused on smartphone-based voice assistants and LLM-powered chatbots, examining their usability and utility in a real-world factory setting. Based on the qualitative feedback we collected during the deployments of CAs at the factories, we conducted a thematic analysis to investigate the perceptions, challenges, and overall impact on workflow and knowledge sharing.
Our results indicate that while CAs have the potential to significantly improve efficiency through knowledge sharing and quicker resolution of production issues, they also introduce concerns around workplace surveillance, the types of knowledge that can be shared, and shortcomings compared to human-to-human knowledge sharing. Additionally, our findings stress the importance of addressing privacy, knowledge contribution burdens, and tensions between factory operators and their managers.
Submitted 30 September, 2024;
originally announced September 2024.
-
Enhancing ICU Patient Recovery: Using LLMs to Assist Nurses in Diary Writing
Authors:
Samuel Kernan Freire,
Margo MC van Mol,
Carola Schol,
Elif Özcan Vieira
Abstract:
Intensive care unit (ICU) patients often develop new health-related problems in their long-term recovery. Health care professionals keeping a diary of a patient's stay is a proven strategy to tackle this but faces several adoption barriers, such as lack of time and difficulty in knowing what to write. Large language models (LLMs), with their ability to generate human-like text and adaptability, could solve these challenges. However, realizing this vision involves addressing several socio-technical and practical research challenges. This paper discusses these challenges and proposes future research directions to utilize the potential of LLMs in ICU diary writing, ultimately improving the long-term recovery outcomes for ICU patients.
Submitted 23 February, 2024;
originally announced February 2024.
-
Conversational Assistants in Knowledge-Intensive Contexts: An Evaluation of LLM- versus Intent-based Systems
Authors:
Samuel Kernan Freire,
Chaofan Wang,
Evangelos Niforatos
Abstract:
Conversational Assistants (CAs) are increasingly supporting human workers in knowledge management. Traditionally, CAs respond in specific ways to predefined user intents and conversation patterns. However, this rigidity does not handle the diversity of natural language well. Recent advances in natural language processing, namely Large Language Models (LLMs), enable CAs to converse in a more flexible, human-like manner, extracting relevant information from texts and capturing information from expert humans, but they also introduce new challenges such as "hallucinations". To assess the potential of using LLMs for knowledge management tasks, we conducted a user study comparing an LLM-based CA to an intent-based system regarding interaction efficiency, user experience, workload, and usability. This revealed that LLM-based CAs exhibited better user experience, task completion rate, usability, and perceived performance than intent-based systems, suggesting that switching NLP techniques can be beneficial in the context of knowledge management.
Submitted 12 July, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Knowledge Sharing in Manufacturing using Large Language Models: User Evaluation and Model Benchmarking
Authors:
Samuel Kernan Freire,
Chaofan Wang,
Mina Foosherian,
Stefan Wellsandt,
Santiago Ruiz-Arenas,
Evangelos Niforatos
Abstract:
Recent advances in natural language processing enable more intelligent ways to support knowledge sharing in factories. In manufacturing, operating production lines has become increasingly knowledge-intensive, putting strain on a factory's capacity to train and support new operators. This paper introduces a Large Language Model (LLM)-based system designed to retrieve information from the extensive knowledge contained in factory documentation and knowledge shared by expert operators. The system aims to efficiently answer queries from operators and facilitate the sharing of new knowledge. We conducted a user study at a factory to assess its potential impact and adoption, eliciting several perceived benefits, namely, enabling quicker information retrieval and more efficient resolution of issues. However, the study also highlighted a preference for learning from a human expert when such an option is available. Furthermore, we benchmarked several commercial and open-source LLMs for this system. The current state-of-the-art model, GPT-4, consistently outperformed its counterparts, with open-source models trailing closely, presenting an attractive option given their data privacy and customization benefits. In summary, this work offers preliminary insights and a system design for factories considering using LLM tools for knowledge management.
Submitted 26 February, 2024; v1 submitted 10 January, 2024;
originally announced January 2024.
-
Safeguarding Crowdsourcing Surveys from ChatGPT with Prompt Injection
Authors:
Chaofan Wang,
Samuel Kernan Freire,
Mo Zhang,
Jing Wei,
Jorge Goncalves,
Vassilis Kostakos,
Zhanna Sarsenbayeva,
Christina Schneegass,
Alessandro Bozzon,
Evangelos Niforatos
Abstract:
ChatGPT and other large language models (LLMs) have proven useful in crowdsourcing tasks, where they can effectively annotate machine learning training data. However, this means that they also have the potential for misuse, specifically to automatically answer surveys. LLMs can potentially circumvent quality assurance measures, thereby threatening the integrity of methodologies that rely on crowdsourcing surveys. In this paper, we propose a mechanism to detect LLM-generated responses to surveys. The mechanism uses "prompt injection", such as directions that can mislead LLMs into giving predictable responses. We evaluate our technique against a range of question scenarios, types, and positions, and find that it can reliably detect LLM-generated responses with more than 93% effectiveness. We also provide open-source software to help survey designers use our technique to detect LLM responses. Our work is a step in ensuring that survey methodologies remain rigorous vis-a-vis LLMs.
Submitted 14 June, 2023;
originally announced June 2023.