arXiv:2505.20311 [pdf, ps, other]

The EU AI Act, Stakeholder Needs, and Explainable AI: Aligning Regulatory Compliance in a Clinical Decision Support System

Authors: Anton Hummel, Håkan Burden, Susanne Stenberg, Jan-Philipp Steghöfer, Niklas Kühl

Abstract: Explainable AI (XAI) is a promising solution to ensure compliance with the EU AI Act, the first multi-national regulation for AI. XAI aims to enhance transparency and human oversight of AI systems, particularly "black-box models", which are criticized as incomprehensible. However, the discourse around the main stakeholders in the AI Act and XAI appears disconnected. While XAI prioritizes the end user's needs as the primary goal, the AI Act focuses on the obligations of the provider and deployer of the AI system. We aim to bridge this divide and provide guidance on how these two worlds are related. By fostering an interdisciplinary discussion in a cross-functional team with XAI, AI Act, legal, and requirements engineering experts, we walk through the steps necessary to analyze an AI-based clinical decision support system to clarify the end-user needs and assess AI Act applicability. By analyzing our justified understanding using an AI system under development as a case, we show that XAI techniques can fill a gap between stakeholder needs and the requirements of the AI Act. We look at the similarities and contrasts between the legal requirements and the needs of stakeholders. In doing so, we encourage researchers and practitioners from the XAI community to reflect on their role towards the AI Act by achieving a mutual understanding of the implications of XAI and the AI Act within different disciplines.

Submitted 22 May, 2025; originally announced May 2025.

Comments: 17 pages, 2 figures

arXiv:2505.10661 [pdf]

doi 10.1145/3715275.3732084

It's only fair when I think it's fair: How Gender Bias Alignment Undermines Distributive Fairness in Human-AI Collaboration

Authors: Domenique Zipperling, Luca Deck, Julia Lanzl, Niklas Kühl

Abstract: Human-AI collaboration is increasingly relevant in consequential areas where AI recommendations support human discretion. However, human-AI teams' effectiveness, capability, and fairness highly depend on human perceptions of AI. Positive fairness perceptions have been shown to foster trust and acceptance of AI recommendations. Yet, work on confirmation bias highlights that humans selectively adhere to AI recommendations that align with their expectations and beliefs -- despite not being necessarily correct or fair. This raises the question whether confirmation bias also transfers to the alignment of gender bias between human and AI decisions. In our study, we examine how gender bias alignment influences fairness perceptions and reliance. The results of a 2x2 between-subject study highlight the connection between gender bias alignment, fairness perceptions, and reliance, demonstrating that merely constructing a "formally fair" AI system is insufficient for optimal human-AI collaboration; ultimately, AI recommendations will likely be overridden if biases do not align.

Submitted 4 June, 2025; v1 submitted 15 May, 2025; originally announced May 2025.

Journal ref: ACM Conference on Fairness, Accountability, and Transparency 2025 (ACM FAccT 2025)

arXiv:2504.16056 [pdf, other]

Honey, I Shrunk the Language Model: Impact of Knowledge Distillation Methods on Performance and Explainability

Authors: Daniel Hendriks, Philipp Spitzer, Niklas Kühl, Gerhard Satzger

Abstract: Artificial Intelligence (AI) has increasingly influenced modern society, recently in particular through significant advancements in Large Language Models (LLMs). However, high computational and storage demands of LLMs still limit their deployment in resource-constrained environments. Knowledge distillation addresses this challenge by training a small student model from a larger teacher model. Previous research has introduced several distillation methods for both generating training data and for training the student model. Despite their relevance, the effects of state-of-the-art distillation methods on model performance and explainability have not been thoroughly investigated and compared. In this work, we enlarge the set of available methods by applying critique-revision prompting to distillation for data generation and by synthesizing existing methods for training. For these methods, we provide a systematic comparison based on the widely used Commonsense Question-Answering (CQA) dataset. While we measure performance via student model accuracy, we employ a human-grounded study to evaluate explainability. We contribute new distillation methods and their comparison in terms of both performance and explainability. This should further advance the distillation of small language models and, thus, contribute to broader applicability and faster diffusion of LLM technology.

Submitted 22 April, 2025; originally announced April 2025.

arXiv:2504.06791 [pdf, other]

Beware of "Explanations" of AI

Authors: David Martens, Galit Shmueli, Theodoros Evgeniou, Kevin Bauer, Christian Janiesch, Stefan Feuerriegel, Sebastian Gabel, Sofie Goethals, Travis Greene, Nadja Klein, Mathias Kraus, Niklas Kühl, Claudia Perlich, Wouter Verbeke, Alona Zharova, Patrick Zschech, Foster Provost

Abstract: Understanding the decisions made and actions taken by increasingly complex AI systems remains a key challenge. This has led to an expanding field of research in explainable artificial intelligence (XAI), highlighting the potential of explanations to enhance trust, support adoption, and meet regulatory standards. However, the question of what constitutes a "good" explanation is dependent on the goals, stakeholders, and context. At a high level, psychological insights such as the concept of mental model alignment can offer guidance, but success in practice is challenging due to social and technical factors. As a result of this ill-defined nature of the problem, explanations can be of poor quality (e.g. unfaithful, irrelevant, or incoherent), potentially leading to substantial risks. Instead of fostering trust and safety, poorly designed explanations can actually cause harm, including wrong decisions, privacy violations, manipulation, and even reduced AI adoption. Therefore, we caution stakeholders to beware of explanations of AI: while they can be vital, they are not automatically a remedy for transparency or responsible AI adoption, and their misuse or limitations can exacerbate harm. Attention to these caveats can help guide future research to improve the quality and impact of AI explanations.

Submitted 9 April, 2025; originally announced April 2025.

Comments: This work was inspired by Dagstuhl Seminar 24342

arXiv:2503.18629 [pdf, other]

Towards Human-Understandable Multi-Dimensional Concept Discovery

Authors: Arne Grobrügge, Niklas Kühl, Gerhard Satzger, Philipp Spitzer

Abstract: Concept-based eXplainable AI (C-XAI) aims to overcome the limitations of traditional saliency maps by converting pixels into human-understandable concepts that are consistent across an entire dataset. A crucial aspect of C-XAI is completeness, which measures how well a set of concepts explains a model's decisions. Among C-XAI methods, Multi-Dimensional Concept Discovery (MCD) effectively improves completeness by breaking down the CNN latent space into distinct and interpretable concept subspaces. However, MCD's explanations can be difficult for humans to understand, raising concerns about their practical utility. To address this, we propose Human-Understandable Multi-dimensional Concept Discovery (HU-MCD). HU-MCD uses the Segment Anything Model for concept identification and implements a CNN-specific input masking technique to reduce noise introduced by traditional masking methods. These changes to MCD, paired with the completeness relation, enable HU-MCD to enhance concept understandability while maintaining explanation faithfulness. Our experiments, including human subject studies, show that HU-MCD provides more precise and reliable explanations than existing C-XAI methods. The code is available at https://github.com/grobruegge/hu-mcd.

Submitted 24 March, 2025; originally announced March 2025.

arXiv:2502.16280 [pdf, other]

Human Preferences in Large Language Model Latent Space: A Technical Analysis on the Reliability of Synthetic Data in Voting Outcome Prediction

Authors: Sarah Ball, Simeon Allmendinger, Frauke Kreuter, Niklas Kühl

Abstract: Generative AI (GenAI) is increasingly used in survey contexts to simulate human preferences. While many research endeavors evaluate the quality of synthetic GenAI data by comparing model-generated responses to gold-standard survey results, fundamental questions about the validity and reliability of using LLMs as substitutes for human respondents remain. Our study provides a technical analysis of how demographic attributes and prompt variations influence latent opinion mappings in large language models (LLMs) and evaluates their suitability for survey-based predictions. Using 14 different models, we find that LLM-generated data fails to replicate the variance observed in real-world human responses, particularly across demographic subgroups. In the political space, persona-to-party mappings exhibit limited differentiation, resulting in synthetic data that lacks the nuanced distribution of opinions found in survey data. Moreover, we show that prompt sensitivity can significantly alter outputs for some models, further undermining the stability and predictiveness of LLM-based simulations. As a key contribution, we adapt a probe-based methodology that reveals how LLMs encode political affiliations in their latent space, exposing the systematic distortions introduced by these models. Our findings highlight critical limitations in AI-generated survey data, urging caution in its use for public opinion research, social science experimentation, and computational behavioral modeling.

Submitted 22 February, 2025; originally announced February 2025.

arXiv:2501.04528 [pdf, other]

Towards a Problem-Oriented Domain Adaptation Framework for Machine Learning

Authors: Philipp Spitzer, Dominik Martin, Laurin Eichberger, Niklas Kühl

Abstract: Domain adaptation is a sub-field of machine learning that involves transferring knowledge from a source domain to perform the same task in the target domain. It is a typical challenge in machine learning that arises, e.g., when data is obtained from various sources or when using a data basis that changes over time. Recent advances in the field offer promising methods, but it is still challenging for researchers and practitioners to determine if domain adaptation is suitable for a given problem -- and, subsequently, to select the appropriate approach. This article employs design science research to develop a problem-oriented framework for domain adaptation, which is matured in three evaluation episodes. We describe a framework that distinguishes between five domain adaptation scenarios, provides recommendations for addressing each scenario, and offers guidelines for determining if a problem falls into one of these scenarios. During the multiple evaluation episodes, the framework is tested on artificial and real-world datasets and an experimental study involving 100 participants. The evaluation demonstrates that the framework has the explanatory power to capture any domain adaptation problem effectively. In summary, we provide clear guidance for researchers and practitioners who want to employ domain adaptation but lack in-depth knowledge of the possibilities.

Submitted 8 January, 2025; originally announced January 2025.

arXiv:2409.12809 [pdf, other]

Don't be Fooled: The Misinformation Effect of Explanations in Human-AI Collaboration

Authors: Philipp Spitzer, Joshua Holstein, Katelyn Morrison, Kenneth Holstein, Gerhard Satzger, Niklas Kühl

Abstract: Across various applications, humans increasingly use black-box artificial intelligence (AI) systems without insight into these systems' reasoning. To counter this opacity, explainable AI (XAI) methods promise enhanced transparency and interpretability. While recent studies have explored how XAI affects human-AI collaboration, few have examined the potential pitfalls caused by incorrect explanations. The implications for humans can be far-reaching but have not been explored extensively. To investigate this, we ran a study (n=160) on AI-assisted decision-making in which humans were supported by XAI. Our findings reveal a misinformation effect when incorrect explanations accompany correct AI advice with implications post-collaboration. This effect causes humans to infer flawed reasoning strategies, hindering task execution and demonstrating impaired procedural knowledge. Additionally, incorrect explanations compromise human-AI team-performance during collaboration. With our work, we contribute to HCI by providing empirical evidence for the negative consequences of incorrect explanations on humans post-collaboration and outlining guidelines for designers of AI.

Submitted 8 January, 2025; v1 submitted 19 September, 2024; originally announced September 2024.

arXiv:2409.08636 [pdf, other]

Utilizing Data Fingerprints for Privacy-Preserving Algorithm Selection in Time Series Classification: Performance and Uncertainty Estimation on Unseen Datasets

Authors: Lars Böcking, Leopold Müller, Niklas Kühl

Abstract: The selection of algorithms is a crucial step in designing AI services for real-world time series classification use cases. Traditional methods such as neural architecture search, automated machine learning, combined algorithm selection, and hyperparameter optimizations are effective but require considerable computational resources and necessitate access to all data points to run their optimizations. In this work, we introduce a novel data fingerprint that describes any time series classification dataset in a privacy-preserving manner and provides insight into the algorithm selection problem without requiring training on the (unseen) dataset. By decomposing the multi-target regression problem, only our data fingerprints are used to estimate algorithm performance and uncertainty in a scalable and adaptable manner. Our approach is evaluated on the 112 University of California, Riverside benchmark datasets, demonstrating its effectiveness in predicting the performance of 35 state-of-the-art algorithms and providing valuable insights for effective algorithm selection in time series classification service systems, improving a naive baseline by 7.32% on average in estimating the mean performance and 15.81% in estimating the uncertainty.

Submitted 30 September, 2024; v1 submitted 13 September, 2024; originally announced September 2024.

Comments: Hawaii International Conference on System Sciences (HICSS-58) 2025

arXiv:2408.08666 [pdf]

A Multivocal Literature Review on Privacy and Fairness in Federated Learning

Authors: Beatrice Balbierer, Lukas Heinlein, Domenique Zipperling, Niklas Kühl

Abstract: Federated Learning presents a way to revolutionize AI applications by eliminating the necessity for data sharing. Yet, research has shown that information can still be extracted during training, making additional privacy-preserving measures such as differential privacy imperative. To implement real-world federated learning applications, fairness, ranging from a fair distribution of performance to non-discriminative behaviour, must be considered. Particularly in high-risk applications (e.g. healthcare), avoiding the repetition of past discriminatory errors is paramount. As recent research has demonstrated an inherent tension between privacy and fairness, we conduct a multivocal literature review to examine the current methods to integrate privacy and fairness in federated learning. Our analyses illustrate that the relationship between privacy and fairness has been neglected, posing a critical risk for real-world applications. We highlight the need to explore the relationship between privacy, fairness, and performance, advocating for the creation of integrated federated learning frameworks.

Submitted 27 October, 2024; v1 submitted 16 August, 2024; originally announced August 2024.

Comments: Proceedings of the 19th International Conference on Wirtschaftsinformatik (WI), 2024

arXiv:2408.03948 [pdf, other]

A Survey of AI Reliance

Authors: Sven Eckhardt, Niklas Kühl, Mateusz Dolata, Gerhard Schwabe

Abstract: Artificial intelligence (AI) systems have become an indispensable component of modern technology. However, research on human behavioral responses is lagging behind, i.e., the research into human reliance on AI advice (AI reliance). Current shortcomings in the literature include the unclear influences on AI reliance, lack of external validity, conflicting approaches to measuring reliance, and disregard for a change in reliance over time. Promising avenues for future research include reliance on generative AI output and reliance in multi-user situations. In conclusion, we present a morphological box that serves as a guide for research on AI reliance.

Submitted 22 July, 2024; originally announced August 2024.

arXiv:2406.14429 [pdf, other]

CollaFuse: Collaborative Diffusion Models

Authors: Simeon Allmendinger, Domenique Zipperling, Lukas Struppek, Niklas Kühl

Abstract: In the landscape of generative artificial intelligence, diffusion-based models have emerged as a promising method for generating synthetic images. However, the application of diffusion models poses numerous challenges, particularly concerning data availability, computational requirements, and privacy. Traditional approaches to address these shortcomings, like federated learning, often impose significant computational burdens on individual clients, especially those with constrained resources. In response to these challenges, we introduce a novel approach for distributed collaborative diffusion models inspired by split learning. Our approach facilitates collaborative training of diffusion models while alleviating client computational burdens during image synthesis. This reduced computational burden is achieved by retaining data and computationally inexpensive processes locally at each client while outsourcing the computationally expensive processes to shared, more efficient server resources. Through experiments on the common CelebA dataset, our approach demonstrates enhanced privacy by reducing the necessity for sharing raw data. These capabilities hold significant potential across various application areas, including the design of edge computing solutions. Thus, our work advances distributed machine learning by contributing to the evolution of collaborative diffusion models.

Submitted 27 October, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

Comments: 13 pages, 7 figures

arXiv:2406.12660 [pdf]

Investigating the Role of Explainability and AI Literacy in User Compliance

Authors: Niklas Kühl, Christian Meske, Maximilian Nitsche, Jodie Lobana

Abstract: AI is becoming increasingly common across different domains. However, as sophisticated AI-based systems are often black-boxed, rendering the decision-making logic opaque, users find it challenging to comply with their recommendations. Although researchers are investigating Explainable AI (XAI) to increase the transparency of the underlying machine learning models, it is unclear what types of explanations are effective and what other factors increase compliance. To better understand the interplay of these factors, we conducted an experiment with 562 participants who were presented with the recommendations of an AI and two different types of XAI. We find that users' compliance increases with the introduction of XAI but is also affected by AI literacy. We also find that the relationships between AI literacy, XAI, and users' compliance are mediated by the users' mental model of AI. Our study has several implications for successfully designing AI-based systems utilizing XAI.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.06537 [pdf, other]

Interactive Generation of Laparoscopic Videos with Diffusion Models

Authors: Ivan Iliash, Simeon Allmendinger, Felix Meissen, Niklas Kühl, Daniel Rückert

Abstract: Generative AI in general, and synthetic visual data generation in particular, hold much promise for benefiting surgical training by providing photorealism to simulation environments. Current training methods primarily rely on reading materials and observing live surgeries, which can be time-consuming and impractical. In this work, we take a significant step towards improving the training process. Specifically, we use diffusion models in combination with a zero-shot video diffusion method to interactively generate realistic laparoscopic images and videos by specifying a surgical action through text and guiding the generation with tool positions through segmentation masks. We demonstrate the performance of our approach using the publicly available Cholec dataset family and evaluate the fidelity and factual correctness of our generated images using a surgical action recognition model as well as the pixel-wise F1-score for the spatial control of tool generation. We achieve an FID of 38.097 and an F1-score of 0.71.

Submitted 23 April, 2024; originally announced June 2024.

Comments: 7 pages, 4 figures

arXiv:2406.01329 [pdf, other]

Transferring Domain Knowledge with (X)AI-Based Learning Systems

Authors: Philipp Spitzer, Niklas Kühl, Marc Goutier, Manuel Kaschura, Gerhard Satzger

Abstract: In numerous high-stakes domains, training novices via conventional learning systems does not suffice. To impart tacit knowledge, experts' hands-on guidance is imperative. However, training novices by experts is costly and time-consuming, increasing the need for alternatives. Explainable artificial intelligence (XAI) has conventionally been used to make black-box artificial intelligence systems interpretable. In this work, we utilize XAI as an alternative: An (X)AI system is trained on experts' past decisions and is then employed to teach novices by providing examples coupled with explanations. In a study with 249 participants, we measure the effectiveness of such an approach for a classification task. We show that (X)AI-based learning systems are able to induce learning in novices and that their cognitive styles moderate learning. Thus, we take the first steps to reveal the impact of XAI on human learning and point AI developers to future options to tailor the design of (X)AI-based learning systems.

Submitted 3 June, 2024; originally announced June 2024.

Comments: Thirty-Second European Conference on Information Systems (ECIS 2024), Paphos, Cyprus

arXiv:2405.07658 [pdf]

Understanding Data Understanding: A Framework to Navigate the Intricacies of Data Analytics

Authors: Joshua Holstein, Philipp Spitzer, Marieke Hoell, Michael Vössing, Niklas Kühl

Abstract: As organizations face the challenges of processing exponentially growing data volumes, their reliance on analytics to unlock value from this data has intensified. However, the intricacies of big data, such as its extensive feature sets, pose significant challenges. A crucial step in leveraging this data for insightful analysis is an in-depth understanding of both the data and its domain. Yet, existing literature presents a fragmented picture of what comprises an effective understanding of data and domain, varying significantly in depth and focus. To address this research gap, we conduct a systematic literature review, aiming to delineate the dimensions of data understanding. We identify five dimensions: Foundations, Collection & Selection, Contextualization & Integration, Exploration & Discovery, and Insights. These dimensions collectively form a comprehensive framework for data understanding, providing guidance for organizations seeking meaningful insights from complex datasets. This study synthesizes the current state of knowledge and lays the groundwork for further exploration.

Submitted 13 May, 2024; originally announced May 2024.

Comments: Accepted at 32nd European Conference on Information Systems (2024)

arXiv:2404.18736 [pdf, other]

Mapping the Potential of Explainable AI for Fairness Along the AI Lifecycle

Authors: Luca Deck, Astrid Schomäcker, Timo Speith, Jakob Schöffer, Lena Kästner, Niklas Kühl

Abstract: The widespread use of artificial intelligence (AI) systems across various domains is increasingly surfacing issues related to algorithmic fairness, especially in high-stakes scenarios. Thus, critical considerations of how fairness in AI systems might be improved -- and what measures are available to aid this process -- are overdue. Many researchers and policymakers see explainable AI (XAI) as a promising way to increase fairness in AI systems. However, there is a wide variety of XAI methods and fairness conceptions expressing different desiderata, and the precise connections between XAI and fairness remain largely nebulous. Besides, different measures to increase algorithmic fairness might be applicable at different points throughout an AI system's lifecycle. Yet, there currently is no coherent mapping of fairness desiderata along the AI lifecycle. In this paper, we distill eight fairness desiderata, map them along the AI lifecycle, and discuss how XAI could help address each of them. We hope to provide orientation for practical applications and to inspire XAI research specifically focused on these fairness desiderata.

Submitted 27 June, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.00029 [pdf, other]

Complementarity in Human-AI Collaboration: Concept, Sources, and Evidence

Authors: Patrick Hemmer, Max Schemmer, Niklas Kühl, Michael Vössing, Gerhard Satzger

Abstract: Artificial intelligence (AI) has the potential to significantly enhance human performance across various domains. Ideally, collaboration between humans and AI should result in complementary team performance (CTP) -- a level of performance that neither of them can attain individually. So far, however, CTP has rarely been observed, suggesting an insufficient understanding of the principle and the application of complementarity. Therefore, we develop a general concept of complementarity and formalize its theoretical potential as well as the actual realized effect in decision-making situations. Moreover, we identify information and capability asymmetry as the two key sources of complementarity. Finally, we illustrate the impact of each source on complementarity potential and effect in two empirical studies. Our work provides researchers with a comprehensive theoretical foundation of human-AI complementarity in decision-making and demonstrates that leveraging these sources constitutes a viable pathway towards designing effective human-AI collaboration, i.e., the realization of CTP.

Submitted 25 November, 2024; v1 submitted 21 March, 2024; originally announced April 2024.

arXiv:2403.20089 [pdf, other]

Implications of the AI Act for Non-Discrimination Law and Algorithmic Fairness

Authors: Luca Deck, Jan-Laurin Müller, Conradin Braun, Domenique Zipperling, Niklas Kühl

Abstract: The topic of fairness in AI, as debated in the FATE (Fairness, Accountability, Transparency, and Ethics in AI) communities, has sparked meaningful discussions in the past years. However, from a legal perspective, particularly from the perspective of European Union law, many open questions remain. Whereas algorithmic fairness aims to mitigate structural inequalities at design-level, European non-discrimination law is tailored to individual cases of discrimination after an AI model has been deployed. The AI Act might present a tremendous step towards bridging these two approaches by shifting non-discrimination responsibilities into the design stage of AI models. Based on an integrative reading of the AI Act, we comment on legal as well as technical enforcement problems and propose practical implications on bias detection and bias correction in order to specify and comply with specific technical requirements.

Submitted 26 June, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

arXiv:2402.19105 [pdf, other]

CollaFuse: Navigating Limited Resources and Privacy in Collaborative Generative AI

Authors: Domenique Zipperling, Simeon Allmendinger, Lukas Struppek, Niklas Kühl

Abstract: In the landscape of generative artificial intelligence, diffusion-based models present challenges for socio-technical systems in data requirements and privacy. Traditional approaches like federated learning distribute the learning process but strain individual clients, especially with constrained resources (e.g., edge devices). In response to these challenges, we introduce CollaFuse, a novel framework inspired by split learning. Tailored for efficient and collaborative use of denoising diffusion probabilistic models, CollaFuse enables shared server training and inference, alleviating client computational burdens. This is achieved by retaining data and computationally inexpensive GPU processes locally at each client while outsourcing the computationally expensive processes to the shared server. Demonstrated in a healthcare context, CollaFuse enhances privacy by highly reducing the need for sensitive information sharing. These capabilities hold the potential to impact various application areas, such as the design of edge computing solutions, healthcare research, or autonomous driving. In essence, our work advances distributed machine learning, shaping the future of collaborative GenAI networks.

Submitted 16 August, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: Thirty-Second European Conference on Information Systems (ECIS 2024)

arXiv:2401.04729 [pdf, other]

Human Delegation Behavior in Human-AI Collaboration: The Effect of Contextual Information

Authors: Philipp Spitzer, Joshua Holstein, Patrick Hemmer, Michael Vössing, Niklas Kühl, Dominik Martin, Gerhard Satzger

Abstract: The integration of artificial intelligence (AI) into human decision-making processes at the workplace presents both opportunities and challenges. One promising approach to leverage existing complementary capabilities is allowing humans to delegate individual instances of decision tasks to AI. However, enabling humans to delegate instances effectively requires them to assess several factors. One key factor is the analysis of both their own capabilities and those of the AI in the context of the given task. In this work, we conduct a behavioral study to explore the effects of providing contextual information to support this delegation decision. Specifically, we investigate how contextual information about the AI and the task domain influences humans' delegation decisions to an AI and their impact on the human-AI team performance. Our findings reveal that access to contextual information significantly improves human-AI team performance in delegation settings. Finally, we show that the delegation behavior changes with the different types of contextual information. Overall, this research advances the understanding of computer-supported, collaborative work and provides actionable insights for designing more effective collaborative systems.

Submitted 9 January, 2025; v1 submitted 9 January, 2024; originally announced January 2024.

arXiv:2312.03043 [pdf, other]

Navigating the Synthetic Realm: Harnessing Diffusion-based Models for Laparoscopic Text-to-Image Generation

Authors: Simeon Allmendinger, Patrick Hemmer, Moritz Queisner, Igor Sauer, Leopold Müller, Johannes Jakubik, Michael Vössing, Niklas Kühl

Abstract: Recent advances in synthetic imaging open up opportunities for obtaining additional data in the field of surgical imaging. This data can provide reliable supplements supporting surgical applications and decision-making through computer vision. Particularly the field of image-guided surgery, such as laparoscopic and robotic-assisted surgery, benefits strongly from synthetic image datasets and virtual surgical training methods. Our study presents an intuitive approach for generating synthetic laparoscopic images from short text prompts using diffusion-based generative models. We demonstrate the usage of state-of-the-art text-to-image architectures in the context of laparoscopic imaging with regard to the surgical removal of the gallbladder as an example. Results on fidelity and diversity demonstrate that diffusion-based models can acquire knowledge about the style and semantics in the field of image-guided surgery. A validation study with a human assessment survey underlines the realistic nature of our synthetic data, as medical personnel detects actual images in a pool with generated images causing a false-positive rate of 66%. In addition, the investigation of a state-of-the-art machine learning model to recognize surgical actions indicates enhanced results when trained with additional generated images of up to 5.20%. Overall, the achieved image quality contributes to the usage of computer-generated images in surgical applications and enhances its path to maturity.

Submitted 5 December, 2023; originally announced December 2023.

arXiv:2311.09744 [pdf, other]

Redefining the Laparoscopic Spatial Sense: AI-based Intra- and Postoperative Measurement from Stereoimages

Authors: Leopold Müller, Patrick Hemmer, Moritz Queisner, Igor Sauer, Simeon Allmendinger, Johannes Jakubik, Michael Vössing, Niklas Kühl

Abstract: A significant challenge in image-guided surgery is the accurate measurement task of relevant structures such as vessel segments, resection margins, or bowel lengths. While this task is an essential component of many surgeries, it involves substantial human effort and is prone to inaccuracies. In this paper, we develop a novel human-AI-based method for laparoscopic measurements utilizing stereo vision that has been guided by practicing surgeons. Based on a holistic qualitative requirements analysis, this work proposes a comprehensive measurement method, which comprises state-of-the-art machine learning architectures, such as RAFT-Stereo and YOLOv8. The developed method is assessed in various realistic experimental evaluation environments. Our results outline the potential of our method achieving high accuracies in distance measurements with errors below 1 mm. Furthermore, on-surface measurements demonstrate robustness when applied in challenging environments with textureless regions. Overall, by addressing the inherent challenges of image-guided surgery, we lay the foundation for a more robust and accurate solution for intra- and postoperative measurements, enabling more precise, safe, and efficient surgical procedures.

Submitted 16 November, 2023; originally announced November 2023.

Comments: 38th AAAI Conference on Artificial Intelligence (AAAI-24)

arXiv:2310.13007 [pdf, other]

doi 10.1145/3630106.3658990

A Critical Survey on Fairness Benefits of Explainable AI

Authors: Luca Deck, Jakob Schoeffer, Maria De-Arteaga, Niklas Kühl

Abstract: In this critical survey, we analyze typical claims on the relationship between explainable AI (XAI) and fairness to disentangle the multidimensional relationship between these two concepts. Based on a systematic literature review and a subsequent qualitative content analysis, we identify seven archetypal claims from 175 scientific articles on the alleged fairness benefits of XAI. We present crucial caveats with respect to these claims and provide an entry point for future discussions around the potentials and limitations of XAI for specific fairness desiderata. Importantly, we notice that claims are often (i) vague and simplistic, (ii) lacking normative grounding, or (iii) poorly aligned with the actual capabilities of XAI. We suggest to conceive XAI not as an ethical panacea but as one of many tools to approach the multidimensional, sociotechnical challenge of algorithmic fairness. Moreover, when making a claim about XAI and fairness, we emphasize the need to be more specific about what kind of XAI method is used, which fairness desideratum it refers to, how exactly it enables fairness, and who is the stakeholder that benefits from XAI.

Submitted 7 May, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

Comments: ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT '24)

arXiv:2310.02108 [pdf]

Towards Effective Human-AI Decision-Making: The Role of Human Learning in Appropriate Reliance on AI Advice

Authors: Max Schemmer, Andrea Bartos, Philipp Spitzer, Patrick Hemmer, Niklas Kühl, Jonas Liebschner, Gerhard Satzger

Abstract: The true potential of human-AI collaboration lies in exploiting the complementary capabilities of humans and AI to achieve a joint performance superior to that of the individual AI or human, i.e., to achieve complementary team performance (CTP). To realize this complementarity potential, humans need to exercise discretion in following AI's advice, i.e., appropriately relying on the AI's advice. While previous work has focused on building a mental model of the AI to assess AI recommendations, recent research has shown that the mental model alone cannot explain appropriate reliance. We hypothesize that, in addition to the mental model, human learning is a key mediator of appropriate reliance and, thus, CTP. In this study, we demonstrate the relationship between learning and appropriate reliance in an experiment with 100 participants. This work provides fundamental concepts for analyzing reliance and derives implications for the effective design of human-AI decision-making.

Submitted 3 October, 2023; originally announced October 2023.

Journal ref: International Conference on Information Systems (ICIS 2023)

arXiv:2309.01066 [pdf, other]

AB2CD: AI for Building Climate Damage Classification and Detection

Authors: Maximilian Nitsche, S. Karthik Mukkavilli, Niklas Kühl, Thomas Brunschwiler

Abstract: We explore the implementation of deep learning techniques for precise building damage assessment in the context of natural hazards, utilizing remote sensing data. The xBD dataset, comprising diverse disaster events from across the globe, serves as the primary focus, facilitating the evaluation of deep learning models. We tackle the challenges of generalization to novel disasters and regions while accounting for the influence of low-quality and noisy labels inherent in natural hazard data. Furthermore, our investigation quantitatively establishes that the minimum satellite imagery resolution essential for effective building damage detection is 3 meters and below 1 meter for classification using symmetric and asymmetric resolution perturbation analyses. To achieve robust and accurate evaluations of building damage detection and classification, we evaluated different deep learning models with residual, squeeze and excitation, and dual path network backbones, as well as ensemble techniques. Overall, the U-Net Siamese network ensemble with F-1 score of 0.812 performed the best against the xView2 challenge benchmark. Additionally, we evaluate a Universal model trained on all hazards against a flood expert model and investigate generalization gaps across events, and out of distribution from field data in the Ahr Valley. Our research findings showcase the potential and limitations of advanced AI solutions in enhancing the impact assessment of climate change-induced extreme weather events, such as floods and hurricanes. These insights have implications for disaster impact assessment in the face of escalating climate challenges.

Submitted 2 September, 2023; originally announced September 2023.

Comments: 9 pages, 4 figures

MSC Class: 68T07 (Primary); 68T45; 86A08; 74A45 (Secondary) ACM Class: I.2.10; I.4.8; I.4.6; I.5.4; I.2.6

arXiv:2307.13566 [pdf, other]

doi 10.1145/3641022

The Impact of Imperfect XAI on Human-AI Decision-Making

Authors: Katelyn Morrison, Philipp Spitzer, Violet Turri, Michelle Feng, Niklas Kühl, Adam Perer

Abstract: Explainability techniques are rapidly being developed to improve human-AI decision-making across various cooperative work settings. Consequently, previous research has evaluated how decision-makers collaborate with imperfect AI by investigating appropriate reliance and task performance with the aim of designing more human-centered computer-supported collaborative tools. Several human-centered explainable AI (XAI) techniques have been proposed in hopes of improving decision-makers' collaboration with AI; however, these techniques are grounded in findings from previous studies that primarily focus on the impact of incorrect AI advice. Few studies acknowledge the possibility of the explanations being incorrect even if the AI advice is correct. Thus, it is crucial to understand how imperfect XAI affects human-AI decision-making. In this work, we contribute a robust, mixed-methods user study with 136 participants to evaluate how incorrect explanations influence humans' decision-making behavior in a bird species identification task, taking into account their level of expertise and an explanation's level of assertiveness. Our findings reveal the influence of imperfect XAI and humans' level of expertise on their reliance on AI and human-AI team performance. We also discuss how explanations can deceive decision-makers during human-AI collaboration. Hence, we shed light on the impacts of imperfect XAI in the field of computer-supported cooperative work and provide guidelines for designers of human-AI collaboration systems.

Submitted 8 May, 2024; v1 submitted 25 July, 2023; originally announced July 2023.

Comments: Accepted to ACM CSCW 2024. 27 pages, 9 figures, 1 table, additional figures/table in the appendix

arXiv:2305.07681 [pdf, other]

ML-Based Teaching Systems: A Conceptual Framework

Authors: Philipp Spitzer, Niklas Kühl, Daniel Heinz, Gerhard Satzger

Abstract: As the shortage of skilled workers continues to be a pressing issue, exacerbated by demographic change, it is becoming a critical challenge for organizations to preserve the knowledge of retiring experts and to pass it on to novices. While this knowledge transfer has traditionally taken place through personal interaction, it lacks scalability and requires significant resources and time. IT-based teaching systems have addressed this scalability issue, but their development is still tedious and time-consuming. In this work, we investigate the potential of machine learning (ML) models to facilitate knowledge transfer in an organizational context, leading to more cost-effective IT-based teaching systems. Through a systematic literature review, we examine key concepts, themes, and dimensions to better understand and design ML-based teaching systems. To do so, we capture and consolidate the capabilities of ML models in IT-based teaching systems, inductively analyze relevant concepts in this context, and determine their interrelationships. We present our findings in the form of a review of the key concepts, themes, and dimensions to understand and inform on ML-based teaching systems. Building on these results, our work contributes to research on computer-supported cooperative work by conceptualizing how ML-based teaching systems can preserve expert knowledge and facilitate its transfer from SMEs to human novices. In this way, we shed light on this emerging subfield of human-computer interaction and serve to build an interdisciplinary research agenda.

Submitted 12 May, 2023; originally announced May 2023.

Comments: Forthcoming at The 26th ACM Conference On Computer-Supported Cooperative Work And Social Computing (CSCW 2023)

arXiv:2305.07399 [pdf]

Conceptualizing A Multi-Sided Platform For Cloud Computing Resource Trading

Authors: Franziska Haller, Max Schemmer, Niklas Kühl, Carsten Holtmann

Abstract: Cost-effective and responsible use of cloud computing resources (CCR) is on the business agenda of many companies. Despite this strategic goal, two geopolitical strategy decisions mainly influence the continuous existence of overcapacity: Europe's General Data Protection Regulation and the US's Cloud Act. Given the circumstances, a typical data center produces approximately 30% overcapacity annually. This overcapacity has severe environmental and economic consequences. Our work addresses this overcapacity by proposing a multi-sided platform for CCR trading. We initiate our research by conducting a literature review to explore the existing body of knowledge which indicates a lack of recent and evaluated platform design knowledge for CCR trading. We address this research gap by deriving design requirements and design principles. We instantiate and evaluate the design knowledge in a respective platform framework. Thus, we contribute to research and practice by deriving and evaluating design knowledge and proposing a platform framework.

Submitted 12 May, 2023; originally announced May 2023.

Comments: Accepted at ECIS 2023

arXiv:2304.09803 [pdf, other]

On the Perception of Difficulty: Differences between Humans and AI

Authors: Philipp Spitzer, Joshua Holstein, Michael Vössing, Niklas Kühl

Abstract: With the increased adoption of artificial intelligence (AI) in industry and society, effective human-AI interaction systems are becoming increasingly important. A central challenge in the interaction of humans with AI is the estimation of difficulty for human and AI agents for single task instances. These estimations are crucial to evaluate each agent's capabilities and, thus, required to facilitate effective collaboration. So far, research in the field of human-AI interaction estimates the perceived difficulty of humans and AI independently from each other. However, the effective interaction of human and AI agents depends on metrics that accurately reflect each agent's perceived difficulty in achieving valuable outcomes. Research to date has not yet adequately examined the differences in the perceived difficulty of humans and AI. Thus, this work reviews recent research on the perceived difficulty in human-AI interaction and contributing factors to consistently compare each agent's perceived difficulty, e.g., creating the same prerequisites. Furthermore, we present an experimental design to thoroughly examine the perceived difficulty of both agents and contribute to a better understanding of the design of such systems.

Submitted 19 April, 2023; originally announced April 2023.

arXiv:2304.08804 [pdf, other]

doi 10.1613/jair.1.15873

AI Reliance and Decision Quality: Fundamentals, Interdependence, and the Effects of Interventions

Authors: Jakob Schoeffer, Johannes Jakubik, Michael Voessing, Niklas Kuehl, Gerhard Satzger

Abstract: In AI-assisted decision-making, a central promise of having a human-in-the-loop is that they should be able to complement the AI system by overriding its wrong recommendations. In practice, however, we often see that humans cannot assess the correctness of AI recommendations and, as a result, adhere to wrong or override correct advice. Different ways of relying on AI recommendations have immediate, yet distinct, implications for decision quality. Unfortunately, reliance and decision quality are often inappropriately conflated in the current literature on AI-assisted decision-making. In this work, we disentangle and formalize the relationship between reliance and decision quality, and we characterize the conditions under which human-AI complementarity is achievable. To illustrate how reliance and decision quality relate to one another, we propose a visual framework and demonstrate its usefulness for interpreting empirical findings, including the effects of interventions like explanations. Overall, our research highlights the importance of distinguishing between reliance behavior and decision quality in AI-assisted decision-making.

Submitted 4 February, 2025; v1 submitted 18 April, 2023; originally announced April 2023.

Journal ref: Journal of Artificial Intelligence Research 82 (2025) 471-501

arXiv:2304.07306 [pdf, other]

Learning to Defer with Limited Expert Predictions

Authors: Patrick Hemmer, Lukas Thede, Michael Vössing, Johannes Jakubik, Niklas Kühl

Abstract: Recent research suggests that combining AI models with a human expert can exceed the performance of either alone. The combination of their capabilities is often realized by learning to defer algorithms that enable the AI to learn to decide whether to make a prediction for a particular instance or defer it to the human expert. However, to accurately learn which instances should be deferred to the human expert, a large number of expert predictions that accurately reflect the expert's capabilities are required -- in addition to the ground truth labels needed to train the AI. This requirement shared by many learning to defer algorithms hinders their adoption in scenarios where the responsible expert regularly changes or where acquiring a sufficient number of expert predictions is costly. In this paper, we propose a three-step approach to reduce the number of expert predictions required to train learning to defer algorithms. It encompasses (1) the training of an embedding model with ground truth labels to generate feature representations that serve as a basis for (2) the training of an expertise predictor model to approximate the expert's capabilities. (3) The expertise predictor generates artificial expert predictions for instances not yet labeled by the expert, which are required by the learning to defer algorithms. We evaluate our approach on two public datasets. One with "synthetically" generated human experts and another from the medical domain containing real-world radiologists' predictions. Our experiments show that the approach allows the training of various learning to defer algorithms with a minimal number of human expert predictions. Furthermore, we demonstrate that even a small number of expert predictions per class is sufficient for these algorithms to exceed the performance the AI and the human expert can achieve individually.

Submitted 14 April, 2023; originally announced April 2023.

Comments: 37th AAAI Conference on Artificial Intelligence (AAAI-23)

arXiv:2303.15834 [pdf, other]

Enabling Inter-organizational Analytics in Business Networks Through Meta Machine Learning

Authors: Robin Hirt, Niklas Kühl, Dominik Martin, Gerhard Satzger

Abstract: Successful analytics solutions that provide valuable insights often hinge on the connection of various data sources. While it is often feasible to generate larger data pools within organizations, the application of analytics within (inter-organizational) business networks is still severely constrained. As data is distributed across several legal units, potentially even across countries, the fear of disclosing sensitive information as well as the sheer volume of the data that would need to be exchanged are key inhibitors for the creation of effective system-wide solutions -- all while still reaching superior prediction performance. In this work, we propose a meta machine learning method that deals with these obstacles to enable comprehensive analyses within a business network. We follow a design science research approach and evaluate our method with respect to feasibility and performance in an industrial use case. First, we show that it is feasible to perform network-wide analyses that preserve data confidentiality as well as limit data transfer volume. Second, we demonstrate that our method outperforms a conventional isolated analysis and even gets close to a (hypothetical) scenario where all data could be shared within the network. Thus, we provide a fundamental contribution for making business networks more effective, as we remove a key obstacle to tap the huge potential of learning from data that is scattered throughout the network.

Submitted 28 March, 2023; originally announced March 2023.

Comments: Preprint, forthcoming at Information Technology and Management

arXiv:2303.13540 [pdf, other]

doi 10.1016/j.jclepro.2023.136748

Artificial Intelligence for Sustainability: Facilitating Sustainable Smart Product-Service Systems with Computer Vision

Authors: Jannis Walk, Niklas Kühl, Michael Saidani, Jürgen Schatte

Abstract: The usage and impact of deep learning for cleaner production and sustainability purposes remain little explored. This work shows how deep learning can be harnessed to increase sustainability in production and product usage. Specifically, we utilize deep learning-based computer vision to determine the wear states of products. The resulting insights serve as a basis for novel product-service systems with improved integration and result orientation. Moreover, these insights are expected to facilitate product usage improvements and R&D innovations. We demonstrate our approach on two products: machining tools and rotating X-ray anodes. From a technical standpoint, we show that it is possible to recognize the wear state of these products using deep-learning-based computer vision. In particular, we detect wear through microscopic images of the two products. We utilize a U-Net for semantic segmentation to detect wear based on pixel granularity. The resulting mean dice coefficients of 0.631 and 0.603 demonstrate the feasibility of the proposed approach. Consequently, experts can now make better decisions, for example, to improve the machining process parameters. To assess the impact of the proposed approach on environmental sustainability, we perform life cycle assessments that show gains for both products. The results indicate that the emissions of CO2 equivalents are reduced by 12% for machining tools and by 44% for rotating anodes.
This work can serve as a guideline and inspire researchers and practitioners to utilize computer vision in similar scenarios to develop sustainable smart product-service systems and enable cleaner production.
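
Editorial note: the abstract above reports mean Dice coefficients for binary wear segmentation. As a minimal sketch of how such a score is commonly computed for one pair of masks (the function name `dice_coefficient` and the toy masks are illustrative assumptions, not taken from the paper), the Dice coefficient is twice the overlap divided by the summed sizes of the two masks:

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice coefficient between two binary masks of equal shape.

    Hypothetical helper for illustration; the paper's own evaluation
    code is not shown in the abstract.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks are empty: treat as perfect agreement (a convention).
        return 1.0
    return float(2.0 * intersection / total)


if __name__ == "__main__":
    # Toy 2x2 masks: one overlapping pixel, three foreground pixels in total.
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(dice_coefficient(pred, target))
```

A mean Dice score such as the ones quoted would then be the average of this quantity over the evaluated images, assuming per-image averaging.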

Submitted 27 March, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

arXiv:2302.12100 [pdf, other]

doi 10.3390/aerospace10090751

Parameter-free shape optimization: various shape updates for engineering applications

Authors: Lars Radtke, Georgios Bletsos, Niklas Kühl, Tim Suchan, Thomas Rung, Alexander Düster, Kathrin Welker

Abstract: In the last decade, parameter-free approaches to shape optimization problems have matured to a state where they provide a versatile tool for complex engineering applications. However, sensitivity distributions obtained from shape derivatives in this context cannot be directly used as a shape update in gradient-based optimization strategies. Instead, an auxiliary problem has to be solved to obtain a gradient from the sensitivity. While several choices for these auxiliary problems were investigated mathematically, the complexity of the concepts behind their derivation has often prevented their application in engineering. This work aims at an explanation of several approaches to compute shape updates from an engineering perspective. We introduce the corresponding auxiliary problems in a formal way and compare the choices by means of numerical examples. To this end, a test case and exemplary applications from computational fluid dynamics are considered.

Submitted 23 February, 2023; originally announced February 2023.

Comments: 10 pages, 22 figures

arXiv:2302.09149 [pdf, other]

An Incremental Singular Value Decomposition Approach for Large-Scale Spatially Parallel & Distributed but Temporally Serial Data -- Applied to Technical Flows

Authors: Niklas Kühl, Hendrik Fischer, Michael Hinze, Thomas Rung

Abstract: The paper presents a strategy to construct an incremental Singular Value Decomposition (SVD) for time-evolving, spatially 3D discrete data sets. A low memory access procedure for reducing and deploying the snapshot data is presented. Considered examples refer to Computational Fluid Dynamic (CFD) results extracted from unsteady flow simulations, which are computed spatially parallel using domain decomposition strategies. The framework addresses state of the art PDE-solvers dedicated to practical applications. Although the approach is applied to technical flows, it is applicable in similar applications under the umbrella of Computational Science and Engineering (CSE). To this end, we introduce a bunch matrix that allows the aggregation of multiple time steps and SVD updates, and significantly increases the computational efficiency. The incremental SVD strategy is initially verified and validated by simulating the 2D laminar single-phase flow around a circular cylinder. Subsequent studies analyze the proposed strategy for a 2D submerged hydrofoil located in turbulent two-phase flows. Attention is directed to the accuracy of the SVD-based reconstruction based on local and global flow quantities, their physical realizability, the independence of the domain partitioning, and related implementation aspects. Moreover, the influence of lower and (adaptive) upper construction rank thresholds on both the effort and the accuracy are assessed.
The incremental SVD process is applied to analyze and compress the predicted flow field around a Kriso container ship in harmonic head waves at Fn = 0.26 and ReL = 1.4E+07. With a numerical overhead of O(10%), the snapshot matrix of size O(R10E+08 x 10E+04) computed on approximately 3000 processors can be incrementally compressed by O(95%). The storage reduction is accompanied by errors in integral force and local wave elevation quantities of O(1E-02%).
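
Editorial note: to make the idea of an incremental SVD concrete, the sketch below shows a generic, serial column-append update in the style of Brand's algorithm, where a block of new snapshot columns is folded into an existing truncated SVD without revisiting old data. This is an assumption-laden illustration and not necessarily the scheme used in the paper; the function name `isvd_append` and its parameters are hypothetical, and the parallel, domain-decomposed aspects described in the abstract are not covered.

```python
import numpy as np


def isvd_append(U, s, Vt, C, rank):
    """Update a thin SVD A ~ U @ diag(s) @ Vt after appending columns C.

    Assumes U has orthonormal columns and that the number of rows is at
    least the current rank plus the number of new columns.
    """
    r = s.size
    k = C.shape[1]
    n = Vt.shape[1]

    # Split the new columns into the part inside span(U) and the remainder.
    L = U.T @ C                    # (r, k) coefficients in the old basis
    H = C - U @ L                  # (m, k) component orthogonal to U
    Q, R = np.linalg.qr(H)         # orthonormal basis for the new directions

    # Small (r + k) x (r + k) core matrix whose SVD yields the update.
    K = np.block([[np.diag(s), L],
                  [np.zeros((k, r)), R]])
    Uk, sk, Vtk = np.linalg.svd(K)

    # Rotate the enlarged left and right bases by the core factors.
    U_new = np.hstack([U, Q]) @ Uk
    Vt_ext = np.block([[Vt, np.zeros((r, k))],
                       [np.zeros((k, n)), np.eye(k)]])
    Vt_new = Vtk @ Vt_ext

    # Truncate to the requested rank.
    return U_new[:, :rank], sk[:rank], Vt_new[:rank, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 8))     # toy snapshot matrix
    U, s, Vt = np.linalg.svd(A[:, :4], full_matrices=False)
    U, s, Vt = isvd_append(U, s, Vt, A[:, 4:], rank=8)
    print(np.allclose((U * s) @ Vt, A))
```

Appending several columns per update, as done here with a block `C`, amortizes the cost of the small core SVD over multiple snapshots; choosing a `rank` below the full rank trades reconstruction accuracy for storage.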

Submitted 17 February, 2023; originally announced February 2023.

arXiv:2302.03302 [pdf, other]

Towards Meaningful Anomaly Detection: The Effect of Counterfactual Explanations on the Investigation of Anomalies in Multivariate Time Series

Authors: Max Schemmer, Joshua Holstein, Niklas Bauer, Niklas Kühl, Gerhard Satzger

Abstract: Detecting rare events is essential in various fields, e.g., in cyber security or maintenance. Often, human experts are supported by anomaly detection systems as continuously monitoring the data is an error-prone and tedious task. However, among the anomalies detected may be events that are rare, e.g., a planned shutdown of a machine, but are not the actual event of interest, e.g., breakdowns of a machine. Therefore, human experts are needed to validate whether the detected anomalies are relevant. We propose to support this anomaly investigation by providing explanations of anomaly detection. Related work only focuses on the technical implementation of explainable anomaly detection and neglects the subsequent human anomaly investigation. To address this research gap, we conduct a behavioral experiment using records of taxi rides in New York City as a testbed. Participants are asked to differentiate extreme weather events from other anomalous events such as holidays or sporting events. Our results show that providing counterfactual explanations does improve the investigation of anomalies, indicating potential for explainable anomaly detection in general.

Submitted 7 February, 2023; originally announced February 2023.

arXiv:2302.02187 [pdf, other]

doi 10.1145/3581641.3584066

Appropriate Reliance on AI Advice: Conceptualization and the Effect of Explanations

Authors: Max Schemmer, Niklas Kühl, Carina Benz, Andrea Bartos, Gerhard Satzger

Abstract: AI advice is becoming increasingly popular, e.g., in investment and medical treatment decisions. As this advice is typically imperfect, decision-makers have to exert discretion as to whether to actually follow that advice: they have to "appropriately" rely on correct and turn down incorrect advice. However, current research on appropriate reliance still lacks a common definition as well as an operational measurement concept. Additionally, no in-depth behavioral experiments have been conducted that help understand the factors influencing this behavior. In this paper, we propose Appropriateness of Reliance (AoR) as an underlying, quantifiable two-dimensional measurement concept. We develop a research model that analyzes the effect of providing explanations for AI advice. In an experiment with 200 participants, we demonstrate how these explanations influence the AoR, and, thus, the effectiveness of AI advice. Our work contributes fundamental concepts for the analysis of reliance behavior and the purposeful design of AI advisors.

Submitted 13 April, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

Comments: arXiv admin note: text overlap with arXiv:2204.06916

Journal ref: ACM 28th International Conference on Intelligent User Interfaces (IUI), 2023

arXiv:2302.01713 [pdf, other]

Towards Avoiding the Data Mess: Industry Insights from Data Mesh Implementations

Authors: Jan Bode, Niklas Kühl, Dominik Kreuzberger, Sebastian Hirschl, Carsten Holtmann

Abstract: With the increasing importance of data and artificial intelligence, organizations strive to become more data-driven. However, current data architectures are not necessarily designed to keep up with the scale and scope of data and analytics use cases. In fact, existing architectures often fail to deliver the promised value associated with them. Data mesh is a socio-technical, decentralized, distributed concept for enterprise data management. As the concept of data mesh is still novel, it lacks empirical insights from the field. Specifically, an understanding of the motivational factors for introducing data mesh, the associated challenges, implementation strategies, its business impact, and potential archetypes is missing. To address this gap, we conduct 15 semi-structured interviews with industry experts. Our results show, among other insights, that organizations have difficulties with the transition toward federated governance associated with the data mesh concept, the shift of responsibility for the development, provision, and maintenance of data products, and the comprehension of the overall concept. In our work, we derive multiple implementation strategies and suggest organizations introduce a cross-domain steering unit, observe the data product usage, create quick wins in the early phases, and favor small dedicated teams that prioritize data products. While we acknowledge that organizations need to apply implementation strategies according to their individual needs, we also deduce two archetypes that provide suggestions in more detail.
Our findings synthesize insights from industry experts and provide researchers and professionals with preliminary guidelines for the successful adoption of data mesh.

Submitted 6 June, 2024; v1 submitted 3 February, 2023; originally announced February 2023.

arXiv:2301.09318 [pdf, other]

Toward Foundation Models for Earth Monitoring: Generalizable Deep Learning Models for Natural Hazard Segmentation

Authors: Johannes Jakubik, Michal Muszynski, Michael Vössing, Niklas Kühl, Thomas Brunschwiler

Abstract: Climate change results in an increased probability of extreme weather events that put societies and businesses at risk on a global scale. Therefore, near real-time mapping of natural hazards is an emerging priority for the support of natural disaster relief, risk management, and informing governmental policy decisions. Recent methods to achieve near real-time mapping increasingly leverage deep learning (DL). However, DL-based approaches are designed for one specific task in a single geographic region based on specific frequency bands of satellite data. Therefore, DL models used to map specific natural hazards struggle with their generalization to other types of natural hazards in unseen regions. In this work, we propose a methodology to significantly improve the generalizability of DL natural hazards mappers based on pre-training on a suitable pre-task. Without access to any data from the target domain, we demonstrate this improved generalizability across four U-Net architectures for the segmentation of unseen natural hazards. Importantly, our method is invariant to geographic differences and differences in the type of frequency bands of satellite data. By leveraging characteristics of unlabeled images from the target domain that are publicly available, our approach is able to further improve the generalization behavior without fine-tuning.
Thereby, our approach supports the development of foundation models for earth monitoring with the objective of directly segmenting unseen natural hazards across novel geographic regions given different sources of satellite imagery.

Submitted 1 June, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

Comments: Accepted at IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2023)

arXiv:2212.11854 [pdf, other]

Data-Centric Artificial Intelligence

Authors: Johannes Jakubik, Michael Vössing, Niklas Kühl, Jannis Walk, Gerhard Satzger

Abstract: Data-centric artificial intelligence (data-centric AI) represents an emerging paradigm emphasizing that the systematic design and engineering of data is essential for building effective and efficient AI-based systems. The objective of this article is to introduce practitioners and researchers from the field of Information Systems (IS) to data-centric AI. We define relevant terms, provide key characteristics to contrast the data-centric paradigm to the model-centric one, and introduce a framework for data-centric AI. We distinguish data-centric AI from related concepts and discuss its longer-term implications for the IS community.

Submitted 18 January, 2024; v1 submitted 22 December, 2022; originally announced December 2022.

Comments: Accepted for publication at Business & Information Systems Engineering

arXiv:2209.11812 [pdf, other]

doi 10.1145/3613904.3642621

Explanations, Fairness, and Appropriate Reliance in Human-AI Decision-Making

Authors: Jakob Schoeffer, Maria De-Arteaga, Niklas Kuehl

Abstract: In this work, we study the effects of feature-based explanations on distributive fairness of AI-assisted decisions, specifically focusing on the task of predicting occupations from short textual bios. We also investigate how any effects are mediated by humans' fairness perceptions and their reliance on AI recommendations. Our findings show that explanations influence fairness perceptions, which, in turn, relate to humans' tendency to adhere to AI recommendations. However, we see that such explanations do not enable humans to discern correct and incorrect AI recommendations. Instead, we show that they may affect reliance irrespective of the correctness of AI recommendations. Depending on which features an explanation highlights, this can foster or hinder distributive fairness: when explanations highlight features that are task-irrelevant and evidently associated with the sensitive attribute, this prompts overrides that counter AI recommendations that align with gender stereotypes. Meanwhile, if explanations appear task-relevant, this induces reliance behavior that reinforces stereotype-aligned errors. These results imply that feature-based explanations are not a reliable mechanism to improve distributive fairness.

Submitted 18 March, 2024; v1 submitted 23 September, 2022; originally announced September 2022.

Comments: ACM CHI Conference on Human Factors in Computing Systems (CHI '24)

arXiv:2209.11299 [pdf, other]

Deep Domain Adaptation for Detecting Bomb Craters in Aerial Images

Authors: Marco Geiger, Dominik Martin, Niklas Kühl

Abstract: The aftermath of air raids can still be seen for decades after the devastating events. Unexploded ordnance (UXO) is an immense danger to human life and the environment. Through the assessment of wartime images, experts can infer the occurrence of a dud. The current manual analysis process is expensive and time-consuming, thus automated detection of bomb craters by using deep learning is a promising way to improve the UXO disposal process. However, these methods require a large amount of manually labeled training data. This work leverages domain adaptation with moon surface images to address the problem of automated bomb crater detection with deep learning under the constraint of limited training data. This paper contributes to both academia and practice (1) by providing a solution approach for automated bomb crater detection with limited training data and (2) by demonstrating the usability and associated challenges of using synthetic images for domain adaptation.

Submitted 22 September, 2022; originally announced September 2022.

Comments: 56th Annual Hawaii International Conference on System Sciences (HICSS-56)

arXiv:2208.04181 [pdf, other]

An Empirical Evaluation of Predicted Outcomes as Explanations in Human-AI Decision-Making

Authors: Johannes Jakubik, Jakob Schöffer, Vincent Hoge, Michael Vössing, Niklas Kühl

Abstract: In this work, we empirically examine human-AI decision-making in the presence of explanations based on predicted outcomes. This type of explanation provides a human decision-maker with expected consequences for each decision alternative at inference time - where the predicted outcomes are typically measured in a problem-specific unit (e.g., profit in U.S. dollars). We conducted a pilot study in the context of peer-to-peer lending to assess the effects of providing predicted outcomes as explanations to lay study participants. Our preliminary findings suggest that people's reliance on AI recommendations increases compared to cases where no explanation or feature-based explanations are provided, especially when the AI recommendations are incorrect. This results in a hampered ability to distinguish correct from incorrect AI recommendations, which can ultimately affect decision quality in a negative way.

Submitted 30 August, 2022; v1 submitted 8 August, 2022; originally announced August 2022.

Comments: Accepted at ECML XKDD workshop

arXiv:2207.00497 [pdf, other]

Training Novices: The Role of Human-AI Collaboration and Knowledge Transfer

Authors: Philipp Spitzer, Niklas Kühl, Marc Goutier

Abstract: Across a multitude of work environments, expert knowledge is imperative for humans to conduct tasks with high performance and ensure business success. These humans possess task-specific expert knowledge (TSEK) and hence, represent subject matter experts (SMEs). However, not only demographic changes but also personnel downsizing strategies lead and will continue to lead to departures of SMEs within organizations, which constitutes the challenge of how to retain that expert knowledge and train novices to keep the competitive advantage elicited by that expert knowledge. SMEs training novices is time- and cost-intensive, which intensifies the need for alternatives. Human-AI collaboration (HAIC) poses a way out of this dilemma, facilitating alternatives to preserve expert knowledge and teach it to novices for tasks conducted by SMEs beforehand. In this workshop paper, we (1) propose a framework on how HAIC can be utilized to train novices on particular tasks, (2) illustrate the role of explicit and tacit knowledge in this training process via HAIC, and (3) outline a preliminary experiment design to assess the ability of AI systems in HAIC to act as a trainer to transfer TSEK to novices who do not possess prior TSEK.

Submitted 1 July, 2022; originally announced July 2022.

Comments: This is a workshop paper: Workshop on Human-Machine Collaboration and Teaming (HM-CaT 2022). The 39th International Conference on Machine Learning. Link to the workshop: https://icml.cc/Conferences/2022/Schedule?showEvent=13478

arXiv:2205.05758 [pdf, other]

doi 10.1145/3531146.3533218

"There Is Not Enough Information": On the Effects of Explanations on Perceptions of Informational Fairness and Trustworthiness in Automated Decision-Making

Authors: Jakob Schoeffer, Niklas Kuehl, Yvette Machowski

Abstract: Automated decision systems (ADS) are increasingly used for consequential decision-making. These systems often rely on sophisticated yet opaque machine learning models, which do not allow for understanding how a given decision was arrived at. In this work, we conduct a human subject study to assess people's perceptions of informational fairness (i.e., whether people think they are given adequate information on and explanation of the process and its outcomes) and trustworthiness of an underlying ADS when provided with varying types of information about the system. More specifically, we instantiate an ADS in the area of automated loan approval and generate different explanations that are commonly used in the literature. We randomize the amount of information that study participants get to see by providing certain groups of people with the same explanations as others plus additional explanations. From our quantitative analyses, we observe that different amounts of information as well as people's (self-assessed) AI literacy significantly influence the perceived informational fairness, which, in turn, positively relates to perceived trustworthiness of the ADS. A comprehensive analysis of qualitative feedback sheds light on people's desiderata for explanations, among which are (i) consistency (both with people's expectations and across different explanations), (ii) disclosure of monotonic relationships between features and outcome, and (iii) actionability of recommendations.

Submitted 11 May, 2022; originally announced May 2022.

Comments: 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22), June 21--24, 2022, Seoul, Republic of Korea

arXiv:2205.05126 [pdf, other]

doi 10.1145/3514094.3534128

A Meta-Analysis of the Utility of Explainable Artificial Intelligence in Human-AI Decision-Making

Authors: Max Schemmer, Patrick Hemmer, Maximilian Nitsche, Niklas Kühl, Michael Vössing

Abstract: Research in artificial intelligence (AI)-assisted decision-making is experiencing tremendous growth with a constantly rising number of studies evaluating the effect of AI with and without techniques from the field of explainable AI (XAI) on human decision-making performance. However, as tasks and experimental setups vary due to different objectives, some studies report improved user decision-making performance through XAI, while others report only negligible effects. Therefore, in this article, we present an initial synthesis of existing research on XAI studies using a statistical meta-analysis to derive implications across existing research. We observe a statistically positive impact of XAI on users' performance. Additionally, the first results indicate that human-AI decision-making tends to yield better task performance on text data. However, we find no effect of explanations on users' performance compared to sole AI predictions. Our initial synthesis gives rise to future research investigating the underlying causes and contributes to further developing algorithms that effectively benefit human decision-makers by providing meaningful explanations.

Submitted 1 June, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

Comments: AAAI/ACM Conference on AI, Ethics, and Society (AIES'22)

arXiv:2205.02302 [pdf]

Machine Learning Operations (MLOps): Overview, Definition, and Architecture

Authors: Dominik Kreuzberger, Niklas Kühl, Sebastian Hirschl

Abstract: The final goal of all industrial machine learning (ML) projects is to develop ML products and rapidly bring them into production. However, it is highly challenging to automate and operationalize ML products and thus many ML endeavors fail to deliver on their expectations. The paradigm of Machine Learning Operations (MLOps) addresses this issue. MLOps includes several aspects, such as best practices, sets of concepts, and development culture. However, MLOps is still a vague term and its consequences for researchers and professionals are ambiguous. To address this gap, we conduct mixed-method research, including a literature review, a tool review, and expert interviews. As a result of these investigations, we provide an aggregated overview of the necessary principles, components, and roles, as well as the associated architecture and workflows. Furthermore, we furnish a definition of MLOps and highlight open challenges in the field. Finally, this work provides guidance for ML researchers and practitioners who want to automate and operate their ML products with a designated set of technologies.

Submitted 14 May, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

arXiv:2205.01467 [pdf, other]

On the Effect of Information Asymmetry in Human-AI Teams

Authors: Patrick Hemmer, Max Schemmer, Niklas Kühl, Michael Vössing, Gerhard Satzger

Abstract: Over the last years, the rising capabilities of artificial intelligence (AI) have improved human decision-making in many application areas. Teaming between AI and humans may even lead to complementary team performance (CTP), i.e., a level of performance beyond the ones that can be reached by AI or humans individually. Many researchers have proposed using explainable AI (XAI) to enable humans to rely on AI advice appropriately and thereby reach CTP. However, CTP is rarely demonstrated in previous work as often the focus is on the design of explainability, while a fundamental prerequisite -- the presence of complementarity potential between humans and AI -- is often neglected. Therefore, we focus on the existence of this potential for effective human-AI decision-making. Specifically, we identify information asymmetry as an essential source of complementarity potential, as in many real-world situations, humans have access to different contextual information. By conducting an online experiment, we demonstrate that humans can use such contextual information to adjust the AI's decision, finally resulting in CTP.

Submitted 3 May, 2022; originally announced May 2022.

Comments: CHI Conference on Human Factors in Computing Systems (CHI '22), Workshop on Human-Centered Explainable AI (HCXAI)

arXiv:2204.13156 [pdf, other]

On the Relationship Between Explanations, Fairness Perceptions, and Decisions

Authors: Jakob Schoeffer, Maria De-Arteaga, Niklas Kuehl

Abstract: It is known that recommendations of AI-based systems can be incorrect or unfair. Hence, it is often proposed that a human be the final decision-maker. Prior work has argued that explanations are an essential pathway to help human decision-makers enhance decision quality and mitigate bias, i.e., facilitate human-AI complementarity. For these benefits to materialize, explanations should enable humans to appropriately rely on AI recommendations and override the algorithmic recommendation when necessary to increase distributive fairness of decisions. The literature, however, does not provide conclusive empirical evidence as to whether explanations enable such complementarity in practice. In this work, we (a) provide a conceptual framework to articulate the relationships between explanations, fairness perceptions, reliance, and distributive fairness, (b) apply it to understand (seemingly) contradictory research findings at the intersection of explanations and fairness, and (c) derive cohesive implications for the formulation of research questions and the design of experiments.

Submitted 6 May, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

Comments: ACM CHI 2022 Workshop on Human-Centered Explainable AI (HCXAI), May 12--13, 2022, New Orleans, LA, USA
