Ranking the Top-K Realizations of Stochastically Known Event Logs
Authors:
Arvid Lepsien,
Marco Pegoraro,
Frederik Fonger,
Dominic Langhammer,
Milda Aleknonytė-Resch,
Agnes Koschmider
Abstract:
Various kinds of uncertainty can occur in event logs, e.g., due to flawed recording, data quality issues, or the use of probabilistic models for activity recognition. Stochastically known event logs make these uncertainties transparent by encoding multiple possible realizations for events. However, the number of realizations encoded by a stochastically known log grows exponentially with its size, making exhaustive exploration infeasible even for moderately sized event logs. Thus, considering only the top-K most probable realizations has been proposed in the literature. In this paper, we implement an efficient algorithm to calculate a top-K realization ranking of an event log under event independence within O(Kn), where n is the number of uncertain events in the log. This algorithm is used to investigate the benefit of top-K rankings over top-1 interpretations of stochastically known event logs. Specifically, we analyze the usefulness of top-K rankings against different properties of the input data. We show that the benefit of a top-K ranking depends on the length of the input event log and the distribution of the event probabilities. The results highlight the potential of top-K rankings to enhance uncertainty-aware process mining techniques.
Submitted 30 September, 2024;
originally announced October 2024.
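The top-K ranking described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' O(Kn) algorithm, which the abstract does not spell out. It is a simple prefix-pruning approach that assumes event independence, so the probability of a realization is the product of the per-event probabilities. The function name `top_k_realizations`, the dict-based input format, and the example activity labels are all hypothetical.

```python
import heapq


def top_k_realizations(events, k):
    """Return the k most probable realizations as (probability, labels) pairs.

    `events` is a list with one dict per uncertain event, mapping each
    candidate activity label to its probability (hypothetical input format).
    Events are assumed to be independent.
    """
    # Start with the empty realization, which has probability 1.
    partial = [(1.0, ())]
    for alternatives in events:
        # Extend every kept prefix with every alternative of the next event.
        candidates = (
            (p * q, labels + (label,))
            for p, labels in partial
            for label, q in alternatives.items()
        )
        # Keeping only the k best prefixes is safe: multiplying by a
        # non-negative factor never reorders prefixes, so a pruned prefix
        # cannot lead to a top-k realization (up to ties).
        partial = heapq.nlargest(k, candidates, key=lambda c: c[0])
    return partial


# Hypothetical log with three uncertain events.
events = [
    {"a": 0.7, "b": 0.3},
    {"c": 0.6, "d": 0.4},
    {"e": 0.9, "f": 0.1},
]
for probability, labels in top_k_realizations(events, k=3):
    print(labels, round(probability, 3))
```

With K = 1 the sketch reduces to the top-1 interpretation, in which each event takes its most probable label. Each step scores every pairing of a kept prefix with an alternative of the next event, so this version does more work per event than the O(Kn) bound stated in the abstract.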
Process Mining for Unstructured Data: Challenges and Research Directions
Authors:
Agnes Koschmider,
Milda Aleknonytė-Resch,
Frederik Fonger,
Christian Imenkamp,
Arvid Lepsien,
Kaan Apaydin,
Maximilian Harms,
Dominik Janssen,
Dominic Langhammer,
Tobias Ziolkowski,
Yorck Zisgen
Abstract:
Applying process mining to unstructured data could yield significant novel insights in disciplines where unstructured data is a common data format. Efficiently analyzing unstructured data with process mining, and conveying confidence in the analysis results, requires overcoming multiple challenges. The purpose of this paper is to discuss these challenges, present initial solutions, and describe future research directions. We hope that this article lays the foundations for future collaboration on this topic.
Submitted 30 November, 2023;
originally announced January 2024.