-
Machine Learning and Statistical Insights into Hospital Stay Durations: The Italian EHR Case
Authors:
Marina Andric,
Mauro Dragoni
Abstract:
Length of hospital stay is a critical metric for assessing healthcare quality and optimizing hospital resource management. This study aims to identify factors influencing LoS within the Italian healthcare context, using a dataset of hospitalization records from over 60 healthcare facilities in the Piedmont region, spanning from 2020 to 2023. We explored a variety of features, including patient cha…
▽ More
Length of hospital stay is a critical metric for assessing healthcare quality and optimizing hospital resource management. This study aims to identify factors influencing LoS within the Italian healthcare context, using a dataset of hospitalization records from over 60 healthcare facilities in the Piedmont region, spanning from 2020 to 2023. We explored a variety of features, including patient characteristics, comorbidities, admission details, and hospital-specific factors. Significant correlations were found between LoS and features such as age group, comorbidity score, admission type, and the month of admission. Machine learning models, specifically CatBoost and Random Forest, were used to predict LoS. The highest R2 score, 0.49, was achieved with CatBoost, demonstrating good predictive performance.
△ Less
Submitted 25 April, 2025;
originally announced April 2025.
-
A Pattern to Align Them All: Integrating Different Modalities to Define Multi-Modal Entities
Authors:
Gianluca Apriceno,
Valentina Tamma,
Tania Bailoni,
Jacopo de Berardinis,
Mauro Dragoni
Abstract:
The ability to reason with and integrate different sensory inputs is the foundation underpinning human intelligence and it is the reason for the growing interest in modelling multi-modal information within Knowledge Graphs. Multi-Modal Knowledge Graphs extend traditional Knowledge Graphs by associating an entity with its possible modal representations, including text, images, audio, and videos, al…
▽ More
The ability to reason with and integrate different sensory inputs is the foundation underpinning human intelligence and it is the reason for the growing interest in modelling multi-modal information within Knowledge Graphs. Multi-Modal Knowledge Graphs extend traditional Knowledge Graphs by associating an entity with its possible modal representations, including text, images, audio, and videos, all of which are used to convey the semantics of the entity. Despite the increasing attention that Multi-Modal Knowledge Graphs have received, there is a lack of consensus about the definitions and modelling of modalities, whose definition is often determined by application domains. In this paper, we propose a novel ontology design pattern that captures the separation of concerns between an entity (and the information it conveys), whose semantics can have different manifestations across different media, and its realisation in terms of a physical information entity. By introducing this abstract model, we aim to facilitate the harmonisation and integration of different existing multi-modal ontologies which is crucial for many intelligent applications across different domains spanning from medicine to digital humanities.
△ Less
Submitted 17 October, 2024;
originally announced October 2024.
-
Large Language Models and Knowledge Graphs: Opportunities and Challenges
Authors:
Jeff Z. Pan,
Simon Razniewski,
Jan-Christoph Kalo,
Sneha Singhania,
Jiaoyan Chen,
Stefan Dietze,
Hajira Jabeen,
Janna Omeliyanenko,
Wen Zhang,
Matteo Lissandrini,
Russa Biswas,
Gerard de Melo,
Angela Bonifati,
Edlira Vakaj,
Mauro Dragoni,
Damien Graux
Abstract:
Large Language Models (LLMs) have taken Knowledge Representation -- and the world -- by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and…
▽ More
Large Language Models (LLMs) have taken Knowledge Representation -- and the world -- by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and Knowledge Graphs (explicit knowledge) and speculate on opportunities and visions that the renewed focus brings, as well as related research topics and challenges.
△ Less
Submitted 11 August, 2023;
originally announced August 2023.
-
Leveraging pre-trained language models for conversational information seeking from text
Authors:
Patrizio Bellan,
Mauro Dragoni,
Chiara Ghidini
Abstract:
Recent advances in Natural Language Processing, and in particular on the construction of very large pre-trained language representation models, is opening up new perspectives on the construction of conversational information seeking (CIS) systems. In this paper we investigate the usage of in-context learning and pre-trained language representation models to address the problem of information extra…
▽ More
Recent advances in Natural Language Processing, and in particular on the construction of very large pre-trained language representation models, is opening up new perspectives on the construction of conversational information seeking (CIS) systems. In this paper we investigate the usage of in-context learning and pre-trained language representation models to address the problem of information extraction from process description documents, in an incremental question and answering oriented fashion. In particular we investigate the usage of the native GPT-3 (Generative Pre-trained Transformer 3) model, together with two in-context learning customizations that inject conceptual definitions and a limited number of samples in a few shot-learning fashion. The results highlight the potential of the approach and the usefulness of the in-context learning customizations, which can substantially contribute to address the "training data challenge" of deep learning based NLP techniques the BPM field. It also highlight the challenge posed by control flow relations for which further training needs to be devised.
△ Less
Submitted 31 March, 2022;
originally announced April 2022.
-
PET: An Annotated Dataset for Process Extraction from Natural Language Text
Authors:
Patrizio Bellan,
Han van der Aa,
Mauro Dragoni,
Chiara Ghidini,
Simone Paolo Ponzetto
Abstract:
Process extraction from text is an important task of process discovery, for which various approaches have been developed in recent years. However, in contrast to other information extraction tasks, there is a lack of gold-standard corpora of business process descriptions that are carefully annotated with all the entities and relationships of interest. Due to this, it is currently hard to compare t…
▽ More
Process extraction from text is an important task of process discovery, for which various approaches have been developed in recent years. However, in contrast to other information extraction tasks, there is a lack of gold-standard corpora of business process descriptions that are carefully annotated with all the entities and relationships of interest. Due to this, it is currently hard to compare the results obtained by extraction approaches in an objective manner, whereas the lack of annotated texts also prevents the application of data-driven information extraction methodologies, typical of the natural language processing field. Therefore, to bridge this gap, we present the PET dataset, a first corpus of business process descriptions annotated with activities, gateways, actors, and flow information. We present our new resource, including a variety of baselines to benchmark the difficulty and challenges of business process extraction from text. PET can be accessed via huggingface.co/datasets/patriziobellan/PET
△ Less
Submitted 13 June, 2022; v1 submitted 9 March, 2022;
originally announced March 2022.
-
Machine Learning for Utility Prediction in Argument-Based Computational Persuasion
Authors:
Ivan Donadello,
Anthony Hunter,
Stefano Teso,
Mauro Dragoni
Abstract:
Automated persuasion systems (APS) aim to persuade a user to believe something by entering into a dialogue in which arguments and counterarguments are exchanged. To maximize the probability that an APS is successful in persuading a user, it can identify a global policy that will allow it to select the best arguments it presents at each stage of the dialogue whatever arguments the user presents. Ho…
▽ More
Automated persuasion systems (APS) aim to persuade a user to believe something by entering into a dialogue in which arguments and counterarguments are exchanged. To maximize the probability that an APS is successful in persuading a user, it can identify a global policy that will allow it to select the best arguments it presents at each stage of the dialogue whatever arguments the user presents. However, in real applications, such as for healthcare, it is unlikely the utility of the outcome of the dialogue will be the same, or the exact opposite, for the APS and user. In order to deal with this situation, games in extended form have been harnessed for argumentation in Bi-party Decision Theory. This opens new problems that we address in this paper: (1) How can we use Machine Learning (ML) methods to predict utility functions for different subpopulations of users? and (2) How can we identify for a new user the best utility function from amongst those that we have learned? To this extent, we develop two ML methods, EAI and EDS, that leverage information coming from the users to predict their utilities. EAI is restricted to a fixed amount of information, whereas EDS can choose the information that best detects the subpopulations of a user. We evaluate EAI and EDS in a simulation setting and in a realistic case study concerning healthy eating habits. Results are promising in both cases, but EDS is more effective at predicting useful utility functions.
△ Less
Submitted 15 December, 2021; v1 submitted 9 December, 2021;
originally announced December 2021.
-
A Practical guide on Explainable AI Techniques applied on Biomedical use case applications
Authors:
Adrien Bennetot,
Ivan Donadello,
Ayoub El Qadi,
Mauro Dragoni,
Thomas Frossard,
Benedikt Wagner,
Anna Saranti,
Silvia Tulli,
Maria Trocan,
Raja Chatila,
Andreas Holzinger,
Artur d'Avila Garcez,
Natalia Díaz-Rodríguez
Abstract:
Last years have been characterized by an upsurge of opaque automatic decision support systems, such as Deep Neural Networks (DNNs). Although they have great generalization and prediction skills, their functioning does not allow obtaining detailed explanations of their behaviour. As opaque machine learning models are increasingly being employed to make important predictions in critical environments…
▽ More
Last years have been characterized by an upsurge of opaque automatic decision support systems, such as Deep Neural Networks (DNNs). Although they have great generalization and prediction skills, their functioning does not allow obtaining detailed explanations of their behaviour. As opaque machine learning models are increasingly being employed to make important predictions in critical environments, the danger is to create and use decisions that are not justifiable or legitimate. Therefore, there is a general agreement on the importance of endowing machine learning models with explainability. EXplainable Artificial Intelligence (XAI) techniques can serve to verify and certify model outputs and enhance them with desirable notions such as trustworthiness, accountability, transparency and fairness. This guide is meant to be the go-to handbook for any audience with a computer science background aiming at getting intuitive insights on machine learning models, accompanied with straight, fast, and intuitive explanations out of the box. This article aims to fill the lack of compelling XAI guide by applying XAI techniques in their particular day-to-day models, datasets and use-cases. Figure 1 acts as a flowchart/map for the reader and should help him to find the ideal method to use according to his type of data. In each chapter, the reader will find a description of the proposed method as well as an example of use on a Biomedical application and a Python notebook. It can be easily modified in order to be applied to specific applications.
△ Less
Submitted 5 September, 2022; v1 submitted 13 November, 2021;
originally announced November 2021.
-
Process Extraction from Text: Benchmarking the State of the Art and Paving the Way for Future Challenges
Authors:
Patrizio Bellan,
Mauro Dragoni,
Chiara Ghidini,
Han van der Aa,
Simone Paolo Ponzetto
Abstract:
The extraction of process models from text refers to the problem of turning the information contained in an unstructured textual process descriptions into a formal representation,i.e.,a process model. Several automated approaches have been proposed to tackle this problem, but they are highly heterogeneous in scope and underlying assumptions,i.e., differences in input, target output, and data used…
▽ More
The extraction of process models from text refers to the problem of turning the information contained in an unstructured textual process descriptions into a formal representation,i.e.,a process model. Several automated approaches have been proposed to tackle this problem, but they are highly heterogeneous in scope and underlying assumptions,i.e., differences in input, target output, and data used in their evaluation.As a result, it is currently unclear how well existing solutions are able to solve the model-extraction problem and how they compare to each other.We overcome this issue by comparing 10 state-of-the-art approaches for model extraction in a systematic manner, covering both qualitative and quantitative aspects.The qualitative evaluation compares the analysis of the primary studies on: 1 the main characteristics of each solution;2 the type of process model elements extracted from the input data;3 the experimental evaluation performed to evaluate the proposed framework.The results show a heterogeneity of techniques, elements extracted and evaluations conducted, that are often impossible to compare.To overcome this difficulty we propose a quantitative comparison of the tools proposed by the papers on the unifying task of process model entity and relation extraction so as to be able to compare them directly.The results show three distinct groups of tools in terms of performance, with no tool obtaining very good scores and also serious limitations.Moreover, the proposed evaluation pipeline can be considered a reference task on a well-defined dataset and metrics that can be used to compare new tools. The paper also presents a reflection on the results of the qualitative and quantitative evaluation on the limitations and challenges that the community needs to address in the future to produce significant advances in this area.
△ Less
Submitted 25 October, 2023; v1 submitted 7 October, 2021;
originally announced October 2021.