-
A Systematic Review of User-Centred Evaluation of Explainable AI in Healthcare
Authors:
Ivania Donoso-Guzmán,
Kristýna Sirka Kacafírková,
Maxwell Szymanski,
An Jacobs,
Denis Parra,
Katrien Verbert
Abstract:
Despite promising developments in Explainable Artificial Intelligence, the practical value of XAI methods remains under-explored and insufficiently validated in real-world settings. Robust and context-aware evaluation is essential, not only to produce understandable explanations but also to ensure their trustworthiness and usability for intended users, but it tends to be overlooked because there are no clear guidelines on how to design an evaluation with users.
This study addresses this gap with two main goals: (1) to develop a framework of well-defined, atomic properties that characterise the user experience of XAI in healthcare; and (2) to provide clear, context-sensitive guidelines for defining evaluation strategies based on system characteristics.
We conducted a systematic review of 82 user studies, sourced from five databases, all situated within healthcare settings and focused on evaluating AI-generated explanations. The analysis was guided by a predefined coding scheme informed by an existing evaluation framework, complemented by inductive codes developed iteratively.
The review yields three key contributions: (1) a synthesis of current evaluation practices, highlighting a growing focus on human-centred approaches in healthcare XAI; (2) insights into the interrelations among explanation properties; and (3) an updated framework and a set of actionable guidelines to support interdisciplinary teams in designing and implementing effective evaluation strategies for XAI systems tailored to specific application contexts.
Submitted 16 June, 2025;
originally announced June 2025.
-
CXR-LT 2024: A MICCAI challenge on long-tailed, multi-label, and zero-shot disease classification from chest X-ray
Authors:
Mingquan Lin,
Gregory Holste,
Song Wang,
Yiliang Zhou,
Yishu Wei,
Imon Banerjee,
Pengyi Chen,
Tianjie Dai,
Yuexi Du,
Nicha C. Dvornek,
Yuyan Ge,
Zuowei Guo,
Shouhei Hanaoka,
Dongkyun Kim,
Pablo Messina,
Yang Lu,
Denis Parra,
Donghyun Son,
Álvaro Soto,
Aisha Urooj,
René Vidal,
Yosuke Yamagishi,
Zefan Yang,
Ruichi Zhang,
Yang Zhou
, et al. (8 additional authors not shown)
Abstract:
The CXR-LT series is a community-driven initiative designed to enhance lung disease classification using chest X-rays (CXR). It tackles challenges in open long-tailed lung disease classification and enhances the measurability of state-of-the-art techniques. The first event, CXR-LT 2023, aimed to achieve these goals by providing high-quality benchmark CXR data for model development and conducting comprehensive evaluations to identify ongoing issues impacting lung disease classification performance. Building on the success of CXR-LT 2023, CXR-LT 2024 expands the dataset to 377,110 chest X-rays (CXRs) and 45 disease labels, including 19 new rare disease findings. It also introduces a new focus on zero-shot learning to address limitations identified in the previous event. Specifically, CXR-LT 2024 features three tasks: (i) long-tailed classification on a large, noisy test set, (ii) long-tailed classification on a manually annotated "gold standard" subset, and (iii) zero-shot generalization to five previously unseen disease findings. This paper provides an overview of CXR-LT 2024, detailing the data curation process and consolidating state-of-the-art solutions, including the use of multimodal models for rare disease detection, advanced generative approaches to handle noisy labels, and zero-shot learning strategies for unseen diseases. Additionally, the expanded dataset enhances disease coverage to better represent real-world clinical settings, offering a valuable resource for future research. By synthesizing the insights and innovations of participating teams, we aim to advance the development of clinically realistic and generalizable diagnostic models for chest radiography.
Submitted 9 June, 2025;
originally announced June 2025.
-
The Role of Organizations in Networked Mobilization: Examining the 2011 Chilean Student Movement Through The Logic of Connective Action
Authors:
Diego Gomez-Zara,
Carolina Perez-Arredondo,
Denis Parra
Abstract:
This study examines the communication mechanisms that shape the formation of digitally-enabled mobilization networks. Informed by the logic of connective action, we postulate that the emergence of networks enabled by organizations and individuals is differentiated by network and framing mechanisms. From a case comparison within two mobilization networks -- one crowd-enabled and one organizationally-enabled -- of the 2011 Chilean student movement, we analyze their network structures and users' communication roles. We found that organizationally-enabled networks are likely to form from hierarchical cascades and crowd-enabled networks are likely to form from triadic closure mechanisms. Moreover, we found that organizations are essential for both kinds of networks: compared to individuals, organizations spread more messages among unconnected users, and organizations' messages are more likely to be spread. We discuss our findings in light of the network mechanisms and participation of organizations and influential users.
Submitted 12 March, 2025;
originally announced March 2025.
-
A Compressive-Expressive Communication Framework for Compositional Representations
Authors:
Rafael Elberg,
Felipe del Rio,
Mircea Petrache,
Denis Parra
Abstract:
Compositional generalization--the ability to interpret novel combinations of familiar elements--is a hallmark of human cognition and language. Despite recent advances, deep neural networks still struggle to acquire this property reliably. In this work, we introduce CELEBI (Compressive-Expressive Language Emergence through a discrete Bottleneck and Iterated learning), a novel self-supervised framework for inducing compositionality in learned representations from pre-trained models, through a reconstruction-based communication game between a sender and a receiver. Building on theories of language emergence, we integrate three mechanisms that jointly promote compressibility, expressivity, and efficiency in the emergent language. First, interactive decoding incentivizes intermediate reasoning by requiring the receiver to produce partial reconstructions after each symbol. Second, a reconstruction-based imitation phase, inspired by iterated learning, trains successive generations of agents to imitate reconstructions rather than messages, enforcing a tighter communication bottleneck. Third, pairwise distance maximization regularizes message diversity by encouraging high distances between messages, with formal links to entropy maximization. Our method significantly improves both the efficiency and compositionality of the learned messages on the Shapes3D and MPI3D datasets, surpassing prior discrete communication frameworks in both reconstruction accuracy and topographic similarity. This work provides new theoretical and empirical evidence for the emergence of structured, generalizable communication protocols from simplicity-based inductive biases.
Submitted 5 June, 2025; v1 submitted 31 January, 2025;
originally announced January 2025.
-
Extracting and Encoding: Leveraging Large Language Models and Medical Knowledge to Enhance Radiological Text Representation
Authors:
Pablo Messina,
René Vidal,
Denis Parra,
Álvaro Soto,
Vladimir Araujo
Abstract:
Advancing representation learning in specialized fields like medicine remains challenging due to the scarcity of expert annotations for text and images. To tackle this issue, we present a novel two-stage framework designed to extract high-quality factual statements from free-text radiology reports in order to improve the representations of text encoders and, consequently, their performance on various downstream tasks. In the first stage, we propose a Fact Extractor that leverages large language models (LLMs) to identify factual statements from well-curated domain-specific datasets. In the second stage, we introduce a Fact Encoder (CXRFE) based on a BERT model fine-tuned with objective functions designed to improve its representations using the extracted factual data. Our framework also includes a new embedding-based metric (CXRFEScore) for evaluating chest X-ray text generation systems, leveraging both stages of our approach. Extensive evaluations show that our fact extractor and encoder outperform current state-of-the-art methods in tasks such as sentence ranking, natural language inference, and label extraction from radiology reports. Additionally, our metric proves to be more robust and effective than existing metrics commonly used in the radiology report generation literature. The code of this project is available at https://github.com/PabloMessina/CXR-Fact-Encoder.
Submitted 2 July, 2024;
originally announced July 2024.
-
Long Tail Image Generation Through Feature Space Augmentation and Iterated Learning
Authors:
Rafael Elberg,
Denis Parra,
Mircea Petrache
Abstract:
Image and multimodal machine learning tasks are very challenging to solve in the case of poorly distributed data. In particular, data availability and privacy restrictions exacerbate these hurdles in the medical domain. The state of the art in image generation quality is held by Latent Diffusion models, making them prime candidates for tackling this problem. However, a few key issues still need to be solved, such as the difficulty in generating data from under-represented classes and a slow inference process. To mitigate these issues, we propose a new method for image augmentation in long-tailed data based on leveraging the rich latent space of pre-trained Stable Diffusion Models. We create a modified separable latent space to mix head and tail class examples. We build this space via Iterated Learning of underlying sparsified embeddings, which we apply to task-specific saliency maps via a K-NN approach. Code is available at https://github.com/SugarFreeManatee/Feature-Space-Augmentation-and-Iterated-Learning
Submitted 2 May, 2024;
originally announced May 2024.
-
Causative Insights into Open Source Software Security using Large Language Code Embeddings and Semantic Vulnerability Graph
Authors:
Nafis Tanveer Islam,
Gonzalo De La Torre Parra,
Dylan Manuel,
Murtuza Jadliwala,
Peyman Najafirad
Abstract:
Open Source Software (OSS) security and resilience are worldwide phenomena hampering economic and technological innovation. OSS vulnerabilities can cause unauthorized access, data breaches, network disruptions, and privacy violations, rendering any benefits worthless. While recent deep-learning techniques have shown great promise in identifying and localizing vulnerabilities in source code, it is unclear how effective these research techniques are from a usability perspective due to a lack of proper methodological analysis. Usually, these methods offload a developer's task of classifying and localizing vulnerable code; still, a reasonable study to measure the actual effectiveness of these systems to the end user has yet to be conducted. To address the challenge of proper developer training from the prior methods, we propose a system to link vulnerabilities to their root cause, thereby intuitively educating the developers to code more securely. Furthermore, we provide a comprehensive usability study to test the effectiveness of our system in fixing vulnerabilities and its capability to assist developers in writing more secure code. We demonstrate the effectiveness of our system by showing its efficacy in helping developers fix source code with vulnerabilities. Our study shows a 24% improvement in code repair capabilities compared to previous methods. We also show that, when trained by our system, on average, approximately 9% of the developers naturally tend to write more secure code with fewer vulnerabilities.
Submitted 13 January, 2024;
originally announced January 2024.
-
LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward
Authors:
Nafis Tanveer Islam,
Joseph Khoury,
Andrew Seong,
Mohammad Bahrami Karkevandi,
Gonzalo De La Torre Parra,
Elias Bou-Harb,
Peyman Najafirad
Abstract:
In software development, the predominant emphasis on functionality often supersedes security concerns, a trend gaining momentum with AI-driven automation tools like GitHub Copilot. These tools significantly improve developers' efficiency in functional code development. Nevertheless, it remains a notable concern that such tools are also responsible for creating insecure code, predominantly because of pre-training on publicly available repositories with vulnerable code. Moreover, developers are called the "weakest link in the chain" since they have very minimal knowledge of code security. Although existing solutions provide a reasonable solution to vulnerable code, they must adequately describe and educate the developers on code security to ensure that the security issues are not repeated. Therefore, we introduce SecRepair, a multipurpose code vulnerability analysis system powered by a large language model, CodeGen2, that assists the developer in identifying vulnerable code and generating fixed code along with a complete description of the vulnerability in a code comment. Our innovative methodology uses a reinforcement learning paradigm to generate code comments augmented by a semantic reward mechanism. Inspired by how humans fix code issues, we propose an instruction-based dataset suitable for vulnerability analysis with LLMs. We further identify zero-day and N-day vulnerabilities in 6 Open Source IoT Operating Systems on GitHub. Our findings underscore that incorporating reinforcement learning coupled with semantic reward augments our model's performance, thereby fortifying its capacity to address code vulnerabilities with improved efficacy.
Submitted 21 February, 2024; v1 submitted 6 January, 2024;
originally announced January 2024.
-
Towards a Comprehensive Human-Centred Evaluation Framework for Explainable AI
Authors:
Ivania Donoso-Guzmán,
Jeroen Ooge,
Denis Parra,
Katrien Verbert
Abstract:
While research on explainable AI (XAI) is booming and explanation techniques have proven promising in many application domains, standardised human-centred evaluation procedures are still missing. In addition, current evaluation procedures do not assess XAI methods holistically in the sense that they do not treat explanations' effects on humans as a complex user experience. To tackle this challenge, we propose to adapt the User-Centric Evaluation Framework used in recommender systems: we integrate explanation aspects, summarise explanation properties, indicate relations between them, and categorise metrics that measure these properties. With this comprehensive evaluation framework, we hope to contribute to the human-centred standardisation of XAI evaluation.
Submitted 31 July, 2023;
originally announced August 2023.
-
Learning Difference Equations with Structured Grammatical Evolution for Postprandial Glycaemia Prediction
Authors:
Daniel Parra,
David Joedicke,
J. Manuel Velasco,
Gabriel Kronberger,
J. Ignacio Hidalgo
Abstract:
People with diabetes must carefully monitor their blood glucose levels, especially after eating. Blood glucose regulation requires a proper combination of food intake and insulin boluses. Glucose prediction is vital to avoid dangerous post-meal complications in treating individuals with diabetes. Although traditional methods, such as artificial neural networks, have shown high accuracy rates, sometimes they are not suitable for developing personalised treatments by physicians due to their lack of interpretability. In this study, we propose a novel glucose prediction method emphasising interpretability: Interpretable Sparse Identification by Grammatical Evolution. Combined with a previous clustering stage, our approach provides finite difference equations to predict postprandial glucose levels up to two hours after meals. We divide the dataset into four-hour segments and perform clustering based on blood glucose values for the two-hour window before the meal. Prediction models are trained for each cluster for the two-hour windows after meals, allowing predictions in 15-minute steps, yielding up to eight predictions at different time horizons. Prediction safety was evaluated based on Parkes Error Grid regions. Our technique produces safe predictions through explainable expressions, avoiding zones D (0.2% average) and E (0%) and reducing predictions in zone C (6.2%). In addition, our proposal has slightly better accuracy than other techniques, including sparse identification of non-linear dynamics and artificial neural networks. The results demonstrate that our proposal provides interpretable solutions without sacrificing prediction accuracy, offering a promising approach to glucose prediction in diabetes management that balances accuracy, interpretability, and computational efficiency.
Submitted 3 July, 2023;
originally announced July 2023.
-
An Unbiased Transformer Source Code Learning with Semantic Vulnerability Graph
Authors:
Nafis Tanveer Islam,
Gonzalo De La Torre Parra,
Dylan Manuel,
Elias Bou-Harb,
Peyman Najafirad
Abstract:
Over the years, open-source software systems have become prey to threat actors. Even as open-source communities act quickly to patch the breach, code vulnerability screening should be an integral part of agile software development from the beginning. Unfortunately, current vulnerability screening techniques are ineffective at identifying novel vulnerabilities or providing developers with code vulnerability and classification. Furthermore, the datasets used for vulnerability learning often exhibit distribution shifts from the real-world testing distribution due to novel attack strategies deployed by adversaries and, as a result, the machine learning model's performance may be hindered or biased. To address these issues, we propose a joint interpolated multitasked unbiased vulnerability classifier comprising a transformer "RoBERTa" and graph convolution neural network (GCN). We present a training process utilizing a semantic vulnerability graph (SVG) representation from source code, created by integrating edges from a sequential flow, control flow, and data flow, as well as a novel flow dubbed Poacher Flow (PF). Poacher flow edges reduce the gap between dynamic and static program analysis and handle complex long-range dependencies. Moreover, our approach reduces biases of classifiers regarding unbalanced datasets by integrating the Focal Loss objective function along with SVG. Remarkably, experimental results show that our classifier outperforms state-of-the-art results on vulnerability detection with fewer false negatives and false positives. After testing our model across multiple datasets, it shows an improvement of at least 2.41% and 18.75% in the best-case scenario. Evaluations using N-day program samples demonstrate that our proposed approach achieves 93% accuracy and was able to detect 4 zero-day vulnerabilities from popular GitHub repositories.
Submitted 17 April, 2023;
originally announced April 2023.
-
Identifying Differential Equations to predict Blood Glucose using Sparse Identification of Nonlinear Systems
Authors:
David Jödicke,
Daniel Parra,
Gabriel Kronberger,
Stephan Winkler
Abstract:
Describing dynamic medical systems using machine learning is a challenging topic with a wide range of applications. In this work, the possibility of modeling the blood glucose level of diabetic patients purely on the basis of measured data is described. A combination of the influencing variables insulin and calories is used to find an interpretable model. The absorption speed of external substances in the human body depends strongly on external influences, which is why time-shifts are added for the influencing variables. The focus is put on identifying the best time-shifts that provide robust models with good prediction accuracy that are independent of other unknown external influences. The modeling is based purely on the measured data using Sparse Identification of Nonlinear Dynamics. A differential equation is determined which, starting from an initial value, simulates blood glucose dynamics. By applying the best model to test data, we can show that it is possible to simulate the long-term blood glucose dynamics using differential equations and few influencing variables.
Submitted 28 September, 2022;
originally announced September 2022.
-
Stress Test Evaluation of Biomedical Word Embeddings
Authors:
Vladimir Araujo,
Andrés Carvallo,
Carlos Aspillaga,
Camilo Thorne,
Denis Parra
Abstract:
The success of pretrained word embeddings has motivated their use in the biomedical domain, with contextualized embeddings yielding remarkable results in several biomedical NLP tasks. However, there is a lack of research on quantifying their behavior under severe "stress" scenarios. In this work, we systematically evaluate three language models with adversarial examples -- automatically constructed tests that allow us to examine how robust the models are. We propose two types of stress scenarios focused on the biomedical named entity recognition (NER) task, one inspired by spelling errors and another based on the use of synonyms for medical terms. Our experiments with three benchmarks show that the performance of the original models decreases considerably, in addition to revealing their weaknesses and strengths. Finally, we show that adversarial training causes the models to improve their robustness and even to exceed the original performance in some cases.
Submitted 24 July, 2021;
originally announced July 2021.
-
Graphing else matters: exploiting aspect opinions and ratings in explainable graph-based recommendations
Authors:
Iván Cantador,
Andrés Carvallo,
Fernando Diez,
Denis Parra
Abstract:
The success of neural network embeddings has entailed a renewed interest in using knowledge graphs for a wide variety of machine learning and information retrieval tasks. In particular, current recommendation methods based on graph embeddings have shown state-of-the-art performance. These methods commonly encode latent rating patterns and content features. Different from previous work, in this paper, we propose to exploit embeddings extracted from graphs that combine information from ratings and aspect-based opinions expressed in textual reviews. We then adapt and evaluate state-of-the-art graph embedding techniques over graphs generated from Amazon and Yelp reviews on six domains, outperforming baseline recommenders. Our approach has the advantage of providing explanations which leverage aspect-based opinions given by users about recommended items. Furthermore, we also provide examples of the applicability of recommendations utilizing aspect opinions as explanations in a visualization dashboard, which allows obtaining information about the most and least liked aspects of similar users obtained from the embeddings of an input graph.
Submitted 28 July, 2022; v1 submitted 7 July, 2021;
originally announced July 2021.
-
AHMoSe: A Knowledge-Based Visual Support System for Selecting Regression Machine Learning Models
Authors:
Diego Rojo,
Nyi Nyi Htun,
Denis Parra,
Robin De Croon,
Katrien Verbert
Abstract:
Decision support systems have become increasingly popular in the domain of agriculture. With the development of automated machine learning, agricultural experts are now able to train, evaluate and make predictions using cutting edge machine learning (ML) models without the need for much ML knowledge. Although this automated approach has led to successful results in many scenarios, in certain cases (e.g., when few labeled datasets are available) choosing among different models with similar performance metrics is a difficult task. Furthermore, these systems do not commonly allow users to incorporate their domain knowledge that could facilitate the task of model selection, and to gain insight into the prediction system for eventual decision making. To address these issues, in this paper we present AHMoSe, a visual support system that allows domain experts to better understand, diagnose and compare different regression models, primarily by enriching model-agnostic explanations with domain knowledge. To validate AHMoSe, we describe a use case scenario in the viticulture domain, grape quality prediction, where the system enables users to diagnose and select prediction models that perform better. We also discuss feedback concerning the design of the tool from both ML and viticulture experts.
Submitted 30 November, 2021; v1 submitted 28 January, 2021;
originally announced January 2021.
-
Neural language models for text classification in evidence-based medicine
Authors:
Andres Carvallo,
Denis Parra,
Gabriel Rada,
Daniel Perez,
Juan Ignacio Vasquez,
Camilo Vergara
Abstract:
The COVID-19 pandemic has brought about a significant challenge to the whole of humanity, but with a special burden upon the medical community. Clinicians must keep updated continuously about symptoms, diagnoses, and effectiveness of emergent treatments under a never-ending flood of scientific literature. In this context, the role of evidence-based medicine (EBM) for curating the most substantial evidence to support public health and clinical practice becomes essential but is being challenged as never before due to the high volume of research articles published and pre-prints posted daily. Artificial Intelligence can have a crucial role in this situation. In this article, we report the results of an applied research project to classify scientific articles to support Epistemonikos, one of the most active foundations worldwide conducting EBM. We test several methods, and the best one, based on the XLNet neural language model, improves the current approach by 93% on average F1-score, saving valuable time from physicians who volunteer to curate COVID-19 research articles manually.
Submitted 1 December, 2020;
originally announced December 2020.
-
Inspecting state of the art performance and NLP metrics in image-based medical report generation
Authors:
Pablo Pino,
Denis Parra,
Pablo Messina,
Cecilia Besa,
Sergio Uribe
Abstract:
Several deep learning architectures have been proposed over the last years to deal with the problem of generating a written report given an imaging exam as input. Most works evaluate the generated reports using standard Natural Language Processing (NLP) metrics (e.g. BLEU, ROUGE), reporting significant progress. In this article, we contrast this progress by comparing state of the art (SOTA) models against weak baselines. We show that simple and even naive approaches yield near SOTA performance on most traditional NLP metrics. We conclude that evaluation methods in this task should be further studied towards correctly measuring clinical accuracy, ideally involving physicians to contribute to this end.
Submitted 15 January, 2022; v1 submitted 18 November, 2020;
originally announced November 2020.
-
A Survey on Deep Learning and Explainability for Automatic Report Generation from Medical Images
Authors:
Pablo Messina,
Pablo Pino,
Denis Parra,
Alvaro Soto,
Cecilia Besa,
Sergio Uribe,
Marcelo Andía,
Cristian Tejos,
Claudia Prieto,
Daniel Capurro
Abstract:
Every year physicians face an increasing demand for image-based diagnosis from patients, a problem that can be addressed with recent artificial intelligence methods. In this context, we survey works in the area of automatic report generation from medical images, with emphasis on methods using deep neural networks, with respect to: (1) Datasets, (2) Architecture Design, (3) Explainability and (4) Evaluation Metrics. Our survey identifies interesting developments, but also remaining challenges. Among them, the current evaluation of generated reports is especially weak, since it mostly relies on traditional Natural Language Processing (NLP) metrics, which do not accurately capture medical correctness.
Submitted 8 January, 2022; v1 submitted 20 October, 2020;
originally announced October 2020.
-
Scalable Recommendation of Wikipedia Articles to Editors Using Representation Learning
Authors:
Oleksii Moskalenko,
Denis Parra,
Diego Saez-Trumper
Abstract:
Wikipedia is edited by volunteer editors around the world. Considering the large amount of existing content (e.g. over 5M articles in English Wikipedia), deciding what to edit next can be difficult, both for experienced users, who usually have a huge backlog of articles to prioritize, and for newcomers, who might need guidance in selecting the next article to contribute to. Therefore, helping editors to find relevant articles should improve their performance and help in the retention of new editors. In this paper, we address the problem of recommending relevant articles to editors. To do this, we develop a scalable system on top of Graph Convolutional Networks and Doc2Vec, learning how to represent Wikipedia articles and deliver personalized recommendations for editors. We test our model on editors' histories, predicting their most recent edits based on their prior edits. We outperform competitive implicit-feedback collaborative-filtering methods such as WMRF based on ALS, as well as a traditional IR method such as content-based filtering based on BM25. All of the data used in this paper is publicly available, including graph embeddings for Wikipedia articles, and we release our code to support replication of our experiments. Moreover, we contribute a scalable implementation of a state-of-the-art graph embedding algorithm, as current ones cannot efficiently handle the sheer size of the Wikipedia graph.
Submitted 24 September, 2020;
originally announced September 2020.
-
CuratorNet: Visually-aware Recommendation of Art Images
Authors:
Pablo Messina,
Manuel Cartagena,
Patricio Cerda-Mardini,
Felipe del Rio,
Denis Parra
Abstract:
Although there are several visually-aware recommendation models in domains like fashion or even movies, the art domain lacks the same level of research attention, despite the recent growth of the online artwork market. To reduce this gap, in this article we introduce CuratorNet, a neural network architecture for visually-aware recommendation of art images. CuratorNet is designed at the core with the goal of maximizing generalization: the network has a fixed set of parameters that only need to be trained once, and thereafter the model is able to generalize to new users or items never seen before, without further training. This is achieved by leveraging visual content: items are mapped to item vectors through visual embeddings, and users are mapped to user vectors by aggregating the visual content of items they have consumed. Besides the model architecture, we also introduce novel triplet sampling strategies to build a training set for rank learning in the art domain, resulting in more effective learning than naive random sampling. With an evaluation over a real-world dataset of physical paintings, we show that CuratorNet achieves the best performance among several baselines, including the state-of-the-art model VBPR. CuratorNet is motivated and evaluated in the art domain, but its architecture and training scheme could be adapted to recommend images in other areas.
Submitted 30 September, 2020; v1 submitted 9 September, 2020;
originally announced September 2020.
-
Interpretable Contextual Team-aware Item Recommendation: Application in Multiplayer Online Battle Arena Games
Authors:
Andrés Villa,
Vladimir Araujo,
Francisca Cattan,
Denis Parra
Abstract:
The video game industry has adopted recommendation systems to boost users' interest with a focus on game sales. Other exciting applications within video games are those that help the player make decisions that would maximize their playing experience, which is a desirable feature in real-time strategy video games such as Multiplayer Online Battle Arena (MOBA) games like DotA and LoL. Among these tasks, the recommendation of items is challenging, given both the contextual nature of the game and how it exposes the dependence on the formation of each team. Existing works on this topic do not take advantage of all the available contextual match data and dismiss potentially valuable information. To address this problem we develop TTIR, a contextual recommender model derived from the Transformer neural architecture that suggests a set of items to every team member, based on the contexts of teams and roles that describe the match. TTIR outperforms several approaches and provides interpretable recommendations through visualization of attention weights. Our evaluation indicates that both the Transformer architecture and the contextual information are essential to get the best results for this item recommendation task. Furthermore, a preliminary user survey indicates the usefulness of attention weights for explaining recommendations as well as ideas for future work. The code and dataset are available at: https://github.com/ojedaf/IC-TIR-Lol.
Submitted 30 July, 2020;
originally announced July 2020.
-
On Adversarial Examples for Biomedical NLP Tasks
Authors:
Vladimir Araujo,
Andres Carvallo,
Carlos Aspillaga,
Denis Parra
Abstract:
The success of pre-trained word embeddings has motivated their use in tasks in the biomedical domain. The BERT language model has shown remarkable results on standard performance metrics in tasks such as Named Entity Recognition (NER) and Semantic Textual Similarity (STS), which has brought significant progress in the field of NLP. However, it is unclear whether these systems work seemingly well in critical domains, such as legal or medical. For that reason, in this work, we propose an adversarial evaluation scheme on two well-known datasets for medical NER and STS. We propose two types of attacks inspired by natural spelling errors and typos made by humans. We also propose another type of attack that uses synonyms of medical terms. Under these adversarial settings, the accuracy of the models drops significantly, and we quantify the extent of this performance loss. We also show that we can significantly improve the robustness of the models by training them with adversarial examples. We hope our work will motivate the use of adversarial examples to evaluate and develop models with increased robustness for medical tasks.
Submitted 23 April, 2020;
originally announced April 2020.
-
Analyzing Network Effects on a Fanfiction Community
Authors:
Andrés Carvallo,
Denis Parra
Abstract:
Since the early days of the Web 2.0, online communities have been growing quickly and have become an important part of life for a large number of people. In one of these communities, fanfiction.net, users can read and write stories which are adapted, recreated and modified from original famous books, TV series and movies, among others. By following stories and their authors, the fanfiction community creates a social network. Previous research on online communities has shown how features of the social network can help explain the behavior of the community, so we are interested in studying fanfiction's social network as well as its influence on aspects of the community. In particular, in this article we describe several properties of the members of the community, and we also try to discover which factors explain the popularity of the authors. We discover that time since joining fanfiction and the size of the authors' biography have a negative effect on the authors' popularity. Moreover, we show that the users' network metrics help to better explain authors' popularity.
Submitted 7 August, 2020; v1 submitted 6 September, 2019;
originally announced September 2019.
-
Scaling notifications beyond alerts: from subtly drawing attention up to forcing the user to take action
Authors:
Denys J. C. Matthies,
Laura Milena Daza Parra,
Bodo Urban
Abstract:
New computational devices, in particular wearable devices, offer the unique property of always being available and thus being able to constantly update the user with information, such as by notifications. While research has been done on sophisticated notifications, devices today mainly stick to a binary level of information, in that they are either attention drawing or silent. In this paper, we want to go further and propose scalable notifications, which adjust the intensity reaching from subtle to obtrusive and even going beyond that level, while forcing the user to take action. To illustrate the technical feasibility and validity of this concept, we developed three prototypes providing mechano-pressure, thermal, and electrical feedback and evaluated them in different lab studies. Our first prototype provides subtle poking through to high and frequent pressure on the user's spine, which creates a significantly improved back posture. In a second scenario, the users are enabled to perceive the overuse of a drill by an increased temperature on the palm of a hand until the heat is intolerable and the users are forced to eventually put down the tool. The last project comprises a speed control in a driving simulation, while electric muscle stimulation on the users' legs conveys information on changing the car's speed by a perceived tingling until the system independently forces the foot to move. Although our selected scenarios are a long way from being realistic, we see these lab studies as a means to validate our proof-of-concept. In conclusion, all studies' findings support the feasibility of our concept of a scalable notification system, including the system of forced intervention. While we envisage the implementation of our proof-of-concept into future wearables, more realistic application scenarios are worthy of exploration.
Submitted 6 August, 2018;
originally announced August 2018.
-
Do Better ImageNet Models Transfer Better... for Image Recommendation?
Authors:
Felipe del Rio,
Pablo Messina,
Vicente Dominguez,
Denis Parra
Abstract:
Visual embeddings from Convolutional Neural Networks (CNN) trained on the ImageNet dataset for the ILSVRC challenge have shown consistently good performance for transfer learning and are widely used in several tasks, including image recommendation. However, some important questions have not yet been answered in order to use these embeddings for a larger scope of recommendation domains: a) Are CNNs that perform better on ImageNet also better for transfer learning in content-based image recommendation? b) Does fine-tuning help to improve performance? and c) What is the best way to perform the fine-tuning?
In this paper we compare several CNN models pre-trained with ImageNet to evaluate their transfer learning performance to an artwork image recommendation task. Our results indicate that models with better performance in the ImageNet challenge do not always imply better transfer learning for recommendation tasks (e.g. NASNet vs. ResNet). Our results also show that fine-tuning can be helpful even with a small dataset, but not every fine-tuning works. Our results can inform other researchers and practitioners on how to train their CNNs for better transfer learning towards image recommendation systems.
Submitted 25 September, 2018; v1 submitted 25 July, 2018;
originally announced July 2018.
-
Toward Finding Latent Cities with Non-Negative Matrix Factorization
Authors:
Eduardo Graells-Garrido,
Diego Caro,
Denis Parra
Abstract:
In the last decade, digital footprints have been used to cluster population activity into functional areas of cities.
However, a key aspect has been overlooked: we experience our cities not only by performing activities at specific destinations, but also by moving from one place to another.
In this paper, we propose to analyze and cluster the city based on how people move through it. Particularly, we introduce Mobilicities, automatically generated travel patterns inferred from mobile phone network data using NMF, a matrix factorization model.
We evaluate our method in a large city and we find that mobilicities reveal latent but at the same time interpretable mobility structures of the city. Our results provide evidence on how clustering and visualization of aggregated phone logs could be used in planning systems to interactively analyze city structure and population activity.
Submitted 27 January, 2018;
originally announced January 2018.
-
Comparing Neural and Attractiveness-based Visual Features for Artwork Recommendation
Authors:
Vicente Dominguez,
Pablo Messina,
Denis Parra,
Domingo Mery,
Christoph Trattner,
Alvaro Soto
Abstract:
Advances in image processing and computer vision in recent years have brought about the use of visual features in artwork recommendation. Recent works have shown that visual features obtained from pre-trained deep neural networks (DNNs) perform very well for recommending digital art. Other recent works have shown that explicit visual features (EVF) based on attractiveness can perform well in preference prediction tasks, but no previous work has compared DNN features versus specific attractiveness-based visual features (e.g. brightness, texture) in terms of recommendation performance. In this work, we study and compare the performance of DNN and EVF features for the purpose of physical artwork recommendation using transactional data from UGallery, an online store of physical paintings. In addition, we perform an exploratory analysis to understand whether DNN embedded features have some relation with certain EVF. Our results show that DNN features outperform EVF and that certain EVF features are more suited for physical artwork recommendation. Finally, we show evidence that certain neurons in the DNN might be partially encoding visual features such as brightness, providing an opportunity for explaining recommendations based on visual neural models.
Submitted 21 July, 2017; v1 submitted 22 June, 2017;
originally announced June 2017.
-
Towards a Recommender System for Undergraduate Research
Authors:
Felipe del-Rio,
Denis Parra,
Jovan Kuzmicic,
Erick Svec
Abstract:
Several studies indicate that attracting students to research careers requires engaging them from their early undergraduate years. Following this paradigm, our Engineering School has developed an undergraduate research program that allows students to enroll in research in exchange for course credits. Moreover, we developed a web portal to inform students about the program and the opportunities, but participation remains lower than expected. In order to promote student engagement, we attempt to build a personalized recommender system of research opportunities for undergraduates. With this goal in mind we investigate two tasks. The first identifies students who are more willing to participate in this kind of program. The second generates a personalized list of recommendations of research opportunities for each student. To evaluate our approach, we perform a simulated prediction experiment with data from our School, which currently has more than 4,000 active undergraduate students. Our results indicate that there is great potential to create a personalized recommender system for this purpose. Our results can be used as a baseline for colleges seeking strategies to encourage research activities among undergraduate students.
Submitted 20 June, 2017;
originally announced June 2017.
-
pyRecLab: A Software Library for Quick Prototyping of Recommender Systems
Authors:
Gabriel Sepulveda,
Vicente Dominguez,
Denis Parra
Abstract:
This paper introduces pyRecLab, a software library written in C++ with Python bindings which allows users to quickly train, test and develop recommender systems. Although there are several software libraries for this purpose, only a few let developers get started quickly with the most traditional methods, permitting them to try different parameters and approach several tasks without a significant loss of performance. The few libraries that have all these features are available in languages such as Java, Scala or C#, which is a disadvantage for less experienced programmers more used to the popular Python programming language. In this article we introduce details of pyRecLab, as well as a performance analysis in terms of error metrics (MAE and RMSE) and train/test time. We benchmark it against the popular Java-based library LibRec, showing similar results. We expect programmers with little experience and people interested in quickly prototyping recommender systems to benefit from pyRecLab.
Submitted 11 July, 2017; v1 submitted 20 June, 2017;
originally announced June 2017.
-
Exploring Content-based Artwork Recommendation with Metadata and Visual Features
Authors:
Pablo Messina,
Vicente Dominguez,
Denis Parra,
Christoph Trattner,
Alvaro Soto
Abstract:
Compared to other areas, artwork recommendation has received little attention, despite the continuous growth of the artwork market. Previous research has relied on ratings and metadata to make artwork recommendations, as well as visual features extracted with deep neural networks (DNN). However, these features have no direct interpretation to explicit visual features (e.g. brightness, texture) which might hinder explainability and user-acceptance. In this work, we study the impact of artwork metadata as well as visual features (DNN-based and attractiveness-based) for physical artwork recommendation, using images and transaction data from the UGallery online artwork store.
Our results indicate that: (i) visual features perform better than manually curated data, (ii) DNN-based visual features perform better than attractiveness-based ones, and (iii) a hybrid approach improves the performance further. Our research can inform the development of new artwork recommenders relying on diverse content data.
Submitted 23 October, 2017; v1 submitted 19 June, 2017;
originally announced June 2017.
-
EpistAid: Interactive Interface for Document Filtering in Evidence-based Health Care
Authors:
Ivania Donoso,
Denis Parra
Abstract:
Evidence-based health care (EBHC) is an important practice of medicine which attempts to provide systematic scientific evidence to answer clinical questions. In this context, Epistemonikos (www.epistemonikos.org) is one of the first and most important online systems in the field, providing an interface that supports users in searching and filtering scientific articles for practicing EBHC. The system currently requires a large amount of expert human effort, with close to 500 physicians manually curating articles to be utilized in the platform. In order to scale up to the large and continuous amount of data needed to keep the system updated, we introduce EpistAid, an interactive intelligent interface which supports clinicians in the process of curating documents for Epistemonikos within lists of papers called evidence matrices. We introduce the characteristics, design and algorithms of our solution, as well as a prototype implementation and a case study to show how our solution addresses the information overload problem in this area.
Submitted 7 November, 2016;
originally announced November 2016.
-
Language, Twitter and Academic Conferences
Authors:
Ruth García,
Diego Gómez,
Denis Parra,
Christoph Trattner,
Andreas Kaltenbrunner,
Eduardo Graells-Garrido
Abstract:
Using Twitter during academic conferences is a way of engaging and connecting an audience inherently multicultural by the nature of scientific collaboration. English is expected to be the lingua franca bridging the communication and integration between native speakers of different mother tongues. However, little research has been done to support this assumption. In this paper we analyzed how integrated language communities are by analyzing the scholars' tweets used in 26 Computer Science conferences over a time span of five years. We found that although English is the most popular language used to tweet during conferences, a significant proportion of people also tweet in other languages. In addition, people who tweet solely in English interact mostly within the same group (English monolinguals), while people who speak other languages tend to show a more diverse interaction with other lingua groups. Finally, we also found that the people who interact with other Twitter users show a more diverse language distribution, while people who do not interact mostly post tweets in a single language. These results suggest a relation between the number of languages a user speaks, which can affect the interaction dynamics of online communities.
Submitted 13 April, 2015;
originally announced April 2015.
-
Identifying Relevant Messages in a Twitter-based Citizen Channel for Natural Disaster Situations
Authors:
Alfredo Cobo,
Denis Parra,
Jaime Navón
Abstract:
During recent years the online social networks (in particular Twitter) have become an important alternative information channel to traditional media during natural disasters, but the amount and diversity of messages poses the challenge of information overload to end users. The goal of our research is to develop an automatic classifier of tweets to feed a mobile application that reduces the difficulties that citizens face to get relevant information during natural disasters. In this paper, we present in detail the process to build a classifier that filters tweets relevant and non-relevant to an earthquake. By using a dataset from the Chilean earthquake of 2010, we first build and validate a ground truth, and then we contribute by presenting in detail the effect of class imbalance and dimensionality reduction over 5 classifiers. We show how the performance of these models is affected by these variables, providing important considerations at the moment of building these systems.
Submitted 18 March, 2015;
originally announced March 2015.
-
Recommending Items in Social Tagging Systems Using Tag and Time Information
Authors:
Emanuel Lacic,
Dominik Kowald,
Paul Seitlinger,
Christoph Trattner,
Denis Parra
Abstract:
In this work we present a novel item recommendation approach that aims at improving Collaborative Filtering (CF) in social tagging systems using the information about tags and time. Our algorithm follows a two-step approach, where in the first step a potentially interesting candidate item-set is found using user-based CF and in the second step this candidate item-set is ranked using item-based CF. Within this ranking step we integrate the information of tag usage and time using the Base-Level Learning (BLL) equation coming from human memory theory that is used to determine the reuse-probability of words and tags using a power-law forgetting function.
Our extensive evaluation on datasets gathered from three social tagging systems (BibSonomy, CiteULike and MovieLens) shows that using tag-based and time information via the BLL equation improves the ranking and recommendation of items. The approach can thus be used to realize an effective item recommender that outperforms two alternative algorithms which also exploit time and tag-based information.
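As a rough illustration of the Base-Level Learning equation mentioned in the abstract, the sketch below computes the standard form of the activation, the natural log of the sum of (time since each use) raised to the power -d. The function name `bll_activation`, the time units, and the default decay `d = 0.5` are assumptions made for this example; the abstract does not specify how the authors combine this value with the item-based CF score, so that step is not shown.

```python
import math


def bll_activation(usage_times, now, d=0.5):
    """Base-level activation of a tag: ln(sum over past uses of (now - t)^(-d)).

    usage_times: timestamps at which the tag was used (hypothetical units).
    now: reference timestamp; uses at or after `now` are ignored.
    d: power-law decay exponent (0.5 is an assumed default for this sketch).
    """
    ages = [now - t for t in usage_times if t < now]
    if not ages:
        # A tag with no past use has no activation.
        return float("-inf")
    return math.log(sum(age ** -d for age in ages))


if __name__ == "__main__":
    # A tag used recently is more activated than one used long ago.
    print(bll_activation([9], now=10))
    print(bll_activation([6], now=10))
    # Repeated use increases activation.
    print(bll_activation([6, 9], now=10))
```

Because of the power-law term, recent and frequent uses of a tag both raise its activation, which matches the forgetting behaviour described in the abstract.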
Submitted 30 June, 2014;
originally announced June 2014.
-
Utilizing Online Social Network and Location-Based Data to Recommend Products and Categories in Online Marketplaces
Authors:
Emanuel Lacic,
Dominik Kowald,
Lukas Eberhard,
Christoph Trattner,
Denis Parra,
Leandro Marinho
Abstract:
Recent research has unveiled the importance of online social networks for improving the quality of recommender systems and encouraged the research community to investigate better ways of exploiting social information for recommendations. To contribute to this sparse field of research, in this paper we exploit users' interactions across three data sources (marketplace, social network and location-based) to assess their performance in a barely studied domain: recommending products and domains of interest (i.e., product categories) to people in an online marketplace environment. To that end, we defined sets of content- and network-based user similarity features for each data source and studied them in isolation using a user-based Collaborative Filtering (CF) approach and in combination via a hybrid recommender algorithm, to assess which one provides the best recommendation performance. Interestingly, in our experiments conducted on a rich dataset collected from SecondLife, a popular online virtual world, we found that recommenders relying on user similarity features obtained from the social network data clearly yielded the most accurate results when predicting products, whereas the features obtained from the marketplace and location-based data sources also obtained very good results when predicting categories. This finding indicates that all three types of data sources are important and should be taken into account depending on the level of specialization of the recommendation task.
Submitted 8 September, 2014; v1 submitted 8 May, 2014;
originally announced May 2014.
-
Twitter in Academic Conferences: Usage, Networking and Participation over Time
Authors:
Xidao Wen,
Yu-Ru Lin,
Christoph Trattner,
Denis Parra
Abstract:
Twitter is often referred to as a backchannel for conferences. While the main conference takes place in a physical setting, attendees and virtual attendees socialize, introduce new ideas or broadcast information by microblogging on Twitter. In this paper we analyze scholars' Twitter use in 16 Computer Science conferences over a timespan of five years. Our primary finding is that over the years there are increasing differences between conversation use and information use on Twitter. We studied the interaction network between users to understand whether assumptions about the structure of the conversations hold over time and across different types of interactions, such as retweets, replies, and mentions. While 'people come and people go', we want to understand what makes people stay with the conference on Twitter. By casting the problem as a classification task, we find different factors that contribute to users' continuing participation in the online Twitter conference activity. These results have implications for research communities seeking to implement strategies for continuous and active participation among members.
Submitted 30 March, 2014;
originally announced March 2014.