
SCRum-9: Multilingual Stance Classification over Rumours on Social Media

Authors: Yue Li, Jake Vasilakes, Zhixue Zhao, Carolina Scarton

Abstract: We introduce SCRum-9, a multilingual dataset for Rumour Stance Classification, containing 7,516 tweet-reply pairs from X. SCRum-9 goes beyond existing stance classification datasets by covering more languages (9), linking examples to more fact-checked claims (2.1k), and including complex annotations from multiple annotators to account for intra- and inter-annotator variability. Annotations were made by at least three native speakers per language, totalling around 405 hours of annotation and 8,150 dollars in compensation. Experiments on SCRum-9 show that it is a challenging benchmark for both state-of-the-art LLMs (e.g. Deepseek) and fine-tuned pre-trained models, motivating future work in this area.

Submitted 24 May, 2025; originally announced May 2025.

arXiv:2504.00589 [pdf, ps, other]

Efficient Annotator Reliability Assessment with EffiARA

Authors: Owen Cook, Jake Vasilakes, Ian Roberts, Xingyi Song

Abstract: Data annotation is an essential component of the machine learning pipeline; it is also a costly and time-consuming process. With the introduction of transformer-based models, annotation at the document level is increasingly popular; however, there is no standard framework for structuring such tasks. The EffiARA annotation framework is, to our knowledge, the first project to support the whole annotation pipeline, from understanding the resources required for an annotation task to compiling the annotated dataset and gaining insights into the reliability of individual annotators as well as the dataset as a whole. The framework's efficacy is supported by two previous studies: one improving classification performance through annotator-reliability-based soft-label aggregation and sample weighting, and the other increasing the overall agreement among annotators through identifying and replacing an unreliable annotator. This work introduces the EffiARA Python package and its accompanying webtool, which provides an accessible graphical user interface for the system. We open-source the EffiARA Python package at https://github.com/MiniEggz/EffiARA and the webtool is publicly accessible at https://effiara.gate.ac.uk.

Submitted 3 June, 2025; v1 submitted 1 April, 2025; originally announced April 2025.

arXiv:2502.12225 [pdf, other]

Subjective Logic Encodings

Authors: Jake Vasilakes, Chrysoula Zerva, Sophia Ananiadou

Abstract: Many existing approaches for learning from labeled data assume the existence of gold-standard labels. According to these approaches, inter-annotator disagreement is seen as noise to be removed, either through refinement of annotation guidelines, label adjudication, or label filtering. However, annotator disagreement can rarely be totally eradicated, especially on more subjective tasks such as sentiment analysis or hate speech detection where disagreement is natural. Therefore, a new approach to learning from labeled data, called data perspectivism, seeks to leverage inter-annotator disagreement to learn models that stay true to the inherent uncertainty of the task by treating annotations as opinions of the annotators, rather than gold-standard facts. Despite this conceptual grounding, existing methods under data perspectivism are limited to using disagreement as the sole source of annotation uncertainty. To expand the possibilities of data perspectivism, we introduce Subjective Logic Encodings (SLEs), a flexible framework for constructing classification targets that explicitly encodes annotations as opinions of the annotators. Based on Subjective Logic Theory, SLEs encode labels as Dirichlet distributions and provide principled methods for encoding and aggregating various types of annotation uncertainty -- annotator confidence, reliability, and disagreement -- into the targets. We show that SLEs are a generalization of other types of label encodings as well as how to estimate models to predict SLEs using a distribution matching objective.

Submitted 20 March, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

Comments: We make our code publicly available at https://github.com/jvasilakes/SLEncodings

arXiv:2501.17654 [pdf, other]

Exploring Vision Language Models for Multimodal and Multilingual Stance Detection

Authors: Jake Vasilakes, Carolina Scarton, Zhixue Zhao

Abstract: Social media's global reach amplifies the spread of information, highlighting the need for robust Natural Language Processing tasks like stance detection across languages and modalities. Prior research predominantly focuses on text-only inputs, leaving multimodal scenarios, such as those involving both images and text, relatively underexplored. Meanwhile, the prevalence of multimodal posts has increased significantly in recent years. Although state-of-the-art Vision-Language Models (VLMs) show promise, their performance on multimodal and multilingual stance detection tasks remains largely unexamined. This paper evaluates state-of-the-art VLMs on a newly extended dataset covering seven languages and multimodal inputs, investigating their use of visual cues, language-specific performance, and cross-modality interactions. Our results show that VLMs generally rely more on text than images for stance detection and this trend persists across languages. Additionally, VLMs rely significantly more on text contained within the images than other visual content. Regarding multilinguality, the models studied tend to generate consistent predictions across languages whether they are explicitly multilingual or not, although there are outliers that are incongruous with macro F1, language support, and model size.

Submitted 29 January, 2025; originally announced January 2025.

Comments: Submitted to the International AAAI Conference on Web and Social Media (ICWSM) 2025

arXiv:2406.15443 [pdf, other]

ExU: AI Models for Examining Multilingual Disinformation Narratives and Understanding their Spread

Authors: Jake Vasilakes, Zhixue Zhao, Ivan Vykopal, Michal Gregor, Martin Hyben, Carolina Scarton

Abstract: Addressing online disinformation requires analysing narratives across languages to help fact-checkers and journalists sift through large amounts of data. The ExU project focuses on developing AI-based models for multilingual disinformation analysis, addressing the tasks of rumour stance classification and claim retrieval. We describe the ExU project proposal and summarise the results of a user requirements survey regarding the design of tools to support fact-checking.

Submitted 30 May, 2024; originally announced June 2024.

Comments: Accepted at The 25th Annual Conference of The European Association for Machine Translation (EAMT 24)

arXiv:2204.00511 [pdf, other]

Learning Disentangled Representations of Negation and Uncertainty

Authors: Jake Vasilakes, Chrysoula Zerva, Makoto Miwa, Sophia Ananiadou

Abstract: Negation and uncertainty modeling are long-standing tasks in natural language processing. Linguistic theory postulates that expressions of negation and uncertainty are semantically independent from each other and the content they modify. However, previous works on representation learning do not explicitly model this independence. We therefore attempt to disentangle the representations of negation, uncertainty, and content using a Variational Autoencoder. We find that simply supervising the latent representations results in good disentanglement, but auxiliary objectives based on adversarial learning and mutual information minimization can provide additional disentanglement gains.

Submitted 1 April, 2022; originally announced April 2022.

Comments: Accepted to ACL 2022. 18 pages, 7 figures. Code and data are available at https://github.com/jvasilakes/disentanglement-vae

arXiv:2108.02255 [pdf]

An Empirical Study of UMLS Concept Extraction from Clinical Notes using Boolean Combination Ensembles

Authors: Greg M. Silverman, Raymond L. Finzel, Michael V. Heinz, Jake Vasilakes, Jacob C. Solinsky, Reed McEwan, Benjamin C. Knoll, Christopher J. Tignanelli, Hongfang Liu, Hua Xu, Xiaoqian Jiang, Genevieve B. Melton, Serguei VS Pakhomov

Abstract: Our objective in this study is to investigate the behavior of Boolean operators on combining annotation output from multiple Natural Language Processing (NLP) systems across multiple corpora and to assess how filtering by aggregation of Unified Medical Language System (UMLS) Metathesaurus concepts affects system performance for Named Entity Recognition (NER) of UMLS concepts. We used three corpora annotated for UMLS concepts: 2010 i2b2 VA challenge set (31,161 annotations), Multi-source Integrated Platform for Answering Clinical Questions (MiPACQ) corpus (17,457 annotations including UMLS concept unique identifiers), and Fairview Health Services corpus (44,530 annotations). Our results showed that for UMLS concept matching, Boolean ensembling of the MiPACQ corpus trended towards higher performance over individual systems. Use of an approximate grid-search can help optimize the precision-recall tradeoff and can provide a set of heuristics for choosing an optimal set of ensembles.

Submitted 4 August, 2021; originally announced August 2021.

arXiv:2106.12741 [pdf]

Discovering novel drug-supplement interactions using a dietary supplements knowledge graph generated from the biomedical literature

Authors: Dalton Schutte, Jake Vasilakes, Anu Bompelli, Yuqi Zhou, Marcelo Fiszman, Hua Xu, Halil Kilicoglu, Jeffrey R. Bishop, Terrence Adam, Rui Zhang

Abstract: OBJECTIVE: Leverage existing biomedical NLP tools and DS domain terminology to produce a novel and comprehensive knowledge graph containing dietary supplement (DS) information for discovering interactions between DS and drugs, or Drug-Supplement Interactions (DSI). MATERIALS AND METHODS: We created SemRepDS (an extension of SemRep), capable of extracting semantic relations from abstracts by leveraging a DS-specific terminology (iDISK) containing 28,884 DS terms not found in the UMLS. PubMed abstracts were processed using SemRepDS to generate semantic relations, which were then filtered using a PubMedBERT-based model to remove incorrect relations before generating our knowledge graph (SuppKG). Two pathways are used to identify potential DS-Drug interactions which are then evaluated by medical professionals for mechanistic plausibility. RESULTS: Comparison analysis found that SemRepDS returned 206.9% more DS relations and 158.5% more DS entities than SemRep. The fine-tuned BERT model obtained an F1 score of 0.8605 and removed 43.86% of the relations, improving the precision of the relations by 26.4% compared to pre-filtering. SuppKG consists of 2,928 DS-specific nodes. Manual review of findings identified 44 (88%) proposed DS-Gene-Drug and 32 (64%) proposed DS-Gene1-Function-Gene2-Drug pathways to be mechanistically plausible. DISCUSSION: The additional relations extracted using SemRepDS generated SuppKG that was used to find plausible DSI not found in the current literature. By the nature of the SuppKG, these interactions are unlikely to have been found using SemRep without the expanded DS terminology. CONCLUSION: We successfully extend SemRep to include DS information and produce SuppKG which can be used to find potential DS-Drug interactions.

Submitted 23 June, 2021; originally announced June 2021.

Comments: 14 pages, 4 tables, 1 figure

arXiv:1906.03171 [pdf]

Analyzing Social Media Data to Understand Consumers' Information Needs on Dietary Supplements

Authors: Rubina F. Rizvi, Yefeng Wang, Thao Nguyen, Jake Vasilakes, Jiang Bian, Zhe He, Rui Zhang

Abstract: Despite the high consumption of dietary supplements (DS), there are not many reliable, relevant, and comprehensive online resources that could satisfy information seekers. The purpose of this research study is to understand consumers' information needs on DS using topic modeling and to evaluate its accuracy in correctly identifying topics from social media. We retrieved 16,095 unique questions posted on Yahoo! Answers relating to 438 unique DS ingredients mentioned in the sub-section "Alternative medicine" under the section "Health". We implemented an unsupervised topic modeling method, Correlation Explanation (CorEx), to unveil the various topics consumers are most interested in. We manually reviewed the keywords of all the 200 topics generated by CorEx and assigned them to 38 health-related categories, corresponding to 12 higher-level groups. We found high accuracy (90-100%) in identifying questions that correctly align with the selected topics. The results could be used to guide us to generate a more comprehensive and structured DS resource based on consumers' information needs.

Submitted 7 June, 2019; originally announced June 2019.

Comments: This paper has been accepted by MEDINFO 2019

Showing 1–9 of 9 results for author: Vasilakes, J