Showing 1–9 of 9 results for author: Vasilakes, J

Searching in archive cs.

  1. arXiv:2505.18916

    cs.CL

    SCRum-9: Multilingual Stance Classification over Rumours on Social Media

    Authors: Yue Li, Jake Vasilakes, Zhixue Zhao, Carolina Scarton

    Abstract: We introduce SCRum-9, a multilingual dataset for Rumour Stance Classification, containing 7,516 tweet-reply pairs from X. SCRum-9 goes beyond existing stance classification datasets by covering more languages (9), linking examples to more fact-checked claims (2.1k), and including complex annotations from multiple annotators to account for intra- and inter-annotator variability. Annotations were ma…

    Submitted 24 May, 2025; originally announced May 2025.

  2. arXiv:2504.00589

    cs.CL cs.LG

    Efficient Annotator Reliability Assessment with EffiARA

    Authors: Owen Cook, Jake Vasilakes, Ian Roberts, Xingyi Song

    Abstract: Data annotation is an essential component of the machine learning pipeline; it is also a costly and time-consuming process. With the introduction of transformer-based models, annotation at the document level is increasingly popular; however, there is no standard framework for structuring such tasks. The EffiARA annotation framework is, to our knowledge, the first project to support the whole annot…

    Submitted 3 June, 2025; v1 submitted 1 April, 2025; originally announced April 2025.

  3. arXiv:2502.12225

    cs.LG cs.AI

    Subjective Logic Encodings

    Authors: Jake Vasilakes, Chrysoula Zerva, Sophia Ananiadou

    Abstract: Many existing approaches for learning from labeled data assume the existence of gold-standard labels. According to these approaches, inter-annotator disagreement is seen as noise to be removed, either through refinement of annotation guidelines, label adjudication, or label filtering. However, annotator disagreement can rarely be totally eradicated, especially on more subjective tasks such as sent…

    Submitted 20 March, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

    Comments: We make our code publicly available at https://github.com/jvasilakes/SLEncodings

  4. arXiv:2501.17654

    cs.CL cs.AI

    Exploring Vision Language Models for Multimodal and Multilingual Stance Detection

    Authors: Jake Vasilakes, Carolina Scarton, Zhixue Zhao

    Abstract: Social media's global reach amplifies the spread of information, highlighting the need for robust Natural Language Processing tasks like stance detection across languages and modalities. Prior research predominantly focuses on text-only inputs, leaving multimodal scenarios, such as those involving both images and text, relatively underexplored. Meanwhile, the prevalence of multimodal posts has inc…

    Submitted 29 January, 2025; originally announced January 2025.

    Comments: Submitted to the International AAAI Conference on Web and Social Media (ICWSM) 2025

  5. arXiv:2406.15443

    cs.CL cs.AI

    ExU: AI Models for Examining Multilingual Disinformation Narratives and Understanding their Spread

    Authors: Jake Vasilakes, Zhixue Zhao, Ivan Vykopal, Michal Gregor, Martin Hyben, Carolina Scarton

    Abstract: Addressing online disinformation requires analysing narratives across languages to help fact-checkers and journalists sift through large amounts of data. The ExU project focuses on developing AI-based models for multilingual disinformation analysis, addressing the tasks of rumour stance classification and claim retrieval. We describe the ExU project proposal and summarise the results of a user req…

    Submitted 30 May, 2024; originally announced June 2024.

    Comments: Accepted at The 25th Annual Conference of The European Association for Machine Translation (EAMT 24)

  6. arXiv:2204.00511

    cs.CL cs.LG

    Learning Disentangled Representations of Negation and Uncertainty

    Authors: Jake Vasilakes, Chrysoula Zerva, Makoto Miwa, Sophia Ananiadou

    Abstract: Negation and uncertainty modeling are long-standing tasks in natural language processing. Linguistic theory postulates that expressions of negation and uncertainty are semantically independent from each other and the content they modify. However, previous works on representation learning do not explicitly model this independence. We therefore attempt to disentangle the representations of negation,…

    Submitted 1 April, 2022; originally announced April 2022.

    Comments: Accepted to ACL 2022. 18 pages, 7 figures. Code and data are available at https://github.com/jvasilakes/disentanglement-vae

  7. arXiv:2108.02255

    cs.CL

    An Empirical Study of UMLS Concept Extraction from Clinical Notes using Boolean Combination Ensembles

    Authors: Greg M. Silverman, Raymond L. Finzel, Michael V. Heinz, Jake Vasilakes, Jacob C. Solinsky, Reed McEwan, Benjamin C. Knoll, Christopher J. Tignanelli, Hongfang Liu, Hua Xu, Xiaoqian Jiang, Genevieve B. Melton, Serguei VS Pakhomov

    Abstract: Our objective in this study is to investigate the behavior of Boolean operators on combining annotation output from multiple Natural Language Processing (NLP) systems across multiple corpora and to assess how filtering by aggregation of Unified Medical Language System (UMLS) Metathesaurus concepts affects system performance for Named Entity Recognition (NER) of UMLS concepts. We used three corpora…

    Submitted 4 August, 2021; originally announced August 2021.

  8. arXiv:2106.12741

    cs.IR cs.CL

    Discovering novel drug-supplement interactions using a dietary supplements knowledge graph generated from the biomedical literature

    Authors: Dalton Schutte, Jake Vasilakes, Anu Bompelli, Yuqi Zhou, Marcelo Fiszman, Hua Xu, Halil Kilicoglu, Jeffrey R. Bishop, Terrence Adam, Rui Zhang

    Abstract: OBJECTIVE: Leverage existing biomedical NLP tools and DS domain terminology to produce a novel and comprehensive knowledge graph containing dietary supplement (DS) information for discovering interactions between DS and drugs, or Drug-Supplement Interactions (DSI). MATERIALS AND METHODS: We created SemRepDS (an extension of SemRep), capable of extracting semantic relations from abstracts by levera…

    Submitted 23 June, 2021; originally announced June 2021.

    Comments: 14 pages, 4 tables, 1 figure

  9. arXiv:1906.03171

    cs.CY

    Analyzing Social Media Data to Understand Consumers' Information Needs on Dietary Supplements

    Authors: Rubina F. Rizvi, Yefeng Wang, Thao Nguyen, Jake Vasilakes, Jiang Bian, Zhe He, Rui Zhang

    Abstract: Despite the high consumption of dietary supplements (DS), there are not many reliable, relevant, and comprehensive online resources that could satisfy information seekers. The purpose of this research study is to understand consumers' information needs on DS using topic modeling and to evaluate its accuracy in correctly identifying topics from social media. We retrieved 16,095 unique questions pos…

    Submitted 7 June, 2019; originally announced June 2019.

    Comments: This paper has been accepted by MEDINFO 2019