
Verifying Cross-modal Entity Consistency in News using Vision-language Models

Authors: Sahar Tahmasebi, David Ernst, Eric Müller-Budack, Ralph Ewerth

Abstract: The web has become a crucial source of information, but it is also used to spread disinformation, often conveyed through multiple modalities like images and text. The identification of inconsistent cross-modal information, in particular entities such as persons, locations, and events, is critical to detect disinformation. Previous works either identify out-of-context disinformation by assessing the consistency of images to the whole document, neglecting relations of individual entities, or focus on generic entities that are not relevant to news. So far, only a few approaches have addressed the task of validating entity consistency between images and text in news. However, the potential of large vision-language models (LVLMs) has not been explored yet. In this paper, we propose an LVLM-based framework for verifying Cross-modal Entity Consistency (LVLM4CEC), to assess whether persons, locations and events in news articles are consistent across both modalities. We suggest effective prompting strategies for LVLMs for entity verification that leverage reference images crawled from the web. Moreover, we extend three existing datasets for the task of entity verification in news providing manual ground-truth data. Our results show the potential of LVLMs for automating cross-modal entity verification, showing improved accuracy in identifying persons and events when using evidence images. Moreover, our method outperforms a baseline for location and event verification in documents. The datasets and source code are available on GitHub at https://github.com/TIBHannover/LVLM4CEC.

Submitted 31 January, 2025; v1 submitted 20 January, 2025; originally announced January 2025.

Comments: Accepted for publication in: European Conference on Information Retrieval (ECIR) 2025

arXiv:2407.14321

Multimodal Misinformation Detection using Large Vision-Language Models

Authors: Sahar Tahmasebi, Eric Müller-Budack, Ralph Ewerth

Abstract: The increasing proliferation of misinformation and its alarming impact have motivated both industry and academia to develop approaches for misinformation detection and fact checking. Recent advances in large language models (LLMs) have shown remarkable performance in various tasks, but whether and how LLMs could help with misinformation detection remains relatively underexplored. Most existing state-of-the-art approaches either do not consider evidence and solely focus on claim-related features or assume the evidence to be provided. Few approaches consider evidence retrieval as part of the misinformation detection but rely on fine-tuning models. In this paper, we investigate the potential of LLMs for misinformation detection in a zero-shot setting. We incorporate an evidence retrieval component into the process as it is crucial to gather pertinent information from various sources to detect the veracity of claims. To this end, we propose a novel re-ranking approach for multimodal evidence retrieval using both LLMs and large vision-language models (LVLMs). The retrieved evidence samples (images and texts) serve as the input for an LVLM-based approach for multimodal fact verification (LVLM4FV). To enable a fair evaluation, we address the issue of incomplete ground truth for evidence samples in an existing evidence retrieval dataset by annotating a more complete set of evidence samples for both image and text retrieval. Our experimental results on two datasets demonstrate the superiority of the proposed approach in both evidence retrieval and fact verification tasks and also better generalization capability across datasets compared to the supervised baseline.

Submitted 19 July, 2024; originally announced July 2024.

Comments: Accepted for publication in: Conference on Information and Knowledge Management (CIKM) 2024

arXiv:2311.07453

ChartCheck: Explainable Fact-Checking over Real-World Chart Images

Authors: Mubashara Akhtar, Nikesh Subedi, Vivek Gupta, Sahar Tahmasebi, Oana Cocarascu, Elena Simperl

Abstract: Whilst fact verification has attracted substantial interest in the natural language processing community, verifying misinforming statements against data visualizations such as charts has so far been overlooked. Charts are commonly used in the real world to summarize and communicate key information, but they can also be easily misused to spread misinformation and promote certain agendas. In this paper, we introduce ChartCheck, a novel, large-scale dataset for explainable fact-checking against real-world charts, consisting of 1.7k charts and 10.5k human-written claims and explanations. We systematically evaluate ChartCheck using vision-language and chart-to-table models, and propose a baseline to the community. Finally, we study chart reasoning types and visual attributes that pose a challenge to these models.

Submitted 16 February, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

arXiv:2305.18599

doi 10.1145/3591106.3592230

Improving Generalization for Multimodal Fake News Detection

Authors: Sahar Tahmasebi, Sherzod Hakimov, Ralph Ewerth, Eric Müller-Budack

Abstract: The increasing proliferation of misinformation and its alarming impact have motivated both industry and academia to develop approaches for fake news detection. However, state-of-the-art approaches are usually trained on datasets of smaller size or with a limited set of specific topics. As a consequence, these models lack generalization capabilities and are not applicable to real-world data. In this paper, we propose three models that adopt and fine-tune state-of-the-art multimodal transformers for multimodal fake news detection. We conduct an in-depth analysis by manipulating the input data to explore model performance in realistic use cases on social media. Our study across multiple models demonstrates that these systems suffer significant performance drops against manipulated data. To reduce the bias and improve model generalization, we suggest training data augmentation to conduct more meaningful experiments for fake news detection on social media. The proposed data augmentation techniques enable models to generalize better and yield improved state-of-the-art results.

Submitted 29 May, 2023; originally announced May 2023.

Comments: This paper has been accepted for ICMR 2023

arXiv:2006.03994

A Scalable Architecture for Monitoring IoT Devices Using Ethereum and Fog Computing

Authors: Shirin Tahmasebi, Jafar Habibi, Abolhassan Shamsaie

Abstract: With the recent considerable developments in the Internet of Things (IoT), billions of resource-constrained devices are interconnected through the internet. Monitoring this huge number of IoT devices that are heterogeneous in terms of underlying communication protocols and data format is challenging. The majority of existing IoT device monitoring solutions heavily rely on centralized architectures. Since using centralized architectures comes at the expense of trusting an authority, it has several inherent drawbacks, including vulnerability to security attacks, lack of data privacy, and unauthorized data manipulation. Hence, a new decentralized approach is crucial to remedy these drawbacks. One of the most promising technologies that is widely used to provide decentralization is blockchain. Additionally, to ease the burden of communication overhead and computational power on resource-constrained IoT devices, fog computing can be exploited to decrease communication latency and provide better network scalability. In this paper, we propose a scalable blockchain-based architecture for monitoring IoT devices using fog computing. To demonstrate the feasibility and usability of the proposed solution, we have implemented a proof-of-concept prototype, leveraging Ethereum smart contracts. Finally, a comprehensive evaluation is conducted. The evaluation results indicate that the proposed solution is significantly scalable and compatible with resource-constrained IoT devices.

Submitted 11 November, 2020; v1 submitted 6 June, 2020; originally announced June 2020.

Showing 1–5 of 5 results for author: Tahmasebi, S