Showing 1–20 of 20 results for author: Müller-Budack, E

Searching in archive cs.
  1. arXiv:2501.12751

    cs.IR cs.CV cs.LG

    Patent Figure Classification using Large Vision-language Models

    Authors: Sushil Awale, Eric Müller-Budack, Ralph Ewerth

    Abstract: Patent figure classification facilitates faceted search in patent retrieval systems, enabling efficient prior art search. Existing approaches have explored patent figure classification for only a single aspect and for aspects with a limited number of concepts. In recent years, large vision-language models (LVLMs) have shown tremendous performance across numerous computer vision downstream tasks, h…

    Submitted 22 January, 2025; originally announced January 2025.

  2. arXiv:2501.11403

    cs.CL cs.IR cs.MM

    Verifying Cross-modal Entity Consistency in News using Vision-language Models

    Authors: Sahar Tahmasebi, David Ernst, Eric Müller-Budack, Ralph Ewerth

    Abstract: The web has become a crucial source of information, but it is also used to spread disinformation, often conveyed through multiple modalities like images and text. The identification of inconsistent cross-modal information, in particular entities such as persons, locations, and events, is critical to detect disinformation. Previous works either identify out-of-context disinformation by assessing th…

    Submitted 31 January, 2025; v1 submitted 20 January, 2025; originally announced January 2025.

    Comments: Accepted for publication in: European Conference on Information Retrieval (ECIR) 2025

  3. arXiv:2407.14321

    cs.CL cs.IR cs.MM

    Multimodal Misinformation Detection using Large Vision-Language Models

    Authors: Sahar Tahmasebi, Eric Müller-Budack, Ralph Ewerth

    Abstract: The increasing proliferation of misinformation and its alarming impact have motivated both industry and academia to develop approaches for misinformation detection and fact checking. Recent advances in large language models (LLMs) have shown remarkable performance in various tasks, but whether and how LLMs could help with misinformation detection remains relatively underexplored. Most of existing…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: Accepted for publication in: Conference on Information and Knowledge Management (CIKM) 2024

  4. arXiv:2307.10471

    cs.CV cs.AI cs.DL cs.IR cs.LG

    Classification of Visualization Types and Perspectives in Patents

    Authors: Junaid Ahmed Ghauri, Eric Müller-Budack, Ralph Ewerth

    Abstract: Due to the swift growth of patent applications each year, information and multimedia retrieval approaches that facilitate patent exploration and retrieval are of utmost importance. Different types of visualizations (e.g., graphs, technical drawings) and perspectives (e.g., side view, perspective) are used to visualize details of innovations in patents. The classification of these images enables a…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted at the International Conference on Theory and Practice of Digital Libraries (TPDL) 2023 (they hold the copyright to publish the camera-ready version of this work)

  5. arXiv:2305.18599

    cs.CL cs.IR cs.LG cs.MM

    Improving Generalization for Multimodal Fake News Detection

    Authors: Sahar Tahmasebi, Sherzod Hakimov, Ralph Ewerth, Eric Müller-Budack

    Abstract: The increasing proliferation of misinformation and its alarming impact have motivated both industry and academia to develop approaches for fake news detection. However, state-of-the-art approaches are usually trained on datasets of smaller size or with a limited set of specific topics. As a consequence, these models lack generalization capabilities and are not applicable to real-world data. In thi…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: This paper has been accepted for ICMR 2023

  6. arXiv:2211.08042

    cs.IR

    MM-Locate-News: Multimodal Focus Location Estimation in News

    Authors: Golsa Tahmasebzadeh, Eric Müller-Budack, Sherzod Hakimov, Ralph Ewerth

    Abstract: The consumption of news has changed significantly as the Web has become the most influential medium for information. To analyze and contextualize the large amount of news published every day, the geographic focus of an article is an important aspect in order to enable content-based news retrieval. There are methods and datasets for geolocation estimation from text or photos, but they are typically…

    Submitted 15 November, 2022; originally announced November 2022.

  7. arXiv:2205.01989

    cs.CL cs.AI cs.CV cs.MM cs.SI

    MM-Claims: A Dataset for Multimodal Claim Detection in Social Media

    Authors: Gullal S. Cheema, Sherzod Hakimov, Abdul Sittar, Eric Müller-Budack, Christian Otto, Ralph Ewerth

    Abstract: In recent years, the problem of misinformation on the web has become widespread across languages, countries, and various social media platforms. Although there has been much work on automated fake news detection, the role of images and their variety are not well explored. In this paper, we investigate the roles of image and text at an earlier stage of the fake news detection pipeline, called claim…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: Accepted to Findings of NAACL 2022

  8. arXiv:2110.11107

    cs.CV

    Extraction of Positional Player Data from Broadcast Soccer Videos

    Authors: Jonas Theiner, Wolfgang Gritz, Eric Müller-Budack, Robert Rein, Daniel Memmert, Ralph Ewerth

    Abstract: Computer-aided support and analysis are becoming increasingly important in the modern world of sports. The scouting of potential prospective players, performance as well as match analysis, and the monitoring of training programs rely more and more on data-driven technologies to ensure success. Therefore, many approaches require large amounts of data, which are, however, not easy to obtain in gener…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: Accepted for publication at WACV'22; Preprint

  9. arXiv:2106.09432

    cs.CV cs.LG

    Unsupervised Training Data Generation of Handwritten Formulas using Generative Adversarial Networks with Self-Attention

    Authors: Matthias Springstein, Eric Müller-Budack, Ralph Ewerth

    Abstract: The recognition of handwritten mathematical expressions in images and video frames is a difficult and still unsolved problem. Deep convolutional neural networks are in principle a promising approach, but typically require a large amount of labeled training data. However, such a large training dataset does not exist for the task of handwritten formula recognition. In this paper, we introduce a system tha…

    Submitted 17 June, 2021; originally announced June 2021.

    Comments: Accepted for publication in: ACM International Conference on Multimedia Retrieval (ICMR) Workshop 2021

  10. arXiv:2106.08829

    cs.SI cs.CL cs.CV

    A Fair and Comprehensive Comparison of Multimodal Tweet Sentiment Analysis Methods

    Authors: Gullal S. Cheema, Sherzod Hakimov, Eric Müller-Budack, Ralph Ewerth

    Abstract: Opinion and sentiment analysis is a vital task to characterize subjective information in social media posts. In this paper, we present a comprehensive experimental evaluation and comparison with six state-of-the-art methods, one of which we have re-implemented. In addition, we investigate different textual and visual feature embeddings that cover different aspects of the content, as well…

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: Accepted at the Workshop on Multi-Modal Pre-Training for Multimedia Understanding (MMPT 2021), co-located with ICMR 2021

  11. arXiv:2104.14995

    cs.CV

    Interpretable Semantic Photo Geolocation

    Authors: Jonas Theiner, Eric Mueller-Budack, Ralph Ewerth

    Abstract: Planet-scale photo geolocalization is the complex task of estimating the location depicted in an image solely based on its visual content. Due to the success of convolutional neural networks (CNNs), current approaches achieve super-human performance. However, previous work has exclusively focused on optimizing geolocalization accuracy. Due to the black-box property of deep learning systems, their…

    Submitted 20 October, 2021; v1 submitted 30 April, 2021; originally announced April 2021.

    Comments: Accepted for publication at WACV'22

  12. arXiv:2104.14994

    cs.IR cs.MM

    GeoWINE: Geolocation based Wiki, Image, News and Event Retrieval

    Authors: Golsa Tahmasebzadeh, Endri Kacupaj, Eric Müller-Budack, Sherzod Hakimov, Jens Lehmann, Ralph Ewerth

    Abstract: In the context of social media, geolocation inference on news or events has become a very important task. In this paper, we present the GeoWINE (Geolocation-based Wiki-Image-News-Event retrieval) demonstrator, an effective modular system for multimodal retrieval which expects only a single image as input. The GeoWINE system consists of five modules in order to retrieve related information from var…

    Submitted 4 May, 2021; v1 submitted 30 April, 2021; originally announced April 2021.

    Comments: Accepted for publication in: International ACM SIGIR Conference on Research and Development in Information Retrieval 2021

  13. arXiv:2104.13748

    cs.IR cs.MM

    QuTI! Quantifying Text-Image Consistency in Multimodal Documents

    Authors: Matthias Springstein, Eric Müller-Budack, Ralph Ewerth

    Abstract: The World Wide Web and social media platforms have become popular sources for news and information. Typically, multimodal information, e.g., image and text, is used to convey information more effectively and to attract attention. While in most cases image content is decorative or depicts additional information, it has also been leveraged to spread misinformation and rumors in recent years. In this…

    Submitted 28 April, 2021; originally announced April 2021.

    Comments: Accepted for publication in: International ACM SIGIR Conference on Research and Development in Information Retrieval 2021

  14. arXiv:2103.09602

    cs.SI cs.CL cs.CV

    On the Role of Images for Analyzing Claims in Social Media

    Authors: Gullal S. Cheema, Sherzod Hakimov, Eric Müller-Budack, Ralph Ewerth

    Abstract: Fake news is a severe problem in social media. In this paper, we present an empirical study on visual, textual, and multimodal models for the tasks of claim, claim check-worthiness, and conspiracy detection, all of which are related to fake news detection. Recent work suggests that images are more influential than text and often appear alongside fake text. To this end, several multimodal models ha…

    Submitted 17 March, 2021; originally announced March 2021.

    Comments: CLEOPATRA-2021 Workshop co-located with The Web Conf 2021

  15. arXiv:2011.04714

    cs.CV

    Ontology-driven Event Type Classification in Images

    Authors: Eric Müller-Budack, Matthias Springstein, Sherzod Hakimov, Kevin Mrutzek, Ralph Ewerth

    Abstract: Event classification can add valuable information for semantic search and the increasingly important topic of fact validation in news. So far, only a few approaches address image classification for newsworthy event types such as natural disasters, sports events, or elections. Previous work distinguishes only between a limited number of event types and relies on rather small datasets for training. In…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: Accepted for publication in: IEEE Winter Conference on Applications of Computer Vision (WACV) 2021

  16. arXiv:2007.06390

    cs.CL cs.IR cs.LG

    A Feature Analysis for Multimodal News Retrieval

    Authors: Golsa Tahmasebzadeh, Sherzod Hakimov, Eric Müller-Budack, Ralph Ewerth

    Abstract: Content-based information retrieval is based on the information contained in documents rather than using metadata such as keywords. Most information retrieval methods are either based on text or image. In this paper, we investigate the usefulness of multimodal features for cross-lingual news search in various domains: politics, health, environment, sport, and finance. To this end, we consider five…

    Submitted 1 October, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

    Comments: CLEOPATRA Workshop co-located with ESWC 2020

    Journal ref: CLEOPATRA Workshop co-located with ESWC 2020

  17. arXiv:2003.10421

    cs.CL cs.IR cs.MM

    Multimodal Analytics for Real-world News using Measures of Cross-modal Entity Consistency

    Authors: Eric Müller-Budack, Jonas Theiner, Sebastian Diering, Maximilian Idahl, Ralph Ewerth

    Abstract: The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., enriching text with photos, is typically used to convey the news more effectively or to attract attention. Photo content can be decorative, depict additional important information, or even contain misleading information. Therefore, automatic approaches to quantify cross-moda…

    Submitted 23 October, 2020; v1 submitted 23 March, 2020; originally announced March 2020.

    Comments: Accepted for publication in: International Conference on Multimedia Retrieval (ICMR), Dublin, 2020

  18. arXiv:2001.06823

    cs.CV cs.IR

    SlideImages: A Dataset for Educational Image Classification

    Authors: David Morris, Eric Müller-Budack, Ralph Ewerth

    Abstract: In the past few years, convolutional neural networks (CNNs) have achieved impressive results in computer vision tasks, which however mainly focus on photos with natural scene content. Besides, non-sensor derived images such as illustrations, data visualizations, figures, etc. are typically used to convey complex information or to explore large datasets. However, this kind of images has received li…

    Submitted 19 January, 2020; originally announced January 2020.

    Comments: 8 pages, 2 figures, to be presented at ECIR 2020

  19. arXiv:1910.00412

    cs.LG cs.AI

    "Does 4-4-2 exist?" -- An Analytics Approach to Understand and Classify Football Team Formations in Single Match Situations

    Authors: Eric Müller-Budack, Jonas Theiner, Robert Rein, Ralph Ewerth

    Abstract: The chances to win a football match can be significantly increased if the right tactic is chosen and the behavior of the opposite team is well anticipated. For this reason, every professional football club employs a team of game analysts. However, at present game performance analysis is done manually and is therefore highly time-consuming. Consequently, automated tools to support the analysis process…

    Submitted 2 September, 2019; originally announced October 2019.

    Comments: Accepted at MMSports 2019 (Workshop of ACM Multimedia 2019)

  20. Finding Person Relations in Image Data of the Internet Archive

    Authors: Eric Müller-Budack, Kader Pustu-Iren, Sebastian Diering, Ralph Ewerth

    Abstract: The multimedia content in the World Wide Web is rapidly growing and contains valuable information for many applications in different domains. For this reason, the Internet Archive initiative has been gathering billions of time-versioned web pages since the mid-nineties. However, the huge amount of data is rarely labeled with appropriate metadata and automatic approaches are required to enable sema…

    Submitted 28 May, 2019; v1 submitted 21 June, 2018; originally announced June 2018.

    Journal ref: In: Méndez E., Crestani F., Ribeiro C., David G., Lopes J. (eds) Digital Libraries for Open Knowledge. TPDL 2018. Lecture Notes in Computer Science, vol 11057. Springer, Cham