Showing 1–50 of 99 results for author: Gipp, B

  1. arXiv:2505.20118

    cs.CL cs.CR

    TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent

    Authors: Dominik Meier, Jan Philip Wahle, Paul Röttger, Terry Ruas, Bela Gipp

    Abstract: As large language models (LLMs) become integrated into sensitive workflows, concerns grow over their potential to leak confidential information. We propose TrojanStego, a novel threat model in which an adversary fine-tunes an LLM to embed sensitive context information into natural-looking outputs via linguistic steganography, without requiring explicit control over inference inputs. We introduce a…

    Submitted 27 May, 2025; v1 submitted 26 May, 2025; originally announced May 2025.

    Comments: 9 pages, 5 figures

  2. arXiv:2505.16686

    cs.AI cs.CL

    SPaRC: A Spatial Pathfinding Reasoning Challenge

    Authors: Lars Benedikt Kaesberg, Jan Philip Wahle, Terry Ruas, Bela Gipp

    Abstract: Existing reasoning datasets saturate and fail to test abstract, multi-step problems, especially pathfinding and complex rule constraint satisfaction. We introduce SPaRC (Spatial Pathfinding Reasoning Challenge), a dataset of 1,000 2D grid pathfinding puzzles to evaluate spatial and symbolic reasoning, requiring step-by-step planning with arithmetic and geometric rules. Humans achieve near-perfect…

    Submitted 22 May, 2025; originally announced May 2025.

  3. arXiv:2504.19856

    cs.CL

    Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language

    Authors: Anastasia Zhukova, Christian E. Matt, Terry Ruas, Bela Gipp

    Abstract: Domain-adaptive continual pretraining (DAPT) is a state-of-the-art technique that further trains a language model (LM) on its pretraining task, e.g., language masking. Although popular, it requires a significant corpus of domain-related data, which is difficult to obtain for specific domains in languages other than English, such as the process industry in the German language. This paper introduces…

    Submitted 30 April, 2025; v1 submitted 28 April, 2025; originally announced April 2025.

  4. arXiv:2502.19559

    cs.CL

    Stay Focused: Problem Drift in Multi-Agent Debate

    Authors: Jonas Becker, Lars Benedikt Kaesberg, Andreas Stephan, Jan Philip Wahle, Terry Ruas, Bela Gipp

    Abstract: Multi-agent debate - multiple instances of large language models discussing problems in turn-based interaction - has shown promise for solving knowledge and reasoning tasks. However, these methods show limitations when solving complex problems that require longer reasoning chains. We analyze how multi-agent debate over multiple turns drifts away from the initial problem, thus harming task performa…

    Submitted 21 May, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

    Comments: 34 pages, 10 figures, 8 tables

    ACM Class: A.1; I.2.7

  5. arXiv:2502.19130

    cs.MA cs.AI cs.CL

    Voting or Consensus? Decision-Making in Multi-Agent Debate

    Authors: Lars Benedikt Kaesberg, Jonas Becker, Jan Philip Wahle, Terry Ruas, Bela Gipp

    Abstract: Much of the success of multi-agent debates depends on carefully choosing the right parameters. The decision-making protocol stands out as it can highly impact final model answers, depending on how decisions are reached. Systematic comparison of decision protocols is difficult because many studies alter multiple discussion parameters beyond the protocol. So far, it has been largely unknown how deci…

    Submitted 27 May, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

  6. arXiv:2502.13001

    cs.AI cs.CL

    You need to MIMIC to get FAME: Solving Meeting Transcript Scarcity with a Multi-Agent Conversations

    Authors: Frederic Kirstein, Muneeb Khan, Jan Philip Wahle, Terry Ruas, Bela Gipp

    Abstract: Meeting summarization suffers from limited high-quality data, mainly due to privacy restrictions and expensive collection processes. We address this gap with FAME, a dataset of 500 meetings in English and 300 in German produced by MIMIC, our new multi-agent meeting synthesis framework that generates meeting transcripts on a given knowledge source by defining psychologically grounded participant pr…

    Submitted 30 May, 2025; v1 submitted 18 February, 2025; originally announced February 2025.

    Comments: Accepted at ACL 2025 (Findings)

    Journal ref: ACL 2025 (Findings)

  7. arXiv:2502.11926

    cs.CL

    BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages

    Authors: Shamsuddeen Hassan Muhammad, Nedjma Ousidhoum, Idris Abdulmumin, Jan Philip Wahle, Terry Ruas, Meriem Beloucif, Christine de Kock, Nirmal Surange, Daniela Teodorescu, Ibrahim Said Ahmad, David Ifeoluwa Adelani, Alham Fikri Aji, Felermino D. M. A. Ali, Ilseyar Alimova, Vladimir Araujo, Nikolay Babakov, Naomi Baes, Ana-Maria Bucur, Andiswa Bukula, Guanqun Cao, Rodrigo Tufino Cardenas, Rendi Chevi, Chiamaka Ijeoma Chukwuneke, Alexandra Ciobotaru, Daryna Dementieva, et al. (23 additional authors not shown)

    Abstract: People worldwide use language in subtle and complex ways to express emotions. Although emotion recognition--an umbrella term for several NLP tasks--impacts various applications within NLP and beyond, most work in this area has focused on high-resource languages. This has led to significant disparities in research efforts and proposed solutions, particularly for under-resourced languages, which oft…

    Submitted 29 May, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

    Comments: Accepted at ACL 2025 (Main)

  8. arXiv:2412.10008

    cs.CL

    Automated Collection of Evaluation Dataset for Semantic Search in Low-Resource Domain Language

    Authors: Anastasia Zhukova, Christian E. Matt, Bela Gipp

    Abstract: Domain-specific languages that use a lot of specific terminology often fall into the category of low-resource languages. Collecting test datasets in a narrow domain is time-consuming and requires skilled human resources with domain knowledge and training for the annotation task. This study addresses the challenge of automated collecting test datasets to evaluate semantic search in low-resource dom…

    Submitted 13 December, 2024; originally announced December 2024.

    Comments: Accepted at the First Workshop on Language Models for Low-Resource Languages (LoResLM), co-located with the 31st International Conference on Computational Linguistics (COLING 2025)

  9. arXiv:2411.18444

    cs.CL cs.AI

    Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator

    Authors: Frederic Kirstein, Terry Ruas, Bela Gipp

    Abstract: The quality of meeting summaries generated by natural language generation (NLG) systems is hard to measure automatically. Established metrics such as ROUGE and BERTScore have a relatively low correlation with human judgments and fail to capture nuanced errors. Recent studies suggest using large language models (LLMs), which have the benefit of better context understanding and adaption of error def…

    Submitted 27 November, 2024; originally announced November 2024.

    Journal ref: COLING 2025 Industry Track

  10. arXiv:2411.11081

    cs.CL

    The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection

    Authors: Tomas Horych, Christoph Mandl, Terry Ruas, Andre Greiner-Petter, Bela Gipp, Akiko Aizawa, Timo Spinde

    Abstract: High annotation costs from hiring or crowdsourcing complicate the creation of large, high-quality datasets needed for training reliable text classifiers. Recent research suggests using Large Language Models (LLMs) to automate the annotation process, reducing these costs while maintaining data quality. LLMs have shown promising results in annotating downstream tasks like hate speech detection and p…

    Submitted 24 January, 2025; v1 submitted 17 November, 2024; originally announced November 2024.

  11. Tell me what I need to know: Exploring LLM-based (Personalized) Abstractive Multi-Source Meeting Summarization

    Authors: Frederic Kirstein, Terry Ruas, Robert Kratel, Bela Gipp

    Abstract: Meeting summarization is crucial in digital communication, but existing solutions struggle with salience identification to generate personalized, workable summaries, and context understanding to fully comprehend the meetings' content. Previous attempts to address these issues by considering related supplementary resources (e.g., presentation slides) alongside transcripts are hindered by models' li…

    Submitted 18 October, 2024; originally announced October 2024.

    Journal ref: EMNLP 2024 Industry Track

  12. arXiv:2407.11919

    cs.CL cs.AI

    What's Wrong? Refining Meeting Summaries with LLM Feedback

    Authors: Frederic Kirstein, Terry Ruas, Bela Gipp

    Abstract: Meeting summarization has become a critical task since digital encounters have become a common practice. Large language models (LLMs) show great potential in summarization, offering enhanced coherence and context understanding compared to traditional methods. However, they still struggle to maintain relevance and avoid hallucination. We introduce a multi-LLM correction approach for meeting summari…

    Submitted 16 July, 2024; originally announced July 2024.

    Journal ref: COLING 2025

  13. arXiv:2407.03192

    cs.DL cs.CL

    CiteAssist: A System for Automated Preprint Citation and BibTeX Generation

    Authors: Lars Benedikt Kaesberg, Terry Ruas, Jan Philip Wahle, Bela Gipp

    Abstract: We present CiteAssist, a system to automate the generation of BibTeX entries for preprints, streamlining the process of bibliographic annotation. Our system extracts metadata, such as author names, titles, publication dates, and keywords, to create standardized annotations within the document. CiteAssist automatically attaches the BibTeX citation to the end of a PDF and links it on the first page…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Published at SDProc @ ACL 2024

  14. arXiv:2407.02302

    cs.CL

    Towards Human Understanding of Paraphrase Types in Large Language Models

    Authors: Dominik Meier, Jan Philip Wahle, Terry Ruas, Bela Gipp

    Abstract: Paraphrases represent a human's intuitive ability to understand expressions presented in various different ways. Current paraphrase evaluations of language models primarily use binary approaches, offering limited interpretability of specific text changes. Atomic paraphrase types (APT) decompose paraphrases into different linguistic changes and offer a granular view of the flexibility in linguistic…

    Submitted 18 February, 2025; v1 submitted 2 July, 2024; originally announced July 2024.

    ACM Class: I.2.7

    Journal ref: Proceedings of the 31st International Conference on Computational Linguistics (2025), pages 6298-6316

  15. arXiv:2406.19898

    cs.CL

    Paraphrase Types Elicit Prompt Engineering Capabilities

    Authors: Jan Philip Wahle, Terry Ruas, Yang Xu, Bela Gipp

    Abstract: Much of the success of modern language models depends on finding a suitable prompt to instruct the model. Until now, it has been largely unknown how variations in the linguistic expression of prompts affect these models. This study systematically and empirically evaluates which linguistic features influence models through paraphrase types, i.e., different linguistic changes at particular positions…

    Submitted 10 January, 2025; v1 submitted 28 June, 2024; originally announced June 2024.

    Journal ref: EMNLP 2024

  16. CADS: A Systematic Literature Review on the Challenges of Abstractive Dialogue Summarization

    Authors: Frederic Kirstein, Jan Philip Wahle, Bela Gipp, Terry Ruas

    Abstract: Abstractive dialogue summarization is the task of distilling conversations into informative and concise summaries. Although reviews have been conducted on this topic, there is a lack of comprehensive work detailing the challenges of dialogue summarization, unifying the differing understanding of the task, and aligning proposed techniques, datasets, and evaluation metrics with the challenges. This…

    Submitted 23 April, 2025; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: Published in the Journal of Artificial Intelligence Research (JAIR) (https://www.jair.org/index.php/jair/article/view/16674)

    Journal ref: Journal of Artificial Intelligence Research (JAIR), Vol. 82, 2025

  17. arXiv:2405.15604

    cs.CL

    Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges

    Authors: Jonas Becker, Jan Philip Wahle, Bela Gipp, Terry Ruas

    Abstract: Text generation has become more accessible than ever, and the increasing interest in these systems, especially those using large language models, has spurred an increasing number of related publications. We provide a systematic literature review comprising 244 selected papers between 2017 and 2024. This review categorizes works in text generation into five main tasks: open-ended text generation, s…

    Submitted 29 August, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 35 pages, 2 figures, 2 tables, Under review

    ACM Class: A.1; I.2.7

  18. What's under the hood: Investigating Automatic Metrics on Meeting Summarization

    Authors: Frederic Kirstein, Jan Philip Wahle, Terry Ruas, Bela Gipp

    Abstract: Meeting summarization has become a critical task considering the increase in online interactions. While new techniques are introduced regularly, their evaluation uses metrics not designed to capture meeting-specific errors, undermining effective evaluation. This paper investigates what the frequently used automatic metrics capture and which errors they mask by correlating automatic metric scores w…

    Submitted 18 October, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Journal ref: EMNLP 2024

  19. arXiv:2404.00344

    cs.CL cs.AI cs.IR

    Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange

    Authors: Ankit Satpute, Noah Giessing, Andre Greiner-Petter, Moritz Schubotz, Olaf Teschke, Akiko Aizawa, Bela Gipp

    Abstract: Large Language Models (LLMs) have demonstrated exceptional capabilities in various natural language tasks, often achieving performances that surpass those of humans. Despite these advancements, the domain of mathematics presents a distinctive challenge, primarily due to its specialized structure and the precision it demands. In this study, we adopted a two-step approach for investigating the profi…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted for publication at the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), July 14--18, 2024, Washington D.C., USA

  20. arXiv:2403.07910

    cs.CY cs.CL

    MAGPIE: Multi-Task Media-Bias Analysis Generalization for Pre-Trained Identification of Expressions

    Authors: Tomáš Horych, Martin Wessel, Jan Philip Wahle, Terry Ruas, Jerome Waßmuth, André Greiner-Petter, Akiko Aizawa, Bela Gipp, Timo Spinde

    Abstract: Media bias detection poses a complex, multifaceted problem traditionally tackled using single-task models and small in-domain datasets, consequently lacking generalizability. To address this, we introduce MAGPIE, the first large-scale multi-task pre-training approach explicitly tailored for media bias detection. To enable pre-training at scale, we present Large Bias Mixture (LBM), a compilation of…

    Submitted 15 March, 2024; v1 submitted 26 February, 2024; originally announced March 2024.

  21. arXiv:2402.12046

    cs.DL cs.CL

    Citation Amnesia: On The Recency Bias of NLP and Other Academic Fields

    Authors: Jan Philip Wahle, Terry Ruas, Mohamed Abdalla, Bela Gipp, Saif M. Mohammad

    Abstract: This study examines the tendency to cite older work across 20 fields of study over 43 years (1980--2023). We put NLP's propensity to cite older work in the context of these 20 other fields to analyze whether NLP shows similar temporal citation patterns to these other fields over time or whether differences can be observed. Our analysis, based on a dataset of approximately 240 million papers, revea…

    Submitted 13 December, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Journal ref: COLING 2025

  22. arXiv:2402.02996

    cs.LG cs.CV

    Text-Guided Image Clustering

    Authors: Andreas Stephan, Lukas Miklautz, Kevin Sidak, Jan Philip Wahle, Bela Gipp, Claudia Plant, Benjamin Roth

    Abstract: Image clustering divides a collection of images into meaningful groups, typically interpreted post-hoc via human-given annotations. Those are usually in the form of text, begging the question of using text as an abstraction for image clustering. Current image clustering methods, however, neglect the use of generated textual descriptions. We, therefore, propose Text-Guided Image Clustering, i.e., g…

    Submitted 19 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted to EACL 2024

  23. Taxonomy of Mathematical Plagiarism

    Authors: Ankit Satpute, Andre Greiner-Petter, Noah Gießing, Isabel Beckenbach, Moritz Schubotz, Olaf Teschke, Akiko Aizawa, Bela Gipp

    Abstract: Plagiarism is a pressing concern, even more so with the availability of large language models. Existing plagiarism detection systems reliably find copied and moderately reworded text but fail for idea plagiarism, especially in mathematical science, which heavily uses formal mathematical notation. We make two contributions. First, we establish a taxonomy of mathematical content reuse by annotating…

    Submitted 31 May, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 46th European Conference on Information Retrieval (ECIR)

  24. arXiv:2312.16148

    cs.CL

    The Media Bias Taxonomy: A Systematic Literature Review on the Forms and Automated Detection of Media Bias

    Authors: Timo Spinde, Smi Hinterreiter, Fabian Haak, Terry Ruas, Helge Giese, Norman Meuschke, Bela Gipp

    Abstract: The way the media presents events can significantly affect public perception, which in turn can alter people's beliefs and views. Media bias describes a one-sided or polarizing perspective on a topic. This article summarizes the research on computational methods to detect media bias by systematically reviewing 3140 research papers published between 2019 and 2022. To structure our review and suppor…

    Submitted 10 January, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

  25. We are Who We Cite: Bridges of Influence Between Natural Language Processing and Other Academic Fields

    Authors: Jan Philip Wahle, Terry Ruas, Mohamed Abdalla, Bela Gipp, Saif M. Mohammad

    Abstract: Natural Language Processing (NLP) is poised to substantially influence the world. However, significant progress comes hand-in-hand with substantial risks. Addressing them requires broad engagement with various fields of study. Yet, little empirical work examines the state of such engagement (past or current). In this paper, we quantify the degree of influence between 23 fields of study and NLP (on…

    Submitted 16 July, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Published at EMNLP 2023

    Journal ref: EMNLP 2023

  26. Paraphrase Types for Generation and Detection

    Authors: Jan Philip Wahle, Bela Gipp, Terry Ruas

    Abstract: Current approaches in paraphrase generation and detection heavily rely on a single general similarity score, ignoring the intricate linguistic properties of language. This paper introduces two new tasks to address this shortcoming by considering paraphrase types - specific linguistic perturbations at particular text positions. We name these tasks Paraphrase Type Generation and Paraphrase Type Dete…

    Submitted 16 July, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Published at EMNLP 2023

    Journal ref: EMNLP 2023

  27. arXiv:2306.16143

    cs.CL cs.AI cs.HC

    Generative User-Experience Research for Developing Domain-specific Natural Language Processing Applications

    Authors: Anastasia Zhukova, Lukas von Sperl, Christian E. Matt, Bela Gipp

    Abstract: User experience (UX) is a part of human-computer interaction (HCI) research and focuses on increasing intuitiveness, transparency, simplicity, and trust for the system users. Most UX research for machine learning (ML) or natural language processing (NLP) focuses on a data-driven methodology. It engages domain users mainly for usability evaluation. Moreover, more typical UX methods tailor the syste…

    Submitted 5 August, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

  28. arXiv:2305.16433

    cs.CL cs.SC stat.AP

    Neural Machine Translation for Mathematical Formulae

    Authors: Felix Petersen, Moritz Schubotz, Andre Greiner-Petter, Bela Gipp

    Abstract: We tackle the problem of neural machine translation of mathematical formulae between ambiguous presentation languages and unambiguous content languages. Compared to neural machine translation on natural language, mathematical formulae have a much smaller vocabulary and much longer sequences of symbols, while their translation requires extreme precision to satisfy mathematical information needs. In…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: Published at ACL 2023

  29. arXiv:2305.13193

    cs.IR

    TEIMMA: The First Content Reuse Annotator for Text, Images, and Math

    Authors: Ankit Satpute, André Greiner-Petter, Moritz Schubotz, Norman Meuschke, Akiko Aizawa, Olaf Teschke, Bela Gipp

    Abstract: This demo paper presents the first tool to annotate the reuse of text, images, and mathematical formulae in a document pair -- TEIMMA. Annotating content reuse is particularly useful to develop plagiarism detection algorithms. Real-world content reuse is often obfuscated, which makes it challenging to identify such cases. TEIMMA allows entering the obfuscation type to enable novel classifications…

    Submitted 13 June, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

  30. Methods and Tools to Advance the Retrieval of Mathematical Knowledge from Digital Libraries for Search-, Recommendation-, and Assistance-Systems

    Authors: Bela Gipp, André Greiner-Petter, Moritz Schubotz, Norman Meuschke

    Abstract: This project investigated new approaches and technologies to enhance the accessibility of mathematical content and its semantic information for a broad range of information retrieval applications. To achieve this goal, the project addressed three main research challenges: (1) syntactic analysis of mathematical expressions, (2) semantic enrichment of mathematical expressions, and (3) evaluation usi…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: The final report for the DFG-Project MathIR - July 1st, 2018 - December 31st, 2022

    Report number: GI 1259-1

    ACM Class: H.3.0

  31. Introducing Peer Copy -- A Fully Decentralized Peer-to-Peer File Transfer Tool

    Authors: Dennis Trautwein, Moritz Schubotz, Bela Gipp

    Abstract: It allows any two parties that are either both on the same network or connected via the internet to transfer the contents of a file based on a particular sequence of words. Peer discovery happens via multicast DNS if both peers are on the same network or via entries in the distributed hash table (DHT) of the InterPlanetary File-System (IPFS) if both peers are connected across network boundaries. A…

    Submitted 3 May, 2023; originally announced May 2023.

    Journal ref: 2021 IFIP Networking Conference

  32. Introducing MBIB -- the first Media Bias Identification Benchmark Task and Dataset Collection

    Authors: Martin Wessel, Tomáš Horych, Terry Ruas, Akiko Aizawa, Bela Gipp, Timo Spinde

    Abstract: Although media bias detection is a complex multi-task problem, there is, to date, no unified benchmark grouping these evaluation tasks. We introduce the Media Bias Identification Benchmark (MBIB), a comprehensive benchmark that groups different types of media bias (e.g., linguistic, cognitive, political) under a common framework to test how prospective detection techniques generalize. After review…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: To be published in Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23)

  33. arXiv:2303.13989

    cs.CL cs.AI

    Paraphrase Detection: Human vs. Machine Content

    Authors: Jonas Becker, Jan Philip Wahle, Terry Ruas, Bela Gipp

    Abstract: The growing prominence of large language models, such as GPT-4 and ChatGPT, has led to increased concerns over academic integrity due to the potential for machine-generated content and paraphrasing. Although studies have explored the detection of human- and machine-paraphrased content, the comparison between these types of content remains underexplored. In this paper, we conduct a comprehensive an…

    Submitted 24 March, 2023; originally announced March 2023.

  34. A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents

    Authors: Norman Meuschke, Apurva Jagdale, Timo Spinde, Jelena Mitrović, Bela Gipp

    Abstract: Extracting information from academic PDF documents is crucial for numerous indexing, retrieval, and analysis use cases. Choosing the best tool to extract specific content elements is difficult because many, technically diverse tools are available, but recent performance benchmarks are rare. Moreover, such benchmarks typically cover only a few content elements like header metadata or bibliographic…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: iConference 2023

  35. arXiv:2303.03886

    cs.CY

    AI Usage Cards: Responsibly Reporting AI-generated Content

    Authors: Jan Philip Wahle, Terry Ruas, Saif M. Mohammad, Norman Meuschke, Bela Gipp

    Abstract: Given AI systems like ChatGPT can generate content that is indistinguishable from human-made work, the responsible use of this technology is a growing concern. Although understanding the benefits and harms of using AI systems requires more time, their rapid and indiscriminate adoption in practice is a reality. Currently, we lack a common framework and language to define and report the responsible…

    Submitted 9 May, 2023; v1 submitted 16 February, 2023; originally announced March 2023.

  36. arXiv:2303.01994

    cs.IR cs.LG

    Discovery and Recognition of Formula Concepts using Machine Learning

    Authors: Philipp Scharpf, Moritz Schubotz, Howard S. Cohl, Corinna Breitinger, Bela Gipp

    Abstract: Citation-based Information Retrieval (IR) methods for scientific documents have proven effective for IR applications, such as Plagiarism Detection or Literature Recommender Systems in academic disciplines that use many references. In science, technology, engineering, and mathematics, researchers often employ mathematical concepts through formula notation to refer to prior knowledge. Our long-term…

    Submitted 19 March, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted by Scientometrics (Springer) journal

    MSC Class: 68P20 (Primary); 68T50 (Secondary)

    ACM Class: H.3.3; I.2.7

  37. Collaborative and AI-aided Exam Question Generation using Wikidata in Education

    Authors: Philipp Scharpf, Moritz Schubotz, Andreas Spitz, Andre Greiner-Petter, Bela Gipp

    Abstract: Since the COVID-19 outbreak, the use of digital learning or education platforms has significantly increased. Teachers now digitally distribute homework and provide exercise questions. In both cases, teachers need to continuously develop novel and individual questions. This process can be very time-consuming and should be facilitated and accelerated both through exchange with other teachers and by…

    Submitted 15 November, 2022; originally announced November 2022.

    MSC Class: 68Uxx

    ACM Class: H.4

  38. arXiv:2211.06664

    cs.IR

    Mining Mathematical Documents for Question Answering via Unsupervised Formula Labeling

    Authors: Philipp Scharpf, Moritz Schubotz, Bela Gipp

    Abstract: The increasing number of questions on Question Answering (QA) platforms like Math Stack Exchange (MSE) signifies a growing information need to answer math-related questions. However, there is currently very little research on approaches for an open data QA system that retrieves mathematical formulae using their concept names or querying formula identifier relationships from knowledge graphs. In th…

    Submitted 12 November, 2022; originally announced November 2022.

    MSC Class: 68Uxx

    ACM Class: H.4

  39. Caching and Reproducibility: Making Data Science experiments faster and FAIRer

    Authors: Moritz Schubotz, Ankit Satpute, Andre Greiner-Petter, Akiko Aizawa, Bela Gipp

    Abstract: Small to medium-scale data science experiments often rely on research software developed ad-hoc by individual scientists or small teams. Often there is no time to make the research software fast, reusable, and open access. The consequence is twofold. First, subsequent researchers must spend significant work hours building upon the proposed hypotheses or experimental framework. In the worst case, o…

    Submitted 9 November, 2022; v1 submitted 8 November, 2022; originally announced November 2022.

    Comments: 8 pages, 1 table

    Journal ref: Frontiers in Research Metrics and Analytics, volume 7, 2022

  40. Exploiting Transformer-based Multitask Learning for the Detection of Media Bias in News Articles

    Authors: Timo Spinde, Jan-David Krieger, Terry Ruas, Jelena Mitrović, Franz Götz-Hahn, Akiko Aizawa, Bela Gipp

    Abstract: Media has a substantial impact on the public perception of events. A one-sided or polarizing perspective on any topic is usually described as media bias. One of the ways how bias in news articles can be introduced is by altering word choice. Biased word choices are not always obvious, nor do they exhibit high context-dependency. Hence, detecting bias is often difficult. We propose a Transformer-ba…

    Submitted 7 November, 2022; originally announced November 2022.

    Journal ref: Proceedings of the iConference 2022

  41. Analyzing Multi-Task Learning for Abstractive Text Summarization

    Authors: Frederic Kirstein, Jan Philip Wahle, Terry Ruas, Bela Gipp

    Abstract: Despite the recent success of multi-task learning and pre-finetuning for natural language understanding, few works have studied the effects of task families on abstractive text summarization. Task families are a form of task grouping during the pre-finetuning stage to learn common skills, such as reading comprehension. To close this gap, we analyze the influence of multi-task learning strategies u…

    Submitted 10 November, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

    Journal ref: EMNLP-GEM 2022

  42. arXiv:2210.06878  [pdf, other]

    cs.CL cs.DL

    CS-Insights: A System for Analyzing Computer Science Research

    Authors: Terry Ruas, Jan Philip Wahle, Lennart Küll, Saif M. Mohammad, Bela Gipp

    Abstract: This paper presents CS-Insights, an interactive web application to analyze computer science publications from DBLP through multiple perspectives. The dedicated interfaces allow its users to identify trends in research activity, productivity, accessibility, author's productivity, venues' statistics, topics of interest, and the impact of computer science research on other fields. CS-Insights is publi…

    Submitted 29 January, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

  43. How Large Language Models are Transforming Machine-Paraphrased Plagiarism

    Authors: Jan Philip Wahle, Terry Ruas, Frederic Kirstein, Bela Gipp

    Abstract: The recent success of large language models for text generation poses a severe threat to academic integrity, as plagiarists can generate realistic paraphrases indistinguishable from original work. However, the role of large autoregressive transformers in generating machine-paraphrased plagiarism and their detection is still developing in the literature. This work explores T5 and GPT-3 for machine-…

    Submitted 10 November, 2022; v1 submitted 7 October, 2022; originally announced October 2022.

    Journal ref: EMNLP 2022

  44. Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts

    Authors: Timo Spinde, Manuel Plank, Jan-David Krieger, Terry Ruas, Bela Gipp, Akiko Aizawa

    Abstract: Media coverage has a substantial effect on the public perception of events. Nevertheless, media outlets are often biased. One way to bias news articles is by altering the word choice. The automatic identification of bias by word choice is challenging, primarily due to the lack of a gold standard data set and high context dependencies. This paper presents BABE, a robust and diverse data set created…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: substantial text overlap with Ph.D. proposal by same author, part of dissertation arXiv:2112.13352

    Journal ref: Findings of the Association for Computational Linguistics: EMNLP 2021

  45. Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web

    Authors: Dennis Trautwein, Aravindh Raman, Gareth Tyson, Ignacio Castro, Will Scott, Moritz Schubotz, Bela Gipp, Yiannis Psaras

    Abstract: Recent years have witnessed growing consolidation of web operations. For example, the majority of web traffic now originates from a few organizations, and even micro-websites often choose to host on large pre-existing cloud infrastructures. In response to this, the "Decentralized Web" attempts to distribute ownership and operation of web services more evenly. This paper describes the design and im…

    Submitted 11 August, 2022; originally announced August 2022.

    Comments: 14 pages, 11 figures

    ACM Class: C.2.2; C.2.1

    Journal ref: SIGCOMM '22, August 22-26, 2022, Amsterdam, Netherlands

  46. A Domain-adaptive Pre-training Approach for Language Bias Detection in News

    Authors: Jan-David Krieger, Timo Spinde, Terry Ruas, Juhi Kulshrestha, Bela Gipp

    Abstract: Media bias is a multi-faceted construct influencing individual behavior and collective decision-making. Slanted news reporting is the result of one-sided and polarized writing which can occur in various forms. In this work, we focus on an important form of media bias, i.e. bias by word choice. Detecting biased word choices is a challenging task due to its linguistic complexity and the lack of repr…

    Submitted 22 May, 2022; originally announced May 2022.

    Journal ref: Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries 2022 (JCDL)

  47. arXiv:2204.13384  [pdf, other]

    cs.DL cs.CL

    D3: A Massive Dataset of Scholarly Metadata for Analyzing the State of Computer Science Research

    Authors: Jan Philip Wahle, Terry Ruas, Saif M. Mohammad, Bela Gipp

    Abstract: DBLP is the largest open-access repository of scientific articles on computer science and provides metadata associated with publications, authors, and venues. We retrieved more than 6 million publications from DBLP and extracted pertinent metadata (e.g., abstracts, author affiliations, citations) from the publication texts to create the DBLP Discovery Dataset (D3). D3 can be used to identify trend…

    Submitted 10 November, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

    Journal ref: LREC 2022

  48. arXiv:2203.14541  [pdf, other]

    cs.IR cs.CL

    Specialized Document Embeddings for Aspect-based Similarity of Research Papers

    Authors: Malte Ostendorff, Till Blume, Terry Ruas, Bela Gipp, Georg Rehm

    Abstract: Document embeddings and similarity measures underpin content-based recommender systems, whereby a document is commonly represented as a single generic embedding. However, similarity computed on single vector representations provides only one perspective on document similarity that ignores which aspects make two documents alike. To address this limitation, aspect-based similarity measures have been…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Accepted for publication at JCDL 2022

  49. arXiv:2202.06671  [pdf, other]

    cs.CL

    Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings

    Authors: Malte Ostendorff, Nils Rethmeier, Isabelle Augenstein, Bela Gipp, Georg Rehm

    Abstract: Learning scientific document representations can be substantially improved through contrastive learning objectives, where the challenge lies in creating positive and negative training samples that encode the desired similarity semantics. Prior work relies on discrete citation relations to generate contrast samples. However, discrete citations enforce a hard cut-off to similarity. This is counter-i…

    Submitted 19 October, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: Accepted to EMNLP 2022

  50. Comparative Verification of the Digital Library of Mathematical Functions and Computer Algebra Systems

    Authors: André Greiner-Petter, Howard S. Cohl, Abdou Youssef, Moritz Schubotz, Avi Trost, Rajen Dey, Akiko Aizawa, Bela Gipp

    Abstract: Digital mathematical libraries assemble the knowledge of years of mathematical research. Numerous disciplines (e.g., physics, engineering, pure and applied mathematics) rely heavily on compendia gathered findings. Likewise, modern research applications rely more and more on computational solutions, which are often calculated and verified by computer algebra systems. Hence, the correctness, accurac…

    Submitted 31 March, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

    Journal ref: In: TACAS, Apr. 2022, pp. 87-105