-
REANIMATOR: Reanimate Retrieval Test Collections with Extracted and Synthetic Resources
Authors:
Björn Engelmann,
Fabian Haak,
Philipp Schaer,
Mani Erfanian Abdoust,
Linus Netze,
Meik Bittkowski
Abstract:
Retrieval test collections are essential for evaluating information retrieval systems, yet they often lack generalizability across tasks. To overcome this limitation, we introduce REANIMATOR, a versatile framework designed to enable the repurposing of existing test collections by enriching them with extracted and synthetic resources. REANIMATOR enhances test collections from PDF files by parsing full texts and machine-readable tables, as well as related contextual information. It then employs state-of-the-art large language models to produce synthetic relevance labels. Including an optional human-in-the-loop step can help validate the resources that have been extracted and generated. We demonstrate its potential with a revitalized version of the TREC-COVID test collection, showcasing the development of a retrieval-augmented generation system and evaluating the impact of tables on retrieval-augmented generation. REANIMATOR enables the reuse of test collections for new applications, lowering costs and broadening the utility of legacy resources.
Submitted 10 April, 2025;
originally announced April 2025.
-
Investigating Bias in Political Search Query Suggestions by Relative Comparison with LLMs
Authors:
Fabian Haak,
Björn Engelmann,
Christin Katharina Kreutz,
Philipp Schaer
Abstract:
Search query suggestions affect users' interactions with search engines, which then influences the information they encounter. Thus, bias in search query suggestions can lead to exposure to biased search results and can impact opinion formation. This is especially critical in the political domain. Detecting and quantifying bias in web search engines is difficult due to its topic dependency, complexity, and subjectivity. The lack of context and phrasality of query suggestions emphasizes this problem. In a multi-step approach, we combine the benefits of large language models, pairwise comparison, and Elo-based scoring to identify and quantify bias in English search query suggestions. We apply our approach to the U.S. political news domain and compare bias in Google and Bing.
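The Elo-based scoring mentioned in the abstract can be illustrated with a minimal sketch. It assumes each pairwise comparison yields an outcome of 1.0 (the first suggestion was judged more biased), 0.0 (the second was), or 0.5 (a tie). The function names, the initial rating of 1000, and the K-factor of 32 are illustrative assumptions, not values taken from the paper, and the LLM judging step itself is left out.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation: the modeled chance that A 'wins' against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, outcome_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one pairwise comparison.

    outcome_a is 1.0 if A was judged more biased, 0.0 if B was, 0.5 for a tie.
    k = 32 is an assumed K-factor, not a value from the paper.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (outcome_a - exp_a)
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


def rank_by_elo(comparisons, initial: float = 1000.0, k: float = 32.0):
    """Fold a sequence of (item_a, item_b, outcome_a) judgments into ratings."""
    ratings = {}
    for item_a, item_b, outcome_a in comparisons:
        rating_a = ratings.setdefault(item_a, initial)
        rating_b = ratings.setdefault(item_b, initial)
        ratings[item_a], ratings[item_b] = elo_update(rating_a, rating_b, outcome_a, k)
    return ratings


if __name__ == "__main__":
    # Hypothetical judgments over three placeholder query suggestions.
    judgments = [("x", "y", 1.0), ("x", "z", 1.0), ("y", "z", 1.0)]
    for item, rating in sorted(rank_by_elo(judgments).items(), key=lambda kv: -kv[1]):
        print(f"{item}: {rating:.1f}")
```

Because the final ratings depend on the order in which comparisons are processed, a real pipeline would likely shuffle or repeat the comparison sequence; that choice is also an assumption here.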
Submitted 31 October, 2024;
originally announced October 2024.
-
Leveraging Data Augmentation for Process Information Extraction
Authors:
Julian Neuberger,
Leonie Doll,
Benedict Engelmann,
Lars Ackermann,
Stefan Jablonski
Abstract:
Business Process Modeling projects often require formal process models as a central component. High costs associated with the creation of such formal process models motivated many different fields of research aimed at automated generation of process models from readily available data. These include process mining on event logs, and generating business process models from natural language texts. Research in the latter field is regularly faced with the problem of limited data availability, hindering both evaluation and development of new techniques, especially learning-based ones.
To overcome this data scarcity issue, in this paper we investigate the application of data augmentation for natural language text data. Data augmentation methods are well established in machine learning for creating new, synthetic data without human assistance. We find that many of these methods are applicable to the task of business process information extraction, improving the accuracy of extraction. Our study shows that data augmentation is an important component in enabling machine learning methods for the task of business process model generation from natural language text, where rule-based systems are currently still the state of the art. Simple data augmentation techniques improved the $F_1$ score of mention extraction by 2.9 percentage points and the $F_1$ score of relation extraction by 4.5 percentage points. To better understand how data augmentation alters human-annotated texts, we analyze the resulting text, visualizing and discussing the properties of augmented textual data.
We make all code and experimental results publicly available.
Submitted 11 April, 2024;
originally announced April 2024.
-
Context-Driven Interactive Query Simulations Based on Generative Large Language Models
Authors:
Björn Engelmann,
Timo Breuer,
Jana Isabelle Friese,
Philipp Schaer,
Norbert Fuhr
Abstract:
Simulating user interactions enables a more user-oriented evaluation of information retrieval (IR) systems. While user simulations are cost-efficient and reproducible, many approaches often lack fidelity regarding real user behavior. Most notably, current user models neglect the user's context, which is the primary driver of perceived relevance and the interactions with the search results. To this end, this work introduces the simulation of context-driven query reformulations. The proposed query generation methods build upon recent Large Language Model (LLM) approaches and consider the user's context throughout the simulation of a search session. Compared to simple context-free query generation approaches, these methods show better effectiveness and allow the simulation of more efficient IR sessions. Similarly, our evaluations consider more interaction context than current session-based measures and reveal interesting complementary insights in addition to the established evaluation protocols. We conclude with directions for future work and provide an entirely open experimental setup.
Submitted 25 January, 2024; v1 submitted 15 December, 2023;
originally announced December 2023.
-
Simulating Users in Interactive Web Table Retrieval
Authors:
Björn Engelmann,
Timo Breuer,
Philipp Schaer
Abstract:
Considering the multimodal signals of search items is beneficial for retrieval effectiveness. Especially in web table retrieval (WTR) experiments, accounting for multimodal properties of tables boosts effectiveness. However, it still remains an open question how the single modalities affect user experience in particular. Previous work analyzed WTR performance in ad-hoc retrieval benchmarks, which neglects interactive search behavior and limits the conclusion about the implications for real-world user environments.
To this end, this work presents an in-depth evaluation of simulated interactive WTR search sessions as a more cost-efficient and reproducible alternative to real user studies. As a first of its kind, we introduce interactive query reformulation strategies based on Doc2Query, incorporating cognitive states of simulated user knowledge. Our evaluations include two perspectives on user effectiveness by considering different cost paradigms, namely query-wise and time-oriented measures of effort. Our multi-perspective evaluation scheme reveals new insights about query strategies, the impact of modalities, and different user types in simulated WTR search sessions.
Submitted 18 October, 2023;
originally announced October 2023.
-
Text Simplification of Scientific Texts for Non-Expert Readers
Authors:
Björn Engelmann,
Fabian Haak,
Christin Katharina Kreutz,
Narjes Nikzad Khasmakhi,
Philipp Schaer
Abstract:
Reading levels are highly individual and can depend on a text's language, a person's cognitive abilities, or knowledge on a topic. Text simplification is the task of rephrasing a text to better cater to the abilities of a specific target reader group. Simplification of scientific abstracts helps non-experts to access the core information by bypassing formulations that require domain or expert knowledge. This is especially relevant for, e.g., cancer patients reading about novel treatment options. The SimpleText lab hosts the simplification of scientific abstracts for non-experts (Task 3) to advance this field. We contribute three runs employing out-of-the-box summarization models (two based on T5, one based on PEGASUS) and one run using ChatGPT with complex phrase identification.
Submitted 7 July, 2023;
originally announced July 2023.
-
A Sound and Complete Hoare Logic for Dynamically-Typed, Object-Oriented Programs -- Extended Version --
Authors:
Björn Engelmann,
Ernst-Rüdiger Olderog
Abstract:
A simple dynamically-typed, (purely) object-oriented language is defined. A structural operational semantics as well as a Hoare-style program logic for reasoning about programs in the language under multiple notions of correctness are given. The Hoare logic is proved to be both sound and (relatively) complete and is -- to the best of our knowledge -- the first such logic presented for a dynamically-typed language.
Submitted 8 January, 2016; v1 submitted 29 September, 2015;
originally announced September 2015.
-
Closing the Gap -- Formally Verifying Dynamically Typed Programs like Statically Typed Ones Using Hoare Logic -- Extended Version --
Authors:
Björn Engelmann,
Ernst-Rüdiger Olderog,
Nils Erik Flick
Abstract:
Dynamically typed object-oriented languages enable programmers to write elegant, reusable and extensible programs. However, with the current methodology for program verification, the absence of static type information creates significant overhead. Our proposal is two-fold:
First, we propose a layer of abstraction hiding the complexity of dynamic typing when provided with sufficient type information. Since this essentially creates the illusion of verifying a statically-typed program, the effort required is equivalent to the statically-typed case.
Second, we show how the required type information can be efficiently derived for all type-safe programs by integrating a type inference algorithm into Hoare logic, yielding a semi-automatic procedure allowing the user to focus on those typing problems really requiring his attention. While applying type inference to dynamically typed programs is a well-established method by now, our approach complements conventional soft typing systems by offering formal proof as a third option besides modifying the program (static typing) and accepting the presence of runtime type errors (dynamic typing).
Submitted 12 January, 2015;
originally announced January 2015.