Showing 1–2 of 2 results for author: Wrzalik, M

Search v0.5.6 released 2020-02-24

arXiv:2412.06432 [pdf, ps, other]

cs.LG cs.CL

Integrating Expert Labels into LLM-based Emission Goal Detection: Example Selection vs Automatic Prompt Design

Authors: Marco Wrzalik, Adrian Ulges, Anne Uersfeld, Florian Faust

Abstract: We address the detection of emission reduction goals in corporate reports, an important task for monitoring companies' progress in addressing climate change. Specifically, we focus on the issue of integrating expert feedback in the form of labeled example passages into LLM-based pipelines, and compare the two strategies of (1) a dynamic selection of few-shot examples and (2) the automatic optimization of the prompt by the LLM itself. Our findings on a public dataset of 769 climate-related passages from real-world business reports indicate that automatic prompt optimization is the superior approach, while combining both methods provides only limited benefit. Qualitative results indicate that optimized prompts do indeed capture many intricacies of the targeted emission goal extraction task.

Submitted 9 December, 2024; originally announced December 2024.

ACM Class: I.2.7

arXiv:2010.10252 [pdf, other]

cs.IR cs.CL cs.LG

CoRT: Complementary Rankings from Transformers

Authors: Marco Wrzalik, Dirk Krechel

Abstract: Many recent approaches towards neural information retrieval mitigate their computational costs by using a multi-stage ranking pipeline. In the first stage, a number of potentially relevant candidates are retrieved using an efficient retrieval model such as BM25. Although BM25 has proven decent performance as a first-stage ranker, it tends to miss relevant passages. In this context we propose CoRT, a simple neural first-stage ranking model that leverages contextual representations from pretrained language models such as BERT to complement term-based ranking functions while causing no significant delay at query time. Using the MS MARCO dataset, we show that CoRT significantly increases the candidate recall by complementing BM25 with missing candidates. Consequently, we find subsequent re-rankers achieve superior results with less candidates. We further demonstrate that passage retrieval using CoRT can be realized with surprisingly low latencies.

Submitted 25 May, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

Comments: NAACL-HLT 2021, Long Paper

MSC Class: 68P20

ACM Class: H.3.3; I.2.7

Journal ref: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 4194-4204). Anthology ID: 2021.naacl-main.331
