Showing 1–18 of 18 results for author: Bonifacio, L

  1. Systematic Evaluation of Neural Retrieval Models on the Touché 2020 Argument Retrieval Subset of BEIR

    Authors: Nandan Thakur, Luiz Bonifacio, Maik Fröbe, Alexander Bondarenko, Ehsan Kamalloo, Martin Potthast, Matthias Hagen, Jimmy Lin

    Abstract: The zero-shot effectiveness of neural retrieval models is often evaluated on the BEIR benchmark -- a combination of different IR evaluation datasets. Interestingly, previous studies found that particularly on the BEIR subset Touché 2020, an argument retrieval task, neural retrieval models are considerably less effective than BM25. Still, so far, no further investigation has been conducted on what…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: SIGIR 2024 (Resource & Reproducibility Track)

  2. arXiv:2312.11361

    cs.CL cs.IR

    "Knowing When You Don't Know": A Multilingual Relevance Assessment Dataset for Robust Retrieval-Augmented Generation

    Authors: Nandan Thakur, Luiz Bonifacio, Xinyu Zhang, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Boxing Chen, Mehdi Rezagholizadeh, Jimmy Lin

    Abstract: Retrieval-Augmented Generation (RAG) grounds Large Language Model (LLM) output by leveraging external knowledge sources to reduce factual hallucinations. However, prior work lacks a comprehensive evaluation of different language families, making it challenging to evaluate LLM robustness against errors in external retrieved knowledge. To overcome this, we establish NoMIRACL, a human-annotated datas…

    Submitted 10 November, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: EMNLP 2024 (Findings)

  3. arXiv:2308.07177

    cs.SE cs.FL

    Conformance Checking for Pushdown Reactive Systems based on Visibly Pushdown Languages

    Authors: Adilson Luiz Bonifacio

    Abstract: Testing pushdown reactive systems is deemed important to guarantee a precise and robust software development process. Usually, such systems can be specified by the formalism of Input/Output Visibly Pushdown Labeled Transition System (IOVPTS), where the interaction with the environment is regulated by a pushdown memory. Hence a conformance checking can be applied in a testing process to verify whet…

    Submitted 14 August, 2023; originally announced August 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2107.11421

  4. arXiv:2307.04601

    cs.IR

    InPars Toolkit: A Unified and Reproducible Synthetic Data Generation Pipeline for Neural Information Retrieval

    Authors: Hugo Abonizio, Luiz Bonifacio, Vitor Jeronymo, Roberto Lotufo, Jakub Zavrel, Rodrigo Nogueira

    Abstract: Recent work has explored Large Language Models (LLMs) to overcome the lack of training data for Information Retrieval (IR) tasks. The generalization abilities of these models have enabled the creation of synthetic in-domain data by providing instructions and a few examples on a prompt. InPars and Promptagator have pioneered this approach and both methods have demonstrated the potential of using LL…

    Submitted 10 July, 2023; originally announced July 2023.

  5. arXiv:2301.01820

    cs.IR cs.AI

    InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval

    Authors: Vitor Jeronymo, Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Jakub Zavrel, Rodrigo Nogueira

    Abstract: Recently, InPars introduced a method to efficiently use large language models (LLMs) in information retrieval tasks: via few-shot examples, an LLM is induced to generate relevant queries for documents. These synthetic query-document pairs can then be used to train a retriever. However, InPars and, more recently, Promptagator, rely on proprietary LLMs such as GPT-3 and FLAN to generate such dataset…

    Submitted 26 May, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

  6. arXiv:2212.06121

    cs.IR cs.CL

    In Defense of Cross-Encoders for Zero-Shot Retrieval

    Authors: Guilherme Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Rodrigo Nogueira

    Abstract: Bi-encoders and cross-encoders are widely used in many state-of-the-art retrieval pipelines. In this work we study the generalization ability of these two types of architectures on a wide range of parameter count on both in-domain and out-of-domain scenarios. We find that the number of parameters and early query-document interactions of cross-encoders play a significant role in the generalization…

    Submitted 12 December, 2022; originally announced December 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2206.02873

  7. arXiv:2210.14837

    cs.IR cs.LG

    NeuralSearchX: Serving a Multi-billion-parameter Reranker for Multilingual Metasearch at a Low Cost

    Authors: Thales Sales Almeida, Thiago Laitz, João Seródio, Luiz Henrique Bonifacio, Roberto Lotufo, Rodrigo Nogueira

    Abstract: The widespread availability of search API's (both free and commercial) brings the promise of increased coverage and quality of search results for metasearch engines, while decreasing the maintenance costs of the crawling and indexing infrastructures. However, merging strategies frequently comprise complex pipelines that require careful tuning, which is often overlooked in the literature. In this w…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: Published as a full paper at the DESIRES 2022 conference; 13 pages

    Journal ref: DESIRES 2022: 3rd International Conference on Design of Experimental Search and Information REtrieval Systems, 30-31 August 2022, San Jose, CA, USA

  8. arXiv:2206.02873

    cs.IR cs.CL cs.PF

    No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval

    Authors: Guilherme Moraes Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Rodrigo Nogueira

    Abstract: Recent work has shown that small distilled language models are strong competitors to models that are orders of magnitude larger and slower in a wide range of information retrieval tasks. This has made distilled and dense models, due to latency constraints, the go-to choice for deployment in real-world retrieval applications. In this work, we question this practice by showing that the number of par…

    Submitted 12 December, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

  9. arXiv:2205.15172

    cs.CL

    Billions of Parameters Are Worth More Than In-domain Training Data: A case study in the Legal Case Entailment Task

    Authors: Guilherme Moraes Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Roberto Lotufo, Rodrigo Nogueira

    Abstract: Recent work has shown that language models scaled to billions of parameters, such as GPT-3, perform remarkably well in zero-shot and few-shot scenarios. In this work, we experiment with zero-shot models in the legal case entailment task of the COLIEE 2022 competition. Our experiments show that scaling the number of parameters in a language model improves the F1 score of our previous zero-shot resu…

    Submitted 30 May, 2022; originally announced May 2022.

  10. arXiv:2202.05144

    cs.CL

    InPars: Data Augmentation for Information Retrieval using Large Language Models

    Authors: Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira

    Abstract: The information retrieval community has recently witnessed a revolution due to large pretrained transformer models. Another key ingredient for this revolution was the MS MARCO dataset, whose scale and diversity has enabled zero-shot transfer learning to various tasks. However, not all IR tasks and domains can benefit from one single dataset equally. Extensive research in various NLP tasks has show…

    Submitted 10 February, 2022; originally announced February 2022.

  11. arXiv:2108.13897

    cs.CL cs.AI

    mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset

    Authors: Luiz Bonifacio, Vitor Jeronymo, Hugo Queiroz Abonizio, Israel Campiotti, Marzieh Fadaee, Roberto Lotufo, Rodrigo Nogueira

    Abstract: The MS MARCO ranking dataset has been widely used for training deep learning models for IR tasks, achieving considerable effectiveness on diverse zero-shot scenarios. However, this type of resource is scarce in languages other than English. In this work, we present mMARCO, a multilingual version of the MS MARCO passage ranking dataset comprising 13 languages that was created using machine translat…

    Submitted 17 August, 2022; v1 submitted 31 August, 2021; originally announced August 2021.

  12. arXiv:2107.11421

    cs.SE cs.FL

    Testing Pushdown Systems

    Authors: Adilson Luiz Bonifacio, Arnaldo Vieira Moura

    Abstract: Testing on reactive systems is a well-known laborious activity on software development due to their asynchronous interaction with the environment. In this setting model based testing has been employed when checking conformance and generating test suites of such systems using labeled transition system as a formalism as well as the classical ioco conformance relation. In this work we turn to a more…

    Submitted 23 July, 2021; originally announced July 2021.

  13. arXiv:2105.06813

    cs.CL cs.IR cs.LG

    A cost-benefit analysis of cross-lingual transfer methods

    Authors: Guilherme Moraes Rosa, Luiz Henrique Bonifacio, Leandro Rodrigues de Souza, Roberto Lotufo, Rodrigo Nogueira

    Abstract: An effective method for cross-lingual transfer is to fine-tune a bilingual or multilingual model on a supervised dataset in one language and evaluating it on another language in a zero-shot manner. Translating examples at training time or inference time are also viable alternatives. However, there are costs associated with these methods that are rarely addressed in the literature. In this work, we…

    Submitted 14 December, 2021; v1 submitted 14 May, 2021; originally announced May 2021.

  14. arXiv:2011.00389

    cs.SE

    A Model-Based Testing Tool for Asynchronous Reactive Systems

    Authors: Adilson Luiz Bonifacio, Camila Sonoda Gomes

    Abstract: Reactive systems are characterized by the interaction with the environment, where the exchange of the input and output stimuli, usually, occurs asynchronously. Systems of this nature, in general, require a rigorous testing activity over their developing process. Therefore model-based testing has been successfully applied over asynchronous reactive systems using Input Output Labeled Transition Syst…

    Submitted 31 October, 2020; originally announced November 2020.

    Comments: 19 pages, 14 figures, 2 tables

  15. arXiv:1905.08914

    cs.SE

    Automatically Checking Conformance on Asynchronous Reactive Systems

    Authors: Camila Sonada Gomes, Adilson Luiz Bonifacio

    Abstract: Software testing is an important issue in software development process to ensure higher quality on the products. Formal methods has been promising on testing reactive systems, specially critical systems, where accuracy is mandatory since any fault can cause severe damage. Systems of this nature are characterized by receiving messages from the environment and producing outputs in response. One of t…

    Submitted 10 August, 2019; v1 submitted 21 May, 2019; originally announced May 2019.

    Comments: 18 pages, 9 figures, 5 algorithms

    MSC Class: 68N30

  16. arXiv:1902.10278

    cs.SE cs.LO

    A conformance relation and complete test suites for I/O systems

    Authors: Adilson Luiz Bonifacio, Arnaldo Vieira Moura

    Abstract: Model based testing is a well-established approach to verify implementations modeled by I/O labeled transition systems (IOLTSs). One of the challenges stemming from model based testing is the conformance checking and the generation of test suites, specially when completeness is a required property. In order to check whether an implementation under test is in compliance with its respective specific…

    Submitted 10 February, 2020; v1 submitted 7 February, 2019; originally announced February 2019.

    Comments: 44 pages, 20 figures

  17. arXiv:1809.01103

    cs.LO cs.FL

    An automatic tool for checking multi-party contracts

    Authors: Adilson Luiz Bonifacio, Wellington Aparecido Della Mura

    Abstract: Contracts play an important role in business where relationships among different parties are dictated by legal rules. The notion of electronic contracts has emerged mostly due to technological advances and the electronic trading among companies and customers. Thereby new challenges have arisen to guarantee reliability among the stakeholders in electronic negotiations. In this scenery, the automati…

    Submitted 4 September, 2018; originally announced September 2018.

    Comments: 28 pages, 26 figures, 3 tables, 2 algorithms

  18. arXiv:1508.02767

    cs.SE cs.LO

    Intrinsic Properties of Complete Test Suites

    Authors: Adilson Luiz Bonifacio, Arnaldo Vieira Moura

    Abstract: Completeness is a desirable property of test suites. Roughly, completeness guarantees that a non-equivalent implementation under test will always be identified. Several approaches proposed sufficient, and sometimes also necessary, conditions on the specification model and on the test suite in order to guarantee completeness. Usually, these approaches impose several restrictions on the specificatio…

    Submitted 11 August, 2015; originally announced August 2015.