Showing 1–16 of 16 results for author: Modarressi, A

  1. arXiv:2506.14407  [pdf, ps, other]

    cs.CL cs.AI

    ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge

    Authors: Zeinab Sadat Taghavi, Ali Modarressi, Yunpu Ma, Hinrich Schütze

    Abstract: Retrieval systems are central to many NLP pipelines, but often rely on surface-level cues such as keyword overlap and lexical semantic similarity. To evaluate retrieval beyond these shallow signals, recent benchmarks introduce reasoning-heavy queries; however, they primarily shift the burden to query-side processing techniques -- like prompting or multi-hop retrieval -- that can help resolve compl…

    Submitted 17 June, 2025; originally announced June 2025.

  2. arXiv:2506.03434  [pdf, ps, other]

    cs.CL

    Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models

    Authors: Ahmad Dawar Hakimi, Ali Modarressi, Philipp Wicke, Hinrich Schütze

    Abstract: Understanding how large language models (LLMs) acquire and store factual knowledge is crucial for enhancing their interpretability and reliability. In this work, we analyze the evolution of factual knowledge representation in the OLMo-7B model by tracking the roles of its attention heads and feed forward networks (FFNs) over the course of pre-training. We classify these components into four roles:…

    Submitted 3 June, 2025; originally announced June 2025.

  3. arXiv:2505.21701  [pdf, ps, other]

    cs.CL

    Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing

    Authors: Raoyuan Zhao, Abdullatif Köksal, Ali Modarressi, Michael A. Hedderich, Hinrich Schütze

    Abstract: The reliability of large language models (LLMs) is greatly compromised by their tendency to hallucinate, underscoring the need for precise identification of knowledge gaps within LLMs. Various methods for probing such gaps exist, ranging from calibration-based to prompting-based methods. To evaluate these probing methods, in this paper, we propose a new process based on using input variations and…

    Submitted 30 May, 2025; v1 submitted 27 May, 2025; originally announced May 2025.

  4. arXiv:2503.05037  [pdf, ps, other]

    cs.CL cs.IR

    Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence

    Authors: Mohsen Fayyaz, Ali Modarressi, Hinrich Schuetze, Nanyun Peng

    Abstract: Dense retrieval models are commonly used in Information Retrieval (IR) applications, such as Retrieval-Augmented Generation (RAG). Since they often serve as the first step in these systems, their robustness is critical to avoid downstream failures. In this work, we repurpose a relation extraction dataset (e.g., Re-DocRED) to design controlled experiments that quantify the impact of heuristic biase…

    Submitted 2 June, 2025; v1 submitted 6 March, 2025; originally announced March 2025.

    Comments: ACL 2025 Main Conference

  5. arXiv:2502.05167  [pdf, other]

    cs.CL

    NoLiMa: Long-Context Evaluation Beyond Literal Matching

    Authors: Ali Modarressi, Hanieh Deilamsalehy, Franck Dernoncourt, Trung Bui, Ryan A. Rossi, Seunghyun Yoon, Hinrich Schütze

    Abstract: Recent large language models (LLMs) support long contexts ranging from 128K to 1M tokens. A popular method for evaluating these capabilities is the needle-in-a-haystack (NIAH) test, which involves retrieving a "needle" (relevant information) from a "haystack" (long irrelevant context). Extensions of this approach include increasing distractors, fact chaining, and in-context reasoning. However, in…

    Submitted 26 March, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

  6. arXiv:2410.05873  [pdf, ps, other]

    cs.CL cs.AI

    MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment

    Authors: Amir Hossein Kargaran, Ali Modarressi, Nafiseh Nikeghbal, Jana Diesner, François Yvon, Hinrich Schütze

    Abstract: English-centric large language models (LLMs) often show strong multilingual capabilities. However, their multilingual performance remains unclear and is under-evaluated for many other languages. Most benchmarks for multilinguality focus on classic NLP tasks or cover a minimal number of languages. We introduce MEXA, a method for assessing the multilingual capabilities of pre-trained English-centric…

    Submitted 1 June, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

    Comments: ACL Findings 2025

  7. arXiv:2407.06699  [pdf, other]

    cs.CL

    Consistent Document-Level Relation Extraction via Counterfactuals

    Authors: Ali Modarressi, Abdullatif Köksal, Hinrich Schütze

    Abstract: Many datasets have been developed to train and evaluate document-level relation extraction (RE) models. Most of these are constructed using real-world data. It has been shown that RE models trained on real-world data suffer from factual biases. To evaluate and address this issue, we present CovEReD, a counterfactual data generation approach for document-level relation extraction datasets using ent…

    Submitted 15 October, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

  8. arXiv:2404.11672  [pdf, other]

    cs.CL

    MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory

    Authors: Ali Modarressi, Abdullatif Köksal, Ayyoob Imani, Mohsen Fayyaz, Hinrich Schütze

    Abstract: While current large language models (LLMs) perform well on many knowledge-related tasks, they are limited by relying on their parameters as an implicit storage mechanism. As a result, they struggle with memorizing rare events and with updating their memory as facts change over time. In addition, the uninterpretable nature of parametric memory makes it challenging to prevent hallucination. Model ed…

    Submitted 17 April, 2025; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: Published in Transactions on Machine Learning Research (TMLR)

  9. arXiv:2306.02873  [pdf, other]

    cs.CL

    DecompX: Explaining Transformers Decisions by Propagating Token Decomposition

    Authors: Ali Modarressi, Mohsen Fayyaz, Ehsan Aghazadeh, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar

    Abstract: An emerging solution for explaining Transformer-based models is to use vector-based analysis on how the representations are formed. However, providing a faithful vector-based explanation for a multi-layer model could be challenging in three aspects: (1) Incorporating all components into the analysis, (2) Aggregating the layer dynamics to determine the information flow and mixture throughout the en…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted to ACL 2023 (main conference)

  10. arXiv:2305.14322  [pdf, other]

    cs.CL

    RET-LLM: Towards a General Read-Write Memory for Large Language Models

    Authors: Ali Modarressi, Ayyoob Imani, Mohsen Fayyaz, Hinrich Schütze

    Abstract: Large language models (LLMs) have significantly advanced the field of natural language processing (NLP) through their extensive parameters and comprehensive data utilization. However, existing LLMs lack a dedicated memory unit, limiting their ability to explicitly store and retrieve knowledge for various tasks. In this paper, we propose RET-LLM, a novel framework that equips LLMs with a general wri…

    Submitted 24 October, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: NOTE: This concept paper outlines an initial methodology, now evolved and thoroughly evaluated in the MemLLM paper

  11. arXiv:2302.02852  [pdf, other]

    cs.CL

    Guide the Learner: Controlling Product of Experts Debiasing Method Based on Token Attribution Similarities

    Authors: Ali Modarressi, Hossein Amirkhani, Mohammad Taher Pilehvar

    Abstract: Several proposals have been put forward in recent years for improving out-of-distribution (OOD) performance through mitigating dataset biases. A popular workaround is to train a robust model by re-weighting training examples based on a secondary biased model. Here, the underlying assumption is that the biased model resorts to shortcut features. Hence, those training examples that are correctly pre…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: Accepted to EACL 2023 (main conference)

  12. arXiv:2211.05610  [pdf, other]

    cs.CL

    BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning

    Authors: Mohsen Fayyaz, Ehsan Aghazadeh, Ali Modarressi, Mohammad Taher Pilehvar, Yadollah Yaghoobzadeh, Samira Ebrahimi Kahou

    Abstract: Current pre-trained language models rely on large datasets for achieving state-of-the-art performance. However, past research has shown that not all examples in a dataset are equally important during training. In fact, it is sometimes possible to prune a considerable fraction of the training set while maintaining the test performance. Established on standard vision benchmarks, two gradient-based s…

    Submitted 28 November, 2022; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: ENLSP @ NeurIPS2022

  13. arXiv:2205.03286  [pdf, other]

    cs.CL

    GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers

    Authors: Ali Modarressi, Mohsen Fayyaz, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar

    Abstract: There has been a growing interest in interpreting the underlying dynamics of Transformers. While self-attention patterns were initially deemed as the primary option, recent studies have shown that integrating other components can yield more accurate explanations. This paper introduces a novel token attribution analysis method that incorporates all the components in the encoder block and aggregates…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: Accepted to NAACL 2022 (main conference)

  14. arXiv:2203.08991  [pdf, other]

    cs.CL

    AdapLeR: Speeding up Inference by Adaptive Length Reduction

    Authors: Ali Modarressi, Hosein Mohebbi, Mohammad Taher Pilehvar

    Abstract: Pre-trained language models have shown stellar performance in various downstream tasks. But, this usually comes at the cost of high latency and computation, hindering their usage in resource-limited settings. In this work, we propose a novel approach for reducing the computational cost of BERT with minimal loss in downstream performance. Our method dynamically eliminates less contributing tokens t…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: Accepted to ACL 2022 (main conference)

  15. arXiv:2109.05958  [pdf, other]

    cs.CL cs.AI

    Not All Models Localize Linguistic Knowledge in the Same Place: A Layer-wise Probing on BERToids' Representations

    Authors: Mohsen Fayyaz, Ehsan Aghazadeh, Ali Modarressi, Hosein Mohebbi, Mohammad Taher Pilehvar

    Abstract: Most of the recent works on probing representations have focused on BERT, with the presumption that the findings might be similar to the other models. In this work, we extend the probing studies to two other models in the family, namely ELECTRA and XLNet, showing that variations in the pre-training objectives or architectural choices can result in different behaviors in encoding linguistic informa…

    Submitted 15 September, 2021; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: Accepted to BlackboxNLP Workshop at EMNLP 2021

  16. arXiv:2104.01477  [pdf, other]

    cs.CL

    Exploring the Role of BERT Token Representations to Explain Sentence Probing Results

    Authors: Hosein Mohebbi, Ali Modarressi, Mohammad Taher Pilehvar

    Abstract: Several studies have been carried out on revealing linguistic features captured by BERT. This is usually achieved by training a diagnostic classifier on the representations obtained from different layers of BERT. The subsequent classification accuracy is then interpreted as the ability of the model in encoding the corresponding linguistic property. Despite providing insights, these studies have le…

    Submitted 11 September, 2021; v1 submitted 3 April, 2021; originally announced April 2021.

    Comments: Accepted to EMNLP 2021 (main conference)