Showing 1–6 of 6 results for author: Bayat, F F

  1. arXiv:2410.22257  [pdf, other]

    cs.CL

    FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation

    Authors: Farima Fatahi Bayat, Lechen Zhang, Sheza Munir, Lu Wang

    Abstract: The rapid adoption of language models (LMs) across diverse applications has raised concerns about their factuality, i.e., their consistency with real-world facts. We first present VERIFY (Verification and Evidence RetrIeval for FactualitY evaluation), a pipeline to evaluate LMs' factuality in real-world user interactions. VERIFY considers the verifiability of LM-generated content and categorizes c…

    Submitted 7 January, 2025; v1 submitted 29 October, 2024; originally announced October 2024.

    Comments: 24 pages, 9 figures

  2. arXiv:2408.04637  [pdf, other]

    cs.CL

    APE: Active Learning-based Tooling for Finding Informative Few-shot Examples for LLM-based Entity Matching

    Authors: Kun Qian, Yisi Sang, Farima Fatahi Bayat, Anton Belyi, Xianqi Chu, Yash Govind, Samira Khorshidi, Rahul Khot, Katherine Luna, Azadeh Nikfarjam, Xiaoguang Qi, Fei Wu, Xianhan Zhang, Yunyao Li

    Abstract: Prompt engineering is an iterative procedure often requiring extensive manual effort to formulate suitable instructions for effectively directing large language models (LLMs) in specific tasks. Incorporating few-shot examples is a vital and effective approach to providing LLMs with precise instructions, leading to improved LLM performance. Nonetheless, identifying the most informative demonstratio…

    Submitted 29 July, 2024; originally announced August 2024.

    Comments: 3 pages, Proceedings of the Fifth Workshop on Data Science with Human-in-the-Loop (DaSH 2024)

  3. arXiv:2406.13230  [pdf, other]

    cs.CL

    Enhancing Language Model Factuality via Activation-Based Confidence Calibration and Guided Decoding

    Authors: Xin Liu, Farima Fatahi Bayat, Lu Wang

    Abstract: Calibrating language models (LMs) aligns their generation confidence with the actual likelihood of answer correctness, which can inform users about LMs' reliability and mitigate hallucinated content. However, prior calibration methods, such as self-consistency-based and logit-based approaches, are either limited in inference-time efficiency or fall short of providing informative signals. Moreover,…

    Submitted 12 November, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: EMNLP 2024 Camera Ready

  4. arXiv:2405.00301  [pdf, other]

    cs.CL

    Enhanced Language Model Truthfulness with Learnable Intervention and Uncertainty Expression

    Authors: Farima Fatahi Bayat, Xin Liu, H. V. Jagadish, Lu Wang

    Abstract: Large language models (LLMs) can generate long-form and coherent text, yet they often hallucinate facts, which undermines their reliability. To mitigate this issue, inference-time methods steer LLM representations toward the "truthful directions" previously learned for truth elicitation. However, applying these truthful directions with the same intensity fails to generalize across different query…

    Submitted 6 June, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

    Comments: ACL 2024 Findings (Long paper)

  5. arXiv:2310.17119  [pdf, other]

    cs.CL

    FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge

    Authors: Farima Fatahi Bayat, Kun Qian, Benjamin Han, Yisi Sang, Anton Belyi, Samira Khorshidi, Fei Wu, Ihab F. Ilyas, Yunyao Li

    Abstract: Detecting factual errors in textual information, whether generated by large language models (LLM) or curated by humans, is crucial for making informed decisions. LLMs' inability to attribute their claims to external knowledge and their tendency to hallucinate makes it difficult to rely on their responses. Humans, too, are prone to factual errors in their writing. Since manual detection and correct…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 (Demonstration Track)

  6. arXiv:2205.02880  [pdf, other]

    cs.CL

    CompactIE: Compact Facts in Open Information Extraction

    Authors: Farima Fatahi Bayat, Nikita Bhutani, H. V. Jagadish

    Abstract: A major drawback of modern neural OpenIE systems and benchmarks is that they prioritize high coverage of information in extractions over compactness of their constituents. This severely limits the usefulness of OpenIE extractions in many downstream tasks. The utility of extractions can be improved if extractions are compact and share constituents. To this end, we study the problem of identifying c…

    Submitted 9 June, 2022; v1 submitted 5 May, 2022; originally announced May 2022.

    Comments: NAACL 2022 main conference (Long paper)