Showing 1–5 of 5 results for author: Yih, S W

Searching in archive cs.
  1. arXiv:2305.14739  [pdf, other]

    cs.CL

    Trusting Your Evidence: Hallucinate Less with Context-aware Decoding

    Authors: Weijia Shi, Xiaochuang Han, Mike Lewis, Yulia Tsvetkov, Luke Zettlemoyer, Scott Wen-tau Yih

    Abstract: Language models (LMs) often struggle to pay enough attention to the input context, and generate texts that are unfaithful or contain hallucinations. To mitigate this issue, we present context-aware decoding (CAD), which follows a contrastive output distribution that amplifies the difference between the output probabilities when a model is used with and without context. Our experiments show that CA…

    Submitted 24 May, 2023; originally announced May 2023.

  2. arXiv:2212.09726  [pdf, other]

    cs.CL cs.AI cs.LG

    Improving Faithfulness of Abstractive Summarization by Controlling Confounding Effect of Irrelevant Sentences

    Authors: Asish Ghoshal, Arash Einolghozati, Ankit Arun, Haoran Li, Lili Yu, Vera Gor, Yashar Mehdad, Scott Wen-tau Yih, Asli Celikyilmaz

    Abstract: Lack of factual correctness is an issue that still plagues state-of-the-art summarization systems despite their impressive progress on generating seemingly fluent summaries. In this paper, we show that factual inconsistency can be caused by irrelevant parts of the input text, which act as confounders. To that end, we leverage information-theoretic measures of causal effects to quantify the amount…

    Submitted 18 January, 2024; v1 submitted 19 December, 2022; originally announced December 2022.

  3. arXiv:2211.11501  [pdf, other]

    cs.SE cs.CL

    DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation

    Authors: Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu

    Abstract: We introduce DS-1000, a code generation benchmark with a thousand data science problems spanning seven Python libraries, such as NumPy and Pandas. Compared to prior works, DS-1000 incorporates three core features. First, our problems reflect diverse, realistic, and practical use cases since we collected them from StackOverflow. Second, our automatic evaluation is highly specific (reliable) -- acro…

    Submitted 18 November, 2022; originally announced November 2022.

  4. arXiv:1911.02972  [pdf, other]

    cs.CL cs.LG

    Blockwise Self-Attention for Long Document Understanding

    Authors: Jiezhong Qiu, Hao Ma, Omer Levy, Scott Wen-tau Yih, Sinong Wang, Jie Tang

    Abstract: We present BlockBERT, a lightweight and efficient BERT model for better modeling long-distance dependencies. Our model extends BERT by introducing sparse block structures into the attention matrix to reduce both memory consumption and training/inference time, which also enables attention heads to capture either short- or long-range contextual information. We conduct experiments on language model p…

    Submitted 1 November, 2020; v1 submitted 7 November, 2019; originally announced November 2019.

    Comments: Accepted at Findings of EMNLP'20 and SustaiNLP 2020 at EMNLP'20, 12 pages

  5. arXiv:1908.05739  [pdf, other]

    cs.CL

    Abductive Commonsense Reasoning

    Authors: Chandra Bhagavatula, Ronan Le Bras, Chaitanya Malaviya, Keisuke Sakaguchi, Ari Holtzman, Hannah Rashkin, Doug Downey, Scott Wen-tau Yih, Yejin Choi

    Abstract: Abductive reasoning is inference to the most plausible explanation. For example, if Jenny finds her house in a mess when she returns from work, and remembers that she left a window open, she can hypothesize that a thief broke into her house and caused the mess, as the most plausible explanation. While abduction has long been considered to be at the core of how people interpret and read between the…

    Submitted 13 February, 2020; v1 submitted 15 August, 2019; originally announced August 2019.

    Comments: ICLR 2020 Camera Ready