Showing 1–17 of 17 results for author: Reitter, D

  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1112 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 16 December, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2402.02077  [pdf, other]

    cs.CL

    Investigating Content Planning for Navigating Trade-offs in Knowledge-Grounded Dialogue

    Authors: Kushal Chawla, Hannah Rashkin, Gaurav Singh Tomar, David Reitter

    Abstract: Knowledge-grounded dialogue generation is a challenging task because it requires satisfying two fundamental yet often competing constraints: being responsive in a manner that is specific to what the conversation partner has said while also being attributable to an underlying source document. In this work, we bring this trade-off between these two objectives (specificity and attribution) to light a…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: Accepted at EACL 2024 Main Conference (Long)

  3. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1326 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 9 May, 2025; v1 submitted 18 December, 2023; originally announced December 2023.

  4. arXiv:2306.01286  [pdf, other]

    cs.CL cs.AI

    KL-Divergence Guided Temperature Sampling

    Authors: Chung-Ching Chang, David Reitter, Renat Aksitov, Yun-Hsuan Sung

    Abstract: Temperature sampling is a conventional approach to diversify large language model predictions. As temperature increases, the prediction becomes diverse but also vulnerable to hallucinations -- generating tokens that are sensible but not factual. One common approach to mitigate hallucinations is to provide source/grounding documents and the model is trained to produce predictions that bind to and a…

    Submitted 29 November, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  5. arXiv:2303.17006  [pdf, other]

    cs.CL

    How do decoding algorithms distribute information in dialogue responses?

    Authors: Saranya Venkatraman, He He, David Reitter

    Abstract: Humans tend to follow the Uniform Information Density (UID) principle by distributing information evenly in utterances. We study if decoding algorithms implicitly follow this UID principle, and under what conditions adherence to UID might be desirable for dialogue generation. We generate responses using different decoding algorithms with GPT-2 on the Persona-Chat dataset and collect human judgment…

    Submitted 29 March, 2023; originally announced March 2023.

  6. arXiv:2302.05578  [pdf, ps, other]

    cs.CL cs.AI

    Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models

    Authors: Renat Aksitov, Chung-Ching Chang, David Reitter, Siamak Shakeri, Yunhsuan Sung

    Abstract: Despite recent progress, it has been difficult to prevent semantic hallucinations in generative Large Language Models. One common solution to this is augmenting LLMs with a retrieval system and making sure that the generated output is attributable to the retrieved information. Given this new added constraint, it is plausible to expect that the overall quality of the output will be affected, for ex…

    Submitted 14 February, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

  7. Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence

    Authors: Chris Callison-Burch, Gaurav Singh Tomar, Lara J. Martin, Daphne Ippolito, Suma Bailis, David Reitter

    Abstract: AI researchers have posited Dungeons and Dragons (D&D) as a challenge problem to test systems on various language-related capabilities. In this paper, we frame D&D specifically as a dialogue system challenge, where the tasks are to both generate the next conversational turn in the game and predict the state of the game given the dialogue history. We create a gameplay dataset consisting of nearly 9…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted at EMNLP 2022

    Journal ref: Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 9379-9393, Dec. 2022

  8. arXiv:2112.12870  [pdf, other]

    cs.CL

    Measuring Attribution in Natural Language Generation Models

    Authors: Hannah Rashkin, Vitaly Nikolaev, Matthew Lamm, Lora Aroyo, Michael Collins, Dipanjan Das, Slav Petrov, Gaurav Singh Tomar, Iulia Turc, David Reitter

    Abstract: With recent improvements in natural language generation (NLG) models for various applications, it has become imperative to have the means to identify and evaluate whether NLG output is only sharing verifiable information about the external world. In this work, we present a new evaluation framework entitled Attributable to Identified Sources (AIS) for assessing the output of natural language genera…

    Submitted 2 August, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

  9. arXiv:2112.08558  [pdf, other]

    cs.CL

    CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning

    Authors: Zeqiu Wu, Yi Luan, Hannah Rashkin, David Reitter, Hannaneh Hajishirzi, Mari Ostendorf, Gaurav Singh Tomar

    Abstract: Compared to standard retrieval tasks, passage retrieval for conversational question answering (CQA) poses new challenges in understanding the current user question, as each question needs to be interpreted within the dialogue context. Moreover, it can be expensive to re-train well-established retrievers such as search engines that are originally developed for non-conversational queries. To facilit…

    Submitted 28 October, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: EMNLP 2022 camera-ready

  10. arXiv:2107.06963  [pdf, other]

    cs.CL

    Increasing Faithfulness in Knowledge-Grounded Dialogue with Controllable Features

    Authors: Hannah Rashkin, David Reitter, Gaurav Singh Tomar, Dipanjan Das

    Abstract: Knowledge-grounded dialogue systems are intended to convey information that is based on evidence provided in a given source text. We discuss the challenges of training a generative neural dialogue model for such systems that is controlled to stay faithful to the evidence. Existing datasets contain a mix of conversational responses that are faithful to selected evidence as well as more subjective o…

    Submitted 14 July, 2021; originally announced July 2021.

    Comments: ACL 2021

  11. arXiv:2105.00071  [pdf, other]

    cs.CL

    Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark

    Authors: Nouha Dziri, Hannah Rashkin, Tal Linzen, David Reitter

    Abstract: Knowledge-grounded dialogue systems powered by large language models often generate responses that, while fluent, are not attributable to a relevant source of information. Progress towards models that do not exhibit this issue requires evaluation metrics that can quantify its prevalence. To this end, we introduce the Benchmark for Evaluation of Grounded INteraction (BEGIN), comprised of 12k dialog…

    Submitted 28 June, 2022; v1 submitted 30 April, 2021; originally announced May 2021.

    Comments: TACL, 12 pages, 9 figures, 2 tables

  12. arXiv:1909.08663  [pdf, other]

    cs.CL cs.AI cs.LG

    Do We Need Neural Models to Explain Human Judgments of Acceptability?

    Authors: Wang Jing, M. A. Kelly, David Reitter

    Abstract: Native speakers can judge whether a sentence is an acceptable instance of their language. Acceptability provides a means of evaluating whether computational language models are processing language in a human-like manner. We test the ability of computational language models, simple language features, and word embeddings to predict native English speakers' judgments of acceptability on English-langua…

    Submitted 9 October, 2019; v1 submitted 18 September, 2019; originally announced September 2019.

    Comments: 10 pages (8 pages + 2 pages of references), 1 figure, 7 tables

  13. arXiv:1908.05054  [pdf, other]

    cs.CL cs.CV cs.LG

    Fusion of Detected Objects in Text for Visual Question Answering

    Authors: Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter

    Abstract: To advance models of multimodal context, we introduce a simple yet powerful neural architecture for data that combines vision and natural language. The "Bounding Boxes in Text Transformer" (B2T2) also leverages referential information binding words to portions of the image in a single unified architecture. B2T2 is highly effective on the Visual Commonsense Reasoning benchmark (https://visualcommon…

    Submitted 3 November, 2019; v1 submitted 14 August, 2019; originally announced August 2019.

  14. arXiv:1805.11546  [pdf, other]

    cs.CL cs.AI

    Like a Baby: Visually Situated Neural Language Acquisition

    Authors: Alexander G. Ororbia, Ankur Mali, Matthew A. Kelly, David Reitter

    Abstract: We examine the benefits of visual context in training neural language models to perform next-word prediction. A multi-modal neural architecture is introduced that outperforms its equivalent trained on language alone with a 2% decrease in perplexity, even when no visual context is available at test. Fine-tuning the embeddings of a pre-trained state-of-the-art bidirectional language model (BERT) in…

    Submitted 4 June, 2019; v1 submitted 29 May, 2018; originally announced May 2018.

    Comments: Final submission (camera-ready), accepted to ACL 2019

  15. arXiv:1711.11542  [pdf, other]

    cs.LG stat.ML

    Learning to Adapt by Minimizing Discrepancy

    Authors: Alexander G. Ororbia II, Patrick Haffner, David Reitter, C. Lee Giles

    Abstract: We explore whether useful temporal neural generative models can be learned from sequential data without back-propagation through time. We investigate the viability of a more neurocognitively-grounded approach in the context of unsupervised generative modeling of sequences. Specifically, we build on the concept of predictive coding, which has gained influence in cognitive science, in a neural frame…

    Submitted 30 November, 2017; originally announced November 2017.

    Comments: Note: Additional experiments in support of this paper are still running (updates will be made as they are completed)

  16. arXiv:1703.08864  [pdf, ps, other]

    cs.CL

    Learning Simpler Language Models with the Differential State Framework

    Authors: Alexander G. Ororbia II, Tomas Mikolov, David Reitter

    Abstract: Learning useful information across long time lags is a critical and difficult problem for temporal neural models in tasks such as language modeling. Existing architectures that address the issue are often complex and costly to train. The Differential State Framework (DSF) is a simple and high-performing design that unifies previously introduced gated neural models. DSF models maintain longer-term…

    Submitted 16 July, 2017; v1 submitted 26 March, 2017; originally announced March 2017.

    Comments: Edits/revisions applied throughout document

  17. arXiv:1511.06964  [pdf, other]

    cs.LG

    Online Semi-Supervised Learning with Deep Hybrid Boltzmann Machines and Denoising Autoencoders

    Authors: Alexander G. Ororbia II, C. Lee Giles, David Reitter

    Abstract: Two novel deep hybrid architectures, the Deep Hybrid Boltzmann Machine and the Deep Hybrid Denoising Auto-encoder, are proposed for handling semi-supervised learning problems. The models combine experts that model relevant distributions at different levels of abstraction to improve overall predictive performance on discriminative tasks. Theoretical motivations and algorithms for joint learning for…

    Submitted 18 January, 2016; v1 submitted 21 November, 2015; originally announced November 2015.