Showing 1–23 of 23 results for author: Macherey, W

  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1112 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 16 December, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2308.08998  [pdf, other]

    cs.CL cs.LG

    Reinforced Self-Training (ReST) for Language Modeling

    Authors: Caglar Gulcehre, Tom Le Paine, Srivatsan Srinivasan, Ksenia Konyushkova, Lotte Weerts, Abhishek Sharma, Aditya Siddhant, Alex Ahern, Miaosen Wang, Chenjie Gu, Wolfgang Macherey, Arnaud Doucet, Orhan Firat, Nando de Freitas

    Abstract: Reinforcement learning from human feedback (RLHF) can improve the quality of large language model's (LLM) outputs by aligning them with human preferences. We propose a simple algorithm for aligning LLMs with human preferences inspired by growing batch reinforcement learning (RL), which we call Reinforced Self-Training (ReST). Given an initial LLM policy, ReST produces a dataset by generating sampl…

    Submitted 21 August, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: 23 pages, 16 figures

  3. arXiv:2306.17842  [pdf, other]

    cs.CV cs.CL cs.MM

    SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs

    Authors: Lijun Yu, Yong Cheng, Zhiruo Wang, Vivek Kumar, Wolfgang Macherey, Yanping Huang, David A. Ross, Irfan Essa, Yonatan Bisk, Ming-Hsuan Yang, Kevin Murphy, Alexander G. Hauptmann, Lu Jiang

    Abstract: In this work, we introduce Semantic Pyramid AutoEncoder (SPAE) for enabling frozen LLMs to perform both understanding and generation tasks involving non-linguistic modalities such as images or videos. SPAE converts between raw pixels and interpretable lexical tokens (or words) extracted from the LLM's vocabulary. The resulting tokens capture both the semantic meaning and the fine-grained details n…

    Submitted 28 October, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 spotlight

  4. arXiv:2212.09553  [pdf, other]

    cs.CL cs.SD eess.AS

    Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models

    Authors: Yong Cheng, Yu Zhang, Melvin Johnson, Wolfgang Macherey, Ankur Bapna

    Abstract: We present Mu$^{2}$SLAM, a multilingual sequence-to-sequence model pre-trained jointly on unlabeled speech, unlabeled text and supervised data spanning Automatic Speech Recognition (ASR), Automatic Speech Translation (AST) and Machine Translation (MT), in over 100 languages. By leveraging a quantized representation of speech as a target, Mu$^{2}$SLAM trains the speech-text models with a sequence-t…

    Submitted 26 June, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: ICML 2023

  5. arXiv:2205.03983  [pdf, other]

    cs.CL cs.AI cs.LG

    Building Machine Translation Systems for the Next Thousand Languages

    Authors: Ankur Bapna, Isaac Caswell, Julia Kreutzer, Orhan Firat, Daan van Esch, Aditya Siddhant, Mengmeng Niu, Pallavi Baljekar, Xavier Garcia, Wolfgang Macherey, Theresa Breiner, Vera Axelrod, Jason Riesa, Yuan Cao, Mia Xu Chen, Klaus Macherey, Maxim Krikun, Pidong Wang, Alexander Gutkin, Apurva Shah, Yanping Huang, Zhifeng Chen, Yonghui Wu, Macduff Hughes

    Abstract: In this paper we share findings from our effort to build practical machine translation (MT) systems capable of translating across over one thousand languages. We describe results in three research domains: (i) Building clean, web-mined datasets for 1500+ languages by leveraging semi-supervised pre-training for language identification and developing data-driven filtering techniques; (ii) Developing…

    Submitted 6 July, 2022; v1 submitted 8 May, 2022; originally announced May 2022.

    Comments: V2: updated with some details from 24-language Google Translate launch in May 2022 V3: spelling corrections, additional acknowledgements

  6. arXiv:2203.07627  [pdf, other]

    cs.CL cs.AI

    Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation

    Authors: Yong Cheng, Ankur Bapna, Orhan Firat, Yuan Cao, Pidong Wang, Wolfgang Macherey

    Abstract: Multilingual neural machine translation models are trained to maximize the likelihood of a mix of examples drawn from multiple language pairs. The dominant inductive bias applied to these models is a shared vocabulary and a shared set of parameters across languages; the inputs and labels corresponding to examples drawn from different language pairs might still reside in distinct sub-spaces. In thi…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: ACL 2022

  7. arXiv:2106.04060  [pdf, other]

    cs.CL

    Self-supervised and Supervised Joint Training for Resource-rich Machine Translation

    Authors: Yong Cheng, Wei Wang, Lu Jiang, Wolfgang Macherey

    Abstract: Self-supervised pre-training of text representations has been successfully applied to low-resource Neural Machine Translation (NMT). However, it usually fails to achieve notable gains on resource-rich NMT. In this paper, we propose a joint training approach, $F_2$-XEnDec, to combine self-supervised and supervised learning to optimize NMT models. To exploit complementary self-supervised signals for…

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: Accepted by ICML 2021

  8. arXiv:2104.14478  [pdf, other]

    cs.CL cs.AI cs.LG

    Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation

    Authors: Markus Freitag, George Foster, David Grangier, Viresh Ratnakar, Qijun Tan, Wolfgang Macherey

    Abstract: Human evaluation of modern high-quality machine translation systems is a difficult problem, and there is increasing evidence that inadequate evaluation procedures can lead to erroneous conclusions. While there has been considerable research on human evaluation, the field still lacks a commonly-accepted standard procedure. As a step toward this goal, we propose an evaluation methodology grounded in…

    Submitted 29 April, 2021; originally announced April 2021.

  9. arXiv:2009.11027  [pdf, other]

    cs.CL

    KoBE: Knowledge-Based Machine Translation Evaluation

    Authors: Zorik Gekhman, Roee Aharoni, Genady Beryozkin, Markus Freitag, Wolfgang Macherey

    Abstract: We propose a simple and effective method for machine translation evaluation which does not require reference translations. Our approach is based on (1) grounding the entity mentions found in each source sentence and candidate translation against a large-scale multilingual knowledge base, and (2) measuring the recall of the grounded entities found in the candidate vs. those found in the source. Our…

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: Accepted as a short paper in Findings of EMNLP 2020

  10. arXiv:2006.11834  [pdf, other]

    cs.CL

    AdvAug: Robust Adversarial Augmentation for Neural Machine Translation

    Authors: Yong Cheng, Lu Jiang, Wolfgang Macherey, Jacob Eisenstein

    Abstract: In this paper, we propose a new adversarial augmentation method for Neural Machine Translation (NMT). The main idea is to minimize the vicinal risk over virtual sentences sampled from two vicinity distributions, of which the crucial one is a novel vicinity distribution for adversarial sentences that describes a smooth interpolated embedding space centered around observed training sentence pairs. W…

    Submitted 2 July, 2020; v1 submitted 21 June, 2020; originally announced June 2020.

    Comments: published at ACL2020

  11. arXiv:2004.03643  [pdf, other]

    cs.CL

    Re-translation versus Streaming for Simultaneous Translation

    Authors: Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, George Foster

    Abstract: There has been great progress in improving streaming machine translation, a simultaneous paradigm where the system appends to a growing hypothesis as more source content becomes available. We study a related problem in which revisions to the hypothesis beyond strictly appending words are permitted. This is suitable for applications such as live captioning an audio feed. In this setting, we compare…

    Submitted 29 June, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: IWSLT 2020

  12. arXiv:1912.03393  [pdf, other]

    cs.CL cs.AI cs.LG

    Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation

    Authors: Naveen Arivazhagan, Colin Cherry, Te I, Wolfgang Macherey, Pallavi Baljekar, George Foster

    Abstract: We investigate the problem of simultaneous machine translation of long-form speech content. We target a continuous speech-to-text scenario, generating translated captions for a live audio feed, such as a lecture or play-by-play commentary. As this scenario allows for revisions to our incremental translations, we adopt a re-translation approach to simultaneous translation, where the source is repea…

    Submitted 7 April, 2020; v1 submitted 6 December, 2019; originally announced December 2019.

    Comments: ICASSP 2020

  13. arXiv:1907.05019  [pdf, other]

    cs.CL cs.LG

    Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges

    Authors: Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Dmitry Lepikhin, Melvin Johnson, Maxim Krikun, Mia Xu Chen, Yuan Cao, George Foster, Colin Cherry, Wolfgang Macherey, Zhifeng Chen, Yonghui Wu

    Abstract: We introduce our efforts towards building a universal neural machine translation (NMT) system capable of translating between any language pair. We set a milestone towards this goal by building a single massively multilingual NMT model handling 103 languages trained on over 25 billion examples. Our system demonstrates effective transfer learning ability, significantly improving translation quality…

    Submitted 11 July, 2019; originally announced July 2019.

  14. arXiv:1906.05218  [pdf, other]

    cs.CL

    Monotonic Infinite Lookback Attention for Simultaneous Machine Translation

    Authors: Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, Chung-Cheng Chiu, Semih Yavuz, Ruoming Pang, Wei Li, Colin Raffel

    Abstract: Simultaneous machine translation begins to translate each source sentence before the source speaker is finished speaking, with applications to live and streaming scenarios. Simultaneous systems must carefully schedule their reading of the source sentence to balance quality against latency. We present the first simultaneous translation system to learn an adaptive schedule jointly with a neural mach…

    Submitted 12 June, 2019; originally announced June 2019.

    Comments: Accepted for publication at ACL 2019

  15. arXiv:1906.02443  [pdf, ps, other]

    cs.CL

    Robust Neural Machine Translation with Doubly Adversarial Inputs

    Authors: Yong Cheng, Lu Jiang, Wolfgang Macherey

    Abstract: Neural machine translation (NMT) often suffers from the vulnerability to noisy perturbations in the input. We propose an approach to improving the robustness of NMT models, which consists of two parts: (1) attack the translation model with adversarial source examples; (2) defend the translation model with adversarial target inputs to improve its robustness against the adversarial source inputs. For…

    Submitted 6 June, 2019; originally announced June 2019.

    Comments: Accepted by ACL 2019

  16. arXiv:1904.06037  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Direct speech-to-speech translation with a sequence-to-sequence model

    Authors: Ye Jia, Ron J. Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhifeng Chen, Yonghui Wu

    Abstract: We present an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation. The network is trained end-to-end, learning to map speech spectrograms into target spectrograms in another language, corresponding to the translated content (in a different canonical voice).…

    Submitted 25 June, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

    Comments: Accepted to Interspeech 2019

  17. arXiv:1903.07091  [pdf, other]

    cs.CL cs.AI cs.LG

    The Missing Ingredient in Zero-Shot Neural Machine Translation

    Authors: Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Roee Aharoni, Melvin Johnson, Wolfgang Macherey

    Abstract: Multilingual Neural Machine Translation (NMT) models are capable of translating between multiple source and target languages. Despite various approaches to train such models, they have difficulty with zero-shot translation: translating between language pairs that were not together seen during training. In this paper we first diagnose why state-of-the-art multilingual NMT models that rely purely on…

    Submitted 17 March, 2019; originally announced March 2019.

  18. arXiv:1902.08295  [pdf, other]

    cs.LG stat.ML

    Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

    Authors: Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, et al. (66 additional authors not shown)

    Abstract: Lingvo is a Tensorflow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily extensible, and experiment configurations are centralized and highly customizable. Distributed training and quantized inference are supported directly w…

    Submitted 21 February, 2019; originally announced February 2019.

  19. arXiv:1811.02050  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation

    Authors: Ye Jia, Melvin Johnson, Wolfgang Macherey, Ron J. Weiss, Yuan Cao, Chung-Cheng Chiu, Naveen Ari, Stella Laurenzo, Yonghui Wu

    Abstract: End-to-end Speech Translation (ST) models have many potential advantages when compared to the cascade of Automatic Speech Recognition (ASR) and text Machine Translation (MT) models, including lowered inference latency and the avoidance of error compounding. However, the quality of end-to-end ST is often limited by a paucity of training data, since it is difficult to collect large parallel corpora…

    Submitted 10 February, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: ICASSP 2019

  20. arXiv:1809.04686  [pdf, other]

    cs.CL cs.AI

    Zero-Shot Cross-lingual Classification Using Multilingual Neural Machine Translation

    Authors: Akiko Eriguchi, Melvin Johnson, Orhan Firat, Hideto Kazawa, Wolfgang Macherey

    Abstract: Transferring representations from large supervised tasks to downstream tasks has shown promising results in AI fields such as Computer Vision and Natural Language Processing (NLP). In parallel, the recent progress in Machine Translation (MT) has enabled one to train multilingual Neural MT (NMT) systems that can translate between multiple languages and are also capable of performing zero-shot trans…

    Submitted 12 September, 2018; originally announced September 2018.

  21. arXiv:1808.09943  [pdf, other]

    cs.CL

    Revisiting Character-Based Neural Machine Translation with Capacity and Compression

    Authors: Colin Cherry, George Foster, Ankur Bapna, Orhan Firat, Wolfgang Macherey

    Abstract: Translating characters instead of words or word-fragments has the potential to simplify the processing pipeline for neural machine translation (NMT), and improve results by eliminating hyper-parameters and manual feature engineering. However, it results in longer sequences in which each symbol contains less information, creating both modeling and computational challenges. In this paper, we show th…

    Submitted 29 August, 2018; originally announced August 2018.

    Comments: To appear at EMNLP 2018

  22. arXiv:1804.09849  [pdf, other]

    cs.CL cs.AI

    The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation

    Authors: Mia Xu Chen, Orhan Firat, Ankur Bapna, Melvin Johnson, Wolfgang Macherey, George Foster, Llion Jones, Niki Parmar, Mike Schuster, Zhifeng Chen, Yonghui Wu, Macduff Hughes

    Abstract: The past year has witnessed rapid advances in sequence-to-sequence (seq2seq) modeling for Machine Translation (MT). The classic RNN-based approaches to MT were first out-performed by the convolutional seq2seq model, which was then out-performed by the more recent Transformer model. Each of these new approaches consists of a fundamental architecture accompanied by a set of modeling and training tec…

    Submitted 26 April, 2018; v1 submitted 25 April, 2018; originally announced April 2018.

  23. arXiv:1609.08144  [pdf, other]

    cs.CL cs.AI cs.LG

    Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

    Authors: Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, et al. (6 additional authors not shown)

    Abstract: Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference. Also, most NMT systems have difficulty with rare words. These issues have hindered NM…

    Submitted 8 October, 2016; v1 submitted 26 September, 2016; originally announced September 2016.