
Showing 51–90 of 90 results for author: Bisk, Y

  1. arXiv:2112.08614  [pdf, other]

    cs.CL

    KAT: A Knowledge Augmented Transformer for Vision-and-Language

    Authors: Liangke Gui, Borui Wang, Qiuyuan Huang, Alex Hauptmann, Yonatan Bisk, Jianfeng Gao

    Abstract: The primary focus of recent work with large-scale transformers has been on optimizing the amount of information packed into the model's parameters. In this work, we ask a different question: Can multimodal transformers leverage explicit knowledge in their reasoning? Existing, primarily unimodal, methods have explored approaches under the paradigm of knowledge retrieval followed by answer prediction…

    Submitted 5 May, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: Accepted by NAACL 2022

  2. arXiv:2110.08258  [pdf, other]

    cs.LG cs.AI cs.HC cs.RO

    A Framework for Learning to Request Rich and Contextually Useful Information from Humans

    Authors: Khanh Nguyen, Yonatan Bisk, Hal Daumé III

    Abstract: When deployed, AI agents will encounter problems that are beyond their autonomous problem-solving capabilities. Leveraging human assistance can help agents overcome their inherent limitations and robustly cope with unfamiliar situations. We present a general interactive framework that enables an agent to request and interpret rich, contextually useful information from an assistant that has knowled…

    Submitted 22 June, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: Accepted to ICML 2022

  3. arXiv:2110.07342  [pdf, other]

    cs.CL cs.LG

    FILM: Following Instructions in Language with Modular Methods

    Authors: So Yeon Min, Devendra Singh Chaplot, Pradeep Ravikumar, Yonatan Bisk, Ruslan Salakhutdinov

    Abstract: Recent methods for embodied instruction following are typically trained end-to-end using imitation learning. This often requires the use of expert trajectories and low-level language instructions. Such approaches assume that neural states will integrate multimodal semantics to perform state tracking, building spatial memory, exploration, and long-term planning. In contrast, we propose a modular me…

    Submitted 16 March, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: Published as a conference paper at International Conference on Learning Representations (ICLR) 2022

  4. arXiv:2109.09790  [pdf, other]

    cs.CL

    Dependency Induction Through the Lens of Visual Perception

    Authors: Ruisi Su, Shruti Rijhwani, Hao Zhu, Junxian He, Xinyu Wang, Yonatan Bisk, Graham Neubig

    Abstract: Most previous work on grammar induction focuses on learning phrasal or dependency structure purely from text. However, because the signal provided by text alone is limited, recently introduced visually grounded syntax models make use of multimodal information leading to improved performance in constituency grammar induction. However, as compared to dependency grammars, constituency grammars do not…

    Submitted 20 September, 2021; originally announced September 2021.

    Comments: Accepted to CoNLL 2021

  5. arXiv:2109.00590  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    WebQA: Multihop and Multimodal QA

    Authors: Yingshan Chang, Mridu Narang, Hisami Suzuki, Guihong Cao, Jianfeng Gao, Yonatan Bisk

    Abstract: Scaling Visual Question Answering (VQA) to the open-domain and multi-hop nature of web searches requires fundamental advances in visual representation learning, knowledge aggregation, and language generation. In this work, we introduce WebQA, a challenging new benchmark that proves difficult for large-scale state-of-the-art models which lack language groundable visual representations for novel ob…

    Submitted 27 March, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

    Comments: CVPR Camera ready

  6. arXiv:2108.09980  [pdf, other]

    cs.CV cs.AI cs.LG

    TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment

    Authors: Jianwei Yang, Yonatan Bisk, Jianfeng Gao

    Abstract: Contrastive learning has been widely used to train transformer-based vision-language models for video-text alignment and multi-modal representation learning. This paper presents a new algorithm called Token-Aware Cascade contrastive learning (TACo) that improves contrastive learning using two novel techniques. The first is the token-aware contrastive loss which is computed by taking into account t…

    Submitted 23 August, 2021; originally announced August 2021.

    Comments: Accepted by ICCV 2021

  7. arXiv:2107.12514  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.RO

    Language Grounding with 3D Objects

    Authors: Jesse Thomason, Mohit Shridhar, Yonatan Bisk, Chris Paxton, Luke Zettlemoyer

    Abstract: Seemingly simple natural language requests to a robot are generally underspecified, for example "Can you bring me the wireless mouse?" Flat images of candidate mice may not provide the discriminative information needed for "wireless." The world, and objects in it, are not flat images but complex 3D shapes. If a human requests an object based on any of its basic properties, such as color, shape, or…

    Submitted 15 September, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: Conference on Robot Learning (CoRL) 2021

  8. arXiv:2107.05697  [pdf, other]

    cs.CL

    Few-shot Language Coordination by Modeling Theory of Mind

    Authors: Hao Zhu, Graham Neubig, Yonatan Bisk

    Abstract: No man is an island. Humans communicate with a large community by coordinating with different interlocutors within short conversations. This ability has been understudied by the research on building neural communicative agents. We study the task of few-shot language coordination…

    Submitted 12 July, 2021; originally announced July 2021.

    Comments: Thirty-eighth International Conference on Machine Learning (ICML 2021)

  9. arXiv:2106.02192  [pdf, other]

    cs.CL

    Grounding 'Grounding' in NLP

    Authors: Khyathi Raghavi Chandu, Yonatan Bisk, Alan W Black

    Abstract: The NLP community has seen substantial recent interest in grounding to facilitate interaction between language technologies and the world. However, as a community, we use the term broadly to reference any linking of text to data or non-textual modality. In contrast, Cognitive Science more formally defines "grounding" as the process of establishing what mutual information is required for successful…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: 24 pages

  10. arXiv:2104.08666  [pdf, other]

    cs.CL

    Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models

    Authors: Tejas Srinivasan, Yonatan Bisk

    Abstract: Numerous works have analyzed biases in vision and pre-trained language models individually - however, less attention has been paid to how these biases interact in multimodal settings. This work extends text-based bias analysis methods to investigate multimodal language models, and analyzes intra- and inter-modality associations and biases learned by these models. Specifically, we demonstrate that…

    Submitted 19 May, 2022; v1 submitted 17 April, 2021; originally announced April 2021.

    Comments: Accepted to 4th Workshop on Gender Bias in Natural Language Processing, NAACL 2022

  11. arXiv:2102.00424  [pdf, other]

    cs.CL cs.CV cs.LG

    An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games

    Authors: Alessandro Suglia, Yonatan Bisk, Ioannis Konstas, Antonio Vergari, Emanuele Bastianelli, Andrea Vanzo, Oliver Lemon

    Abstract: Guessing games are a prototypical instance of the "learning by interacting" paradigm. This work investigates how well an artificial agent can benefit from playing guessing games when later asked to perform on novel NLP downstream tasks such as Visual Question Answering (VQA). We propose two ways to exploit playing guessing games: 1) a supervised learning scenario in which the agent learns to mimic…

    Submitted 31 January, 2021; originally announced February 2021.

    Comments: Accepted paper for the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021)

  12. arXiv:2011.03863  [pdf, other]

    cs.CL cs.AI

    Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering

    Authors: Kaixin Ma, Filip Ilievski, Jonathan Francis, Yonatan Bisk, Eric Nyberg, Alessandro Oltramari

    Abstract: Recent developments in pre-trained neural language modeling have led to leaps in accuracy on commonsense question-answering benchmarks. However, there is increasing concern that models overfit to specific tasks, without learning to utilize external knowledge or perform general semantic reasoning. In contrast, zero-shot evaluations have shown promise as a more robust measure of a model's general re…

    Submitted 14 December, 2020; v1 submitted 7 November, 2020; originally announced November 2020.

    Comments: AAAI 2021

  13. arXiv:2011.02917  [pdf, other]

    cs.CL cs.CV cs.LG

    Imagining Grounded Conceptual Representations from Perceptual Information in Situated Guessing Games

    Authors: Alessandro Suglia, Antonio Vergari, Ioannis Konstas, Yonatan Bisk, Emanuele Bastianelli, Andrea Vanzo, Oliver Lemon

    Abstract: In visual guessing games, a Guesser has to identify a target object in a scene by asking questions to an Oracle. An effective strategy for the players is to learn conceptual representations of objects that are both discriminative and expressive enough to ask questions and guess correctly. However, as shown by Suglia et al. (2020), existing models fail to learn truly multi-modal representations, re…

    Submitted 5 November, 2020; originally announced November 2020.

    Comments: Accepted to the International Conference on Computational Linguistics (COLING) 2020

  14. arXiv:2010.03768  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.RO

    ALFWorld: Aligning Text and Embodied Environments for Interactive Learning

    Authors: Mohit Shridhar, Xingdi Yuan, Marc-Alexandre Côté, Yonatan Bisk, Adam Trischler, Matthew Hausknecht

    Abstract: Given a simple request like Put a washed apple in the kitchen fridge, humans can reason in purely abstract terms by imagining action sequences and scoring their likelihood of success, prototypicality, and efficiency, all without moving a muscle. Once we see the kitchen in question, we can update our abstract plans to fit the scene. Embodied agents require the same abilities, but existing work does…

    Submitted 14 March, 2021; v1 submitted 8 October, 2020; originally announced October 2020.

    Comments: ICLR 2021; Data, code, and videos are available at alfworld.github.io

  15. arXiv:2007.15135  [pdf, other]

    cs.CL cs.LG

    The Return of Lexical Dependencies: Neural Lexicalized PCFGs

    Authors: Hao Zhu, Yonatan Bisk, Graham Neubig

    Abstract: In this paper we demonstrate that context free grammar (CFG) based methods for grammar induction benefit from modeling lexical dependencies. This contrasts to the most popular current methods for grammar induction, which focus on discovering either constituents or dependencies. Previous approaches to marry these two disparate syntactic formalisms (e.g. lexicalized…

    Submitted 29 July, 2020; originally announced July 2020.

    Comments: Accepted at TACL 2020

  16. arXiv:2005.00728  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.RO

    RMM: A Recursive Mental Model for Dialog Navigation

    Authors: Homero Roman Roman, Yonatan Bisk, Jesse Thomason, Asli Celikyilmaz, Jianfeng Gao

    Abstract: Language-guided robots must be able to both ask humans questions and understand answers. Much existing work focuses only on the latter. In this paper, we go beyond instruction following and introduce a two-agent task where one agent navigates and asks questions that a second, guiding agent answers. Inspired by theory of mind, we propose the Recursive Mental Model (RMM). The navigating agent models…

    Submitted 5 October, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

    Comments: Findings of Empirical Methods in Natural Language Processing (EMNLP Findings), 2020

  17. arXiv:2005.00706  [pdf, other]

    cs.CL cs.CV

    A Benchmark for Structured Procedural Knowledge Extraction from Cooking Videos

    Authors: Frank F. Xu, Lei Ji, Botian Shi, Junyi Du, Graham Neubig, Yonatan Bisk, Nan Duan

    Abstract: Instructional videos are often used to learn about procedures. Video captioning is one way of automatically collecting such knowledge. However, it provides only an indirect, overall evaluation of multimodal models with no finer-grained quantitative measure of what they have learned. We propose instead a benchmark of structured procedural knowledge extracted from cooking videos. This work…

    Submitted 9 October, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

    Comments: Accepted by NLP Beyond Text - First International Workshop on Natural Language Processing Beyond Text @ EMNLP 2020

  18. arXiv:2004.10151  [pdf, other]

    cs.CL cs.AI cs.LG

    Experience Grounds Language

    Authors: Yonatan Bisk, Ari Holtzman, Jesse Thomason, Jacob Andreas, Yoshua Bengio, Joyce Chai, Mirella Lapata, Angeliki Lazaridou, Jonathan May, Aleksandr Nisnevich, Nicolas Pinto, Joseph Turian

    Abstract: Language understanding research is held back by a failure to relate language to the physical world it describes and to the social interactions it facilitates. Despite the incredible effectiveness of language processing models to tackle tasks after being trained on text alone, successful linguistic communication relies on a shared experience of the world. It is this shared experience that makes utt…

    Submitted 1 November, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

    Comments: Empirical Methods in Natural Language Processing (EMNLP), 2020

  19. arXiv:2003.00857  [pdf, ps, other]

    cs.CL cs.CV cs.LG

    Multi-View Learning for Vision-and-Language Navigation

    Authors: Qiaolin Xia, Xiujun Li, Chunyuan Li, Yonatan Bisk, Zhifang Sui, Jianfeng Gao, Yejin Choi, Noah A. Smith

    Abstract: Learning to navigate in a visual environment following natural language instructions is a challenging task because natural language instructions are highly variable, ambiguous, and under-specified. In this paper, we present a novel training paradigm, Learn from EveryOne (LEO), which leverages multiple instructions (as different views) for the same trajectory to resolve language ambiguity and impro…

    Submitted 9 March, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

    Comments: 16 pages, 8 figures

  20. arXiv:1912.01734  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.RO

    ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

    Authors: Mohit Shridhar, Jesse Thomason, Daniel Gordon, Yonatan Bisk, Winson Han, Roozbeh Mottaghi, Luke Zettlemoyer, Dieter Fox

    Abstract: We present ALFRED (Action Learning From Realistic Environments and Directives), a benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions for household tasks. ALFRED includes long, compositional tasks with non-reversible state changes to shrink the gap between research benchmarks and real-world applications. ALFRED consists of expert demons…

    Submitted 30 March, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: Computer Vision and Pattern Recognition (CVPR) 2020; https://askforalfred.com/

  21. arXiv:1911.11641  [pdf, other]

    cs.CL cs.AI cs.LG

    PIQA: Reasoning about Physical Commonsense in Natural Language

    Authors: Yonatan Bisk, Rowan Zellers, Ronan Le Bras, Jianfeng Gao, Yejin Choi

    Abstract: To apply eyeshadow without a brush, should I use a cotton swab or a toothpick? Questions requiring this kind of physical commonsense pose a challenge to today's natural language understanding systems. While recent pretrained models (such as BERT) have made progress on question answering over more abstract domains - such as news articles and encyclopedia entries, where text is plentiful - in more p…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: AAAI 2020

  22. arXiv:1909.02244  [pdf, other]

    cs.CL cs.CV cs.LG

    Robust Navigation with Language Pretraining and Stochastic Sampling

    Authors: Xiujun Li, Chunyuan Li, Qiaolin Xia, Yonatan Bisk, Asli Celikyilmaz, Jianfeng Gao, Noah Smith, Yejin Choi

    Abstract: Core to the vision-and-language navigation (VLN) challenge is building robust instruction representations and action decoding schemes, which can generalize well to previously unseen instructions and environments. In this paper, we report two simple but highly effective methods to address these challenges and lead to a new state-of-the-art performance. First, we adapt large-scale pretrained languag…

    Submitted 5 September, 2019; originally announced September 2019.

    Comments: 8 pages, 4 figures, EMNLP 2019

  23. arXiv:1905.12616  [pdf, other]

    cs.CL cs.CY

    Defending Against Neural Fake News

    Authors: Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, Yejin Choi

    Abstract: Recent progress in natural language generation has raised dual-use concerns. While applications like summarization and translation are positive, the underlying technology also might enable adversaries to generate neural fake news: targeted propaganda that closely mimics the style of real news. Modern computer security relies on careful threat modeling: identifying potential threats and vulnerabi…

    Submitted 11 December, 2020; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: NeurIPS 2019 camera ready version. Project page/code/demo at https://rowanzellers.com/grover

  24. arXiv:1905.07830  [pdf, other]

    cs.CL

    HellaSwag: Can a Machine Really Finish Your Sentence?

    Authors: Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, Yejin Choi

    Abstract: Recent work by Zellers et al. (2018) introduced a new task of commonsense natural language inference: given an event description such as "A woman sits at a piano," a machine must select the most likely followup: "She sets her fingers on the keys." With the introduction of BERT, near human-level performance was reached. Does this mean that machines can perform human level commonsense inference? I…

    Submitted 19 May, 2019; originally announced May 2019.

    Comments: ACL 2019. Project page at https://rowanzellers.com/hellaswag

  25. arXiv:1904.01650  [pdf, other]

    cs.RO cs.AI cs.CL

    Improving Robot Success Detection using Static Object Data

    Authors: Rosario Scalise, Jesse Thomason, Yonatan Bisk, Siddhartha Srinivasa

    Abstract: We use static object data to improve success detection for stacking objects on and nesting objects in one another. Such actions are necessary for certain robotics tasks, e.g., clearing a dining table or packing a warehouse bin. However, using an RGB-D camera to detect success can be insufficient: same-colored objects can be difficult to differentiate, and reflective silverware cause noisy depth ca…

    Submitted 31 July, 2019; v1 submitted 2 April, 2019; originally announced April 2019.

    Comments: IROS 2019 + Appendix

  26. arXiv:1903.08309  [pdf, other]

    cs.AI cs.CL cs.LG cs.RO

    Prospection: Interpretable Plans From Language By Predicting the Future

    Authors: Chris Paxton, Yonatan Bisk, Jesse Thomason, Arunkumar Byravan, Dieter Fox

    Abstract: High-level human instructions often correspond to behaviors with multiple implicit steps. In order for robots to be useful in the real world, they must be able to reason over both motions and intermediate goals implied by human instructions. In this work, we propose a framework for learning representations that convert from a natural-language command to a sequence of intermediate goals for exec…

    Submitted 19 March, 2019; originally announced March 2019.

    Comments: Accepted to ICRA 2019; extended version with appendix containing additional results

  27. arXiv:1903.02547  [pdf, other]

    cs.CL cs.CV cs.LG cs.NE cs.RO

    Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

    Authors: Liyiming Ke, Xiujun Li, Yonatan Bisk, Ari Holtzman, Zhe Gan, Jingjing Liu, Jianfeng Gao, Yejin Choi, Siddhartha Srinivasa

    Abstract: We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding, that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language navigation challenge of Anderson et al. (2018). Given a natural language instruction and photo-realistic image views of a previously unseen environment, the agent was tasked with navigating from sourc…

    Submitted 2 April, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: CVPR 2019 Oral, video demo: https://youtu.be/AD9TNohXoPA

  28. arXiv:1902.00595  [pdf, other]

    cs.CL

    Character-based Surprisal as a Model of Reading Difficulty in the Presence of Error

    Authors: Michael Hahn, Frank Keller, Yonatan Bisk, Yonatan Belinkov

    Abstract: Intuitively, human readers cope easily with errors in text; typos, misspelling, word substitutions, etc. do not unduly disrupt natural reading. Previous work indicates that letter transpositions result in increased reading times, but it is unclear if this effect generalizes to more natural errors. In this paper, we report an eye-tracking study that compares two error types (letter transpositions a…

    Submitted 19 May, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: Published in Proceedings of CogSci 2019

  29. arXiv:1811.10830  [pdf, other]

    cs.CV cs.CL

    From Recognition to Cognition: Visual Commonsense Reasoning

    Authors: Rowan Zellers, Yonatan Bisk, Ali Farhadi, Yejin Choi

    Abstract: Visual understanding goes well beyond object recognition. With one glance at an image, we can effortlessly imagine the world beyond the pixels: for instance, we can infer people's actions, goals, and mental states. While this task is easy for humans, it is tremendously difficult for today's vision systems, requiring higher-order cognition and commonsense reasoning about the world. We formalize thi…

    Submitted 26 March, 2019; v1 submitted 27 November, 2018; originally announced November 2018.

    Comments: CVPR 2019 oral. Project page at https://visualcommonsense.com

  30. arXiv:1811.08824  [pdf, other]

    cs.CV cs.RO

    Early Fusion for Goal Directed Robotic Vision

    Authors: Aaron Walsman, Yonatan Bisk, Saadia Gabriel, Dipendra Misra, Yoav Artzi, Yejin Choi, Dieter Fox

    Abstract: Building perceptual systems for robotics which perform well under tight computational budgets requires novel architectures which rethink the traditional computer vision pipeline. Modern vision architectures require the agent to build a summary representation of the entire scene, even if most of the input is irrelevant to the agent's current goal. In this work, we flip this paradigm, by introducing…

    Submitted 7 August, 2019; v1 submitted 21 November, 2018; originally announced November 2018.

  31. arXiv:1811.00613  [pdf, other]

    cs.CL

    Shifting the Baseline: Single Modality Performance on Visual Navigation & QA

    Authors: Jesse Thomason, Daniel Gordon, Yonatan Bisk

    Abstract: We demonstrate the surprising strength of unimodal baselines in multimodal domains, and make concrete recommendations for best practices in future research. Where existing work often compares against random or majority class baselines, we argue that unimodal approaches better capture and reflect dataset biases and therefore provide an important comparison when assessing the performance of multimod…

    Submitted 11 March, 2019; v1 submitted 1 November, 2018; originally announced November 2018.

    Comments: Published at The Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) 2019

  32. arXiv:1808.05326  [pdf, other]

    cs.CL

    SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference

    Authors: Rowan Zellers, Yonatan Bisk, Roy Schwartz, Yejin Choi

    Abstract: Given a partial description like "she opened the hood of the car," humans can reason about the situation and anticipate what might come next ("then, she examined the engine"). In this paper, we introduce the task of grounded commonsense inference, unifying natural language inference and commonsense reasoning. We present SWAG, a new dataset with 113k multiple choice questions about a rich spectru…

    Submitted 15 August, 2018; originally announced August 2018.

    Comments: EMNLP 2018

  33. arXiv:1805.10850  [pdf, other]

    cs.CL

    Inducing Grammars with and for Neural Machine Translation

    Authors: Ke Tran, Yonatan Bisk

    Abstract: Machine translation systems require semantic knowledge and grammatical understanding. Neural machine translation (NMT) systems often assume this information is captured by an attention mechanism and a decoder that ensures fluency. Recent work has shown that incorporating explicit syntax alleviates the burden of modeling both types of knowledge. However, requiring parses is expensive and does not e…

    Submitted 28 May, 2018; originally announced May 2018.

    Comments: accepted at NMT workshop (WNMT 2018)

  34. arXiv:1805.07719  [pdf, other]

    cs.RO cs.CL

    Balancing Shared Autonomy with Human-Robot Communication

    Authors: Rosario Scalise, Yonatan Bisk, Maxwell Forbes, Daqing Yi, Yejin Choi, Siddhartha Srinivasa

    Abstract: Robotic agents that share autonomy with a human should leverage human domain knowledge and account for their preferences when completing a task. This extra knowledge can dramatically improve plan efficiency and user-satisfaction, but these gains are lost if communicating with a robot is taxing and unnatural. In this paper, we show how viewing human-robot language through the lens of shared autonomy…

    Submitted 20 May, 2018; originally announced May 2018.

  35. arXiv:1801.07357  [pdf, other]

    cs.AI

    CHALET: Cornell House Agent Learning Environment

    Authors: Claudia Yan, Dipendra Misra, Andrew Bennnett, Aaron Walsman, Yonatan Bisk, Yoav Artzi

    Abstract: We present CHALET, a 3D house simulator with support for navigation and manipulation. CHALET includes 58 rooms and 10 house configurations, and allows users to easily create new house and room layouts. CHALET supports a range of common household activities, including moving objects, toggling appliances, and placing objects inside closeable containers. The environment and actions available are designed to…

    Submitted 16 September, 2019; v1 submitted 22 January, 2018; originally announced January 2018.

  36. arXiv:1712.03463  [pdf, other]

    cs.CL

    Learning Interpretable Spatial Operations in a Rich 3D Blocks World

    Authors: Yonatan Bisk, Kevin J. Shih, Yejin Choi, Daniel Marcu

    Abstract: In this paper, we study the problem of mapping natural language instructions to complex spatial actions in a 3D blocks world. We first introduce a new dataset that pairs complex 3D spatial operations to rich natural language descriptions that require complex spatial and pragmatic interpretations such as "mirroring", "twisting", and "balancing". This dataset, built on the simulation environment of…

    Submitted 24 December, 2017; v1 submitted 9 December, 2017; originally announced December 2017.

    Comments: AAAI 2018

  37. arXiv:1711.02173  [pdf, other]

    cs.CL cs.LG

    Synthetic and Natural Noise Both Break Neural Machine Translation

    Authors: Yonatan Belinkov, Yonatan Bisk

    Abstract: Character-based neural machine translation (NMT) models alleviate out-of-vocabulary issues, learn morphology, and move us closer to completely end-to-end translation systems. Unfortunately, they are also very brittle and easily falter when presented with noisy data. In this paper, we confront NMT models with synthetic and natural sources of noise. We find that state-of-the-art models fail to trans…

    Submitted 24 February, 2018; v1 submitted 6 November, 2017; originally announced November 2017.

    Comments: ICLR 2018 camera-ready

    ACM Class: I.2.7

  38. arXiv:1710.02925  [pdf, ps, other]

    cs.CL

    Natural Language Inference from Multiple Premises

    Authors: Alice Lai, Yonatan Bisk, Julia Hockenmaier

    Abstract: We define a novel textual entailment task that requires inference over multiple premise sentences. We present a new dataset for this task that minimizes trivial lexical inferences, emphasizes knowledge of everyday events, and presents a more challenging setting for textual entailment. We evaluate several strong neural baselines and analyze how the multiple premise task differs from standard textua…

    Submitted 8 October, 2017; originally announced October 2017.

    Comments: Accepted at IJCNLP 2017

  39. arXiv:1609.09405  [pdf, other]

    cs.CL cs.AI

    Evaluating Induced CCG Parsers on Grounded Semantic Parsing

    Authors: Yonatan Bisk, Siva Reddy, John Blitzer, Julia Hockenmaier, Mark Steedman

    Abstract: We compare the effectiveness of four different syntactic CCG parsers for a semantic slot-filling task to explore how much syntactic supervision is required for downstream semantic analysis. This extrinsic, task-based evaluation provides a unique window to explore the strengths and weaknesses of semantics captured by unsupervised grammar induction systems. We release a new Freebase semantic parsing…

    Submitted 31 January, 2017; v1 submitted 29 September, 2016; originally announced September 2016.

    Comments: EMNLP 2016, Table 2 erratum, Code and Freebase Semantic Parsing data URL

  40. arXiv:1609.09007  [pdf, other]

    cs.CL cs.LG

    Unsupervised Neural Hidden Markov Models

    Authors: Ke Tran, Yonatan Bisk, Ashish Vaswani, Daniel Marcu, Kevin Knight

    Abstract: In this work, we present the first results for neuralizing an Unsupervised Hidden Markov Model. We evaluate our approach on tag induction. Our approach outperforms existing generative models and is competitive with the state-of-the-art, though with a simpler model easily extended to include additional context.

    Submitted 28 September, 2016; originally announced September 2016.

    Comments: accepted at EMNLP 2016, Workshop on Structured Prediction for NLP. Oral presentation