Showing 1–2 of 2 results for author: Wanchoo, K

  1. arXiv:2412.13026  [pdf, other]

    cs.CL cs.CV

    NAVCON: A Cognitively Inspired and Linguistically Grounded Corpus for Vision and Language Navigation

    Authors: Karan Wanchoo, Xiaoye Zuo, Hannah Gonzalez, Soham Dan, Georgios Georgakis, Dan Roth, Kostas Daniilidis, Eleni Miltsakaki

    Abstract: We present NAVCON, a large-scale annotated Vision-Language Navigation (VLN) corpus built on top of two popular datasets (R2R and RxR). The paper introduces four core, cognitively motivated and linguistically grounded, navigation concepts and an algorithm for generating large-scale silver annotations of naturally occurring linguistic realizations of these concepts in navigation instructions. We pai…

    Submitted 17 December, 2024; v1 submitted 17 December, 2024; originally announced December 2024.

  2. arXiv:2203.05137  [pdf, other]

    cs.CV cs.RO

    Cross-modal Map Learning for Vision and Language Navigation

    Authors: Georgios Georgakis, Karl Schmeckpeper, Karan Wanchoo, Soham Dan, Eleni Miltsakaki, Dan Roth, Kostas Daniilidis

    Abstract: We consider the problem of Vision-and-Language Navigation (VLN). The majority of current methods for VLN are trained end-to-end using either unstructured memory such as LSTM, or using cross-modal attention over the egocentric observations of the agent. In contrast to other works, our key insight is that the association between language and vision is stronger when it occurs in explicit spatial repr…

    Submitted 21 March, 2022; v1 submitted 9 March, 2022; originally announced March 2022.