Showing 1–4 of 4 results for author: McVay, P

  1. arXiv:2504.14151  [pdf, other]

    cs.CV cs.AI cs.RO

    Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D

    Authors: Sergio Arnaud, Paul McVay, Ada Martin, Arjun Majumdar, Krishna Murthy Jatavallabhula, Phillip Thomas, Ruslan Partsey, Daniel Dugas, Abha Gejji, Alexander Sax, Vincent-Pierre Berges, Mikael Henaff, Ayush Jain, Ang Cao, Ishita Prasad, Mrinal Kalakrishnan, Michael Rabbat, Nicolas Ballas, Mido Assran, Oleksandr Maksymets, Aravind Rajeswaran, Franziska Meier

    Abstract: We present LOCATE 3D, a model for localizing objects in 3D scenes from referring expressions like "the small coffee table between the sofa and the lamp." LOCATE 3D sets a new state-of-the-art on standard referential grounding benchmarks and showcases robust generalization capabilities. Notably, LOCATE 3D operates directly on sensor observation streams (posed RGB-D frames), enabling real-world depl…

    Submitted 18 April, 2025; originally announced April 2025.

    ACM Class: I.2.10; I.2.6; I.2.9; I.3.7; I.4.6; I.4.8

  2. arXiv:2502.20389  [pdf, ps, other]

    cs.CV

    From Thousands to Billions: 3D Visual Language Grounding via Render-Supervised Distillation from 2D VLMs

    Authors: Ang Cao, Sergio Arnaud, Oleksandr Maksymets, Jianing Yang, Ayush Jain, Sriram Yenamandra, Ada Martin, Vincent-Pierre Berges, Paul McVay, Ruslan Partsey, Aravind Rajeswaran, Franziska Meier, Justin Johnson, Jeong Joon Park, Alexander Sax

    Abstract: 3D vision-language grounding faces a fundamental data bottleneck: while 2D models train on billions of images, 3D models have access to only thousands of labeled scenes--a six-order-of-magnitude gap that severely limits performance. We introduce $\textbf{LIFT-GS}$, a practical distillation technique that overcomes this limitation by using differentiable rendering to bridge 3D and 2D supervision. L…

    Submitted 9 June, 2025; v1 submitted 27 February, 2025; originally announced February 2025.

    Comments: Project page: https://liftgs.github.io

  3. arXiv:2402.14083  [pdf, other]

    cs.AI

    Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

    Authors: Lucas Lehnert, Sainbayar Sukhbaatar, DiJia Su, Qinqing Zheng, Paul McVay, Michael Rabbat, Yuandong Tian

    Abstract: While Transformers have enabled tremendous progress in various application settings, such architectures still trail behind traditional symbolic planners for solving complex decision making tasks. In this work, we demonstrate how to train Transformers to solve complex planning tasks. This is accomplished by training an encoder-decoder Transformer model to predict the search dynamics of the $A^*$ se…

    Submitted 26 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  4. arXiv:2202.01118  [pdf, ps, other]

    cs.LG math.ST

    On Linear Separability under Linear Compression with Applications to Hard Support Vector Machine

    Authors: Paul McVay, Tie Liu, Krishna Narayanan

    Abstract: This paper investigates the theoretical problem of maintaining linear separability of the data-generating distribution under linear compression. While it has been long known that linear separability may be maintained by linear transformations that approximately preserve the inner products between the domain points, the limit to which the inner products are preserved in order to maintain linear sep…

    Submitted 2 February, 2022; originally announced February 2022.

    Comments: 12 pages