Showing 1–1 of 1 results for author: Tibrewal, V

  1. arXiv:2506.08002  [pdf, ps, other]

    cs.CV

    Aligning Text, Images, and 3D Structure Token-by-Token

    Authors: Aadarsh Sahoo, Vansh Tibrewal, Georgia Gkioxari

    Abstract: Creating machines capable of understanding the world in 3D is essential in assisting designers that build and edit 3D environments and robots navigating and interacting within a three-dimensional space. Inspired by advances in language and image modeling, we investigate the potential of autoregressive models for a new modality: structured 3D scenes. To this end, we propose a unified LLM framework…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: Project webpage: https://glab-caltech.github.io/kyvo/