Showing 1–9 of 9 results for author: Van Hoorick, B

  1. arXiv:2507.05331

    cs.RO

    A Careful Examination of Large Behavior Models for Multitask Dexterous Manipulation

    Authors: TRI LBM Team, Jose Barreiros, Andrew Beaulieu, Aditya Bhat, Rick Cory, Eric Cousineau, Hongkai Dai, Ching-Hsin Fang, Kunimatsu Hashimoto, Muhammad Zubair Irshad, Masha Itkina, Naveen Kuppuswamy, Kuan-Hui Lee, Katherine Liu, Dale McConachie, Ian McMahon, Haruki Nishimura, Calder Phillips-Grafflin, Charles Richter, Paarth Shah, Krishnan Srinivasan, Blake Wulfe, Chen Xu, Mengchao Zhang, Alex Alspach, et al. (57 additional authors not shown)

    Abstract: Robot manipulation has seen tremendous progress in recent years, with imitation learning policies enabling successful performance of dexterous and hard-to-model tasks. Concurrently, scaling data and model size has led to the development of capable language and vision foundation models, motivating large-scale efforts to create general-purpose robot foundation models. While these models have garnere…

    Submitted 7 July, 2025; originally announced July 2025.

  2. arXiv:2503.07739

    cs.CV

    SIRE: SE(3) Intrinsic Rigidity Embeddings

    Authors: Cameron Smith, Basile Van Hoorick, Vitor Guizilini, Yue Wang

    Abstract: Motion serves as a powerful cue for scene perception and understanding by separating independently moving surfaces and organizing the physical world into distinct entities. We introduce SIRE, a self-supervised method for motion discovery of objects and dynamic scene reconstruction from casual scenes by learning intrinsic rigidity embeddings from videos. Our method trains an image encoder to estima…

    Submitted 10 March, 2025; originally announced March 2025.

  3. arXiv:2408.07147

    cs.CV

    Controlling the World by Sleight of Hand

    Authors: Sruthi Sudhakar, Ruoshi Liu, Basile Van Hoorick, Carl Vondrick, Richard Zemel

    Abstract: Humans naturally build mental models of object interactions and dynamics, allowing them to imagine how their surroundings will change if they take a certain action. While generative models today have shown impressive results on generating/editing images unconditionally or conditioned on text, current methods do not provide the ability to perform object manipulation conditioned on actions, an impor…

    Submitted 13 August, 2024; originally announced August 2024.

  4. arXiv:2405.14868

    cs.CV cs.AI cs.LG cs.RO

    Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis

    Authors: Basile Van Hoorick, Rundi Wu, Ege Ozguroglu, Kyle Sargent, Ruoshi Liu, Pavel Tokmakov, Achal Dave, Changxi Zheng, Carl Vondrick

    Abstract: Accurate reconstruction of complex dynamic scenes from just a single viewpoint continues to be a challenging task in computer vision. Current dynamic novel view synthesis methods typically require videos from many different camera viewpoints, necessitating careful recording setups, and significantly restricting their utility in the wild as well as in terms of embodied AI applications. In this pape…

    Submitted 5 July, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted to ECCV 2024. Project webpage is available at: https://gcd.cs.columbia.edu/

  5. arXiv:2305.03052

    cs.CV cs.AI cs.LG cs.RO

    Tracking through Containers and Occluders in the Wild

    Authors: Basile Van Hoorick, Pavel Tokmakov, Simon Stent, Jie Li, Carl Vondrick

    Abstract: Tracking objects with persistence in cluttered and dynamic environments remains a difficult challenge for computer vision systems. In this paper, we introduce $\textbf{TCOW}$, a new benchmark and model for visual tracking through heavy occlusion and containment. We set up a task where the goal is to, given a video sequence, segment both the projected extent of the target object, as well as the sur…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted at CVPR 2023. Project webpage is available at: https://tcow.cs.columbia.edu/

  6. arXiv:2303.11328

    cs.CV cs.GR cs.RO

    Zero-1-to-3: Zero-shot One Image to 3D Object

    Authors: Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, Carl Vondrick

    Abstract: We introduce Zero-1-to-3, a framework for changing the camera viewpoint of an object given just a single RGB image. To perform novel view synthesis in this under-constrained setting, we capitalize on the geometric priors that large-scale diffusion models learn about natural images. Our conditional diffusion model uses a synthetic dataset to learn controls of the relative camera viewpoint, which al…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: Website: https://zero123.cs.columbia.edu/

  7. arXiv:2204.10916

    cs.CV cs.LG

    Revealing Occlusions with 4D Neural Fields

    Authors: Basile Van Hoorick, Purva Tendulkar, Didac Suris, Dennis Park, Simon Stent, Carl Vondrick

    Abstract: For computer vision systems to operate in dynamic situations, they need to be able to represent and reason about object permanence. We introduce a framework for learning to estimate 4D visual representations from monocular RGB-D, which is able to persist objects, even once they become obstructed by occlusions. Unlike traditional video representations, we encode point clouds into a continuous repre…

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: CVPR 2022 (Oral)

  8. arXiv:2011.11831

    cs.CV

    Dissecting Image Crops

    Authors: Basile Van Hoorick, Carl Vondrick

    Abstract: The elementary operation of cropping underpins nearly every computer vision system, ranging from data augmentation and translation invariance to computational photography and representation learning. This paper investigates the subtle traces introduced by this operation. For example, despite refinements to camera optics, lenses will leave behind certain clues, notably chromatic aberration and vign…

    Submitted 5 September, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

    Comments: Updated smartphone datasets & table; some rewording

  9. arXiv:1912.10960

    cs.CV

    Image Outpainting and Harmonization using Generative Adversarial Networks

    Authors: Basile Van Hoorick

    Abstract: Although the inherently ambiguous task of predicting what resides beyond all four edges of an image has rarely been explored before, we demonstrate that GANs hold powerful potential in producing reasonable extrapolations. Two outpainting methods are proposed that aim to instigate this line of research: the first approach uses a context encoder inspired by common inpainting architectures and paradi…

    Submitted 15 February, 2020; v1 submitted 23 December, 2019; originally announced December 2019.