Showing 1–3 of 3 results for author: Kareer, S

Searching in archive cs.
  1. arXiv:2410.24221  [pdf, other]

    cs.RO cs.CV

    EgoMimic: Scaling Imitation Learning via Egocentric Video

    Authors: Simar Kareer, Dhruv Patel, Ryan Punamiya, Pranay Mathur, Shuo Cheng, Chen Wang, Judy Hoffman, Danfei Xu

    Abstract: The scale and diversity of demonstration data required for imitation learning is a significant challenge. We present EgoMimic, a full-stack framework which scales manipulation via human embodiment data, specifically egocentric human videos paired with 3D hand tracking. EgoMimic achieves this through: (1) a system to capture human embodiment data using the ergonomic Project Aria glasses, (2) a low-…

    Submitted 31 October, 2024; originally announced October 2024.

  2. arXiv:2402.00868  [pdf, other]

    cs.CV

    We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline

    Authors: Simar Kareer, Vivek Vijaykumar, Harsh Maheshwari, Prithvijit Chattopadhyay, Judy Hoffman, Viraj Prabhu

    Abstract: There has been abundant work in unsupervised domain adaptation for semantic segmentation (DAS) seeking to adapt a model trained on images from a labeled source domain to an unlabeled target domain. While the vast majority of prior work has studied this as a frame-level Image-DAS problem, a few Video-DAS works have sought to additionally leverage the temporal signal present in adjacent frames. Howe…

    Submitted 27 February, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: TMLR 2024

  3. arXiv:2210.14791  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    ViNL: Visual Navigation and Locomotion Over Obstacles

    Authors: Simar Kareer, Naoki Yokoyama, Dhruv Batra, Sehoon Ha, Joanne Truong

    Abstract: We present Visual Navigation and Locomotion over obstacles (ViNL), which enables a quadrupedal robot to navigate unseen apartments while stepping over small obstacles that lie in its path (e.g., shoes, toys, cables), similar to how humans and pets lift their feet over objects as they walk. ViNL consists of: (1) a visual navigation policy that outputs linear and angular velocity commands that guide…

    Submitted 12 October, 2023; v1 submitted 26 October, 2022; originally announced October 2022.