Showing 1–7 of 7 results for author: Kowdle, A

  1. arXiv:2204.09171

    cs.RO cs.CV

    Learned Monocular Depth Priors in Visual-Inertial Initialization

    Authors: Yunwen Zhou, Abhishek Kar, Eric Turner, Adarsh Kowdle, Chao X. Guo, Ryan C. DuToit, Konstantine Tsotsos

    Abstract: Visual-inertial odometry (VIO) is the pose estimation backbone for most AR/VR and autonomous robotic systems today, in both academia and industry. However, these systems are highly sensitive to the initialization of key parameters such as sensor biases, gravity direction, and metric scale. In practical scenarios where high-parallax or variable acceleration assumptions are rarely met (e.g. hovering…

    Submitted 1 August, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: To be published in the 2022 European Conference on Computer Vision

  2. arXiv:2007.12140

    cs.CV

    HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching

    Authors: Vladimir Tankovich, Christian Häne, Yinda Zhang, Adarsh Kowdle, Sean Fanello, Sofien Bouaziz

    Abstract: This paper presents HITNet, a novel neural network architecture for real-time stereo matching. Contrary to many recent neural network approaches that operate on a full cost volume and rely on 3D convolutions, our approach does not explicitly build a volume and instead relies on a fast multi-resolution initialization step, differentiable 2D geometric propagation and warping mechanisms to infer disp…

    Submitted 19 January, 2023; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: The pretrained models used for submission to benchmarks and sample evaluation scripts can be found at https://github.com/google-research/google-research/tree/master/hitnet

  3. arXiv:2002.03977

    eess.AS cs.LG cs.MM stat.ML

    Multimodal active speaker detection and virtual cinematography for video conferencing

    Authors: Ross Cutler, Ramin Mehran, Sam Johnson, Cha Zhang, Adam Kirk, Oliver Whyte, Adarsh Kowdle

    Abstract: Active speaker detection (ASD) and virtual cinematography (VC) can significantly improve the remote user experience of a video conference by automatically panning, tilting, and zooming a video conferencing camera: users subjectively rate an expert video cinematographer's video significantly higher than unedited video. We describe a new automated ASD and VC that performs within 0.3 MOS of an expe…

    Submitted 24 May, 2022; v1 submitted 10 February, 2020; originally announced February 2020.

  4. arXiv:1811.05029

    cs.CV

    LookinGood: Enhancing Performance Capture with Real-time Neural Re-Rendering

    Authors: Ricardo Martin-Brualla, Rohit Pandey, Shuoran Yang, Pavel Pidlypenskyi, Jonathan Taylor, Julien Valentin, Sameh Khamis, Philip Davidson, Anastasia Tkach, Peter Lincoln, Adarsh Kowdle, Christoph Rhemann, Dan B Goldman, Cem Keskin, Steve Seitz, Shahram Izadi, Sean Fanello

    Abstract: Motivated by augmented and virtual reality applications such as telepresence, there has been a recent focus on real-time performance capture of humans under motion. However, given the real-time constraint, these systems often suffer from artifacts in geometry and texture such as holes and noise in the final rendering, poor lighting, and low-resolution textures. We take the novel approach to augmen…

    Submitted 12 November, 2018; originally announced November 2018.

    Comments: The supplementary video is available at http://youtu.be/Md3tdAKoLGU (to be presented at SIGGRAPH Asia 2018)

  5. arXiv:1807.08865

    cs.CV

    StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction

    Authors: Sameh Khamis, Sean Fanello, Christoph Rhemann, Adarsh Kowdle, Julien Valentin, Shahram Izadi

    Abstract: This paper presents StereoNet, the first end-to-end deep architecture for real-time stereo matching that runs at 60 fps on an NVidia Titan X, producing high-quality, edge-preserved, quantization-free disparity maps. A key insight of this paper is that the network achieves a sub-pixel matching precision that is an order of magnitude higher than that of traditional stereo matching approaches. This allows us…

    Submitted 23 July, 2018; originally announced July 2018.

    Comments: ECCV 2018

  6. arXiv:1807.06009

    cs.CV

    ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems

    Authors: Yinda Zhang, Sameh Khamis, Christoph Rhemann, Julien Valentin, Adarsh Kowdle, Vladimir Tankovich, Michael Schoenberg, Shahram Izadi, Thomas Funkhouser, Sean Fanello

    Abstract: In this paper we present ActiveStereoNet, the first deep learning solution for active stereo systems. Due to the lack of ground truth, our method is fully self-supervised, yet it produces precise depth with a subpixel precision of $1/30$th of a pixel; it does not suffer from the common over-smoothing issues; it preserves the edges; and it explicitly handles occlusions. We introduce a novel reconst…

    Submitted 16 July, 2018; originally announced July 2018.

    Comments: Accepted by ECCV 2018, Oral Presentation, Main paper + Supplementary Materials

  7. arXiv:1110.5102

    cs.CV cs.AI cs.RO

    Towards Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models

    Authors: Congcong Li, Adarsh Kowdle, Ashutosh Saxena, Tsuhan Chen

    Abstract: Scene understanding includes many related sub-tasks, such as scene categorization, depth estimation, object detection, etc. Each of these sub-tasks is often notoriously hard, and state-of-the-art classifiers already exist for many of them. These classifiers operate on the same raw image and provide correlated outputs. It is desirable to have an algorithm that can capture such correlation without r…

    Submitted 23 October, 2011; originally announced October 2011.

    Comments: 14 pages, 11 figures