
Showing 1–25 of 25 results for author: Chhatkuli, A

  1. arXiv:2505.04109  [pdf, other]

    cs.CV

    One2Any: One-Reference 6D Pose Estimation for Any Object

    Authors: Mengya Liu, Siyuan Li, Ajad Chhatkuli, Prune Truong, Luc Van Gool, Federico Tombari

    Abstract: 6D object pose estimation remains challenging for many applications due to dependencies on complete 3D models, multi-view images, or training limited to specific object categories. These requirements make generalization to novel objects difficult for which neither 3D models nor multi-view images may be available. To address this, we propose a novel method One2Any that estimates the relative 6-degr…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: accepted by CVPR 2025

    Journal ref: CVPR 2025

  2. arXiv:2409.15939  [pdf, other]

    cs.CV

    Self-supervised Shape Completion via Involution and Implicit Correspondences

    Authors: Mengya Liu, Ajad Chhatkuli, Janis Postels, Luc Van Gool, Federico Tombari

    Abstract: 3D shape completion is traditionally solved using supervised training or by distribution learning on complete shape examples. Recently, self-supervised learning approaches that do not require any complete 3D shape examples have gained more interest. In this paper, we propose a non-adversarial self-supervised approach for the shape completion task. Our first finding is that completion problems can…

    Submitted 24 September, 2024; originally announced September 2024.

    Comments: ECCV 2024

  3. arXiv:2408.08766  [pdf, other]

    cs.CV

    VF-NeRF: Learning Neural Vector Fields for Indoor Scene Reconstruction

    Authors: Albert Gassol Puigjaner, Edoardo Mello Rella, Erik Sandström, Ajad Chhatkuli, Luc Van Gool

    Abstract: Implicit surfaces via neural radiance fields (NeRF) have shown surprising accuracy in surface reconstruction. Despite their success in reconstructing richly textured surfaces, existing methods struggle with planar regions with weak textures, which account for the majority of indoor scenes. In this paper, we address indoor dense surface reconstruction by revisiting key aspects of NeRF in order to u…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 15 pages

  4. arXiv:2407.11174  [pdf, other]

    cs.CV cs.AI

    iHuman: Instant Animatable Digital Humans From Monocular Videos

    Authors: Pramish Paudel, Anubhav Khanal, Ajad Chhatkuli, Danda Pani Paudel, Jyoti Tandukar

    Abstract: Personalized 3D avatars require an animatable representation of digital humans. Doing so instantly from monocular videos offers scalability to a broad class of users and wide-scale applications. In this paper, we present a fast, simple, yet effective method for creating animatable 3D digital humans from monocular videos. Our method utilizes the efficiency of Gaussian splatting to model both 3D geome…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 15 pages, ECCV 2024

  5. arXiv:2312.15471  [pdf, other]

    cs.CV cs.RO

    Residual Learning for Image Point Descriptors

    Authors: Rashik Shrestha, Ajad Chhatkuli, Menelaos Kanakis, Luc Van Gool

    Abstract: Local image feature descriptors have had a tremendous impact on the development and application of computer vision methods. It is therefore unsurprising that significant efforts are being made for learning-based image point descriptors. However, the advantage of learned methods over handcrafted methods in real applications is subtle and more nuanced than expected. Moreover, handcrafted descriptors…

    Submitted 24 December, 2023; originally announced December 2023.

  6. arXiv:2311.17119  [pdf, other]

    cs.CV

    Continuous Pose for Monocular Cameras in Neural Implicit Representation

    Authors: Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: In this paper, we showcase the effectiveness of optimizing monocular camera poses as a continuous function of time. The camera poses are represented using an implicit neural function which maps the given time to the corresponding camera pose. The mapped camera poses are then used for the downstream tasks where joint camera pose optimization is also required. While doing so, the network parameters…

    Submitted 2 March, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  7. arXiv:2309.08416  [pdf, other]

    cs.CV

    Deformable Neural Radiance Fields using RGB and Event Cameras

    Authors: Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: Modeling Neural Radiance Fields for fast-moving deformable objects from visual data alone is a challenging problem. A major issue arises due to the high deformation and low acquisition rates. To address this problem, we propose to use event cameras that offer very fast acquisition of visual change in an asynchronous manner. In this work, we develop a novel method to model the deformable neural rad…

    Submitted 25 September, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

  8. arXiv:2207.10436  [pdf, other]

    cs.CV

    Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation

    Authors: Guolei Sun, Yun Liu, Hao Tang, Ajad Chhatkuli, Le Zhang, Luc Van Gool

    Abstract: The essence of video semantic segmentation (VSS) is how to leverage temporal information for prediction. Previous efforts are mainly devoted to developing new techniques to calculate the cross-frame affinities such as optical flow and attention. Instead, this paper contributes from a different angle by mining relations among cross-frame affinities, upon which better temporal information aggregatio…

    Submitted 21 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022

  9. arXiv:2204.06552  [pdf, other]

    cs.CV

    Neural Vector Fields for Implicit Surface Representation and Inference

    Authors: Edoardo Mello Rella, Ajad Chhatkuli, Ender Konukoglu, Luc Van Gool

    Abstract: Implicit fields have recently shown increasing success in representing and learning 3D shapes accurately. Signed distance fields and occupancy fields are decades old and still the preferred representations, both with well-studied properties, despite their restriction to closed surfaces. With neural networks, several other variations and training principles have been proposed with the goal to repre…

    Submitted 7 April, 2023; v1 submitted 13 April, 2022; originally announced April 2022.

  10. arXiv:2203.08795  [pdf, other]

    cs.CV cs.LG

    Zero Pixel Directional Boundary by Vector Transform

    Authors: Edoardo Mello Rella, Ajad Chhatkuli, Yun Liu, Ender Konukoglu, Luc Van Gool

    Abstract: Boundaries are among the primary visual cues used by human and computer vision systems. One of the key problems in boundary detection is the label representation, which typically leads to class imbalance and, as a consequence, to thick boundaries that require non-differential post-processing steps to be thinned. In this paper, we re-interpret boundaries as 1-D surfaces and formulate a one-to-one v…

    Submitted 8 September, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: Published at the Tenth International Conference on Learning Representations (ICLR 2022)

  11. arXiv:2203.03610  [pdf, other]

    cs.CV cs.LG cs.RO

    ZippyPoint: Fast Interest Point Detection, Description, and Matching through Mixed Precision Discretization

    Authors: Menelaos Kanakis, Simon Maurer, Matteo Spallanzani, Ajad Chhatkuli, Luc Van Gool

    Abstract: Efficient detection and description of geometric regions in images is a prerequisite in visual systems for localization and mapping. Such systems still rely on traditional hand-crafted methods for efficient generation of lightweight descriptors, a common limitation of the more powerful neural network models that come with high compute and specific hardware requirements. In this paper, we focus on…

    Submitted 8 April, 2023; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: Computer Vision and Pattern Recognition Workshop (CVPRW), 2023

  12. arXiv:2109.04813  [pdf, other]

    cs.CV

    TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation

    Authors: Rui Gong, Martin Danelljan, Dengxin Dai, Danda Pani Paudel, Ajad Chhatkuli, Fisher Yu, Luc Van Gool

    Abstract: Traditional domain adaptive semantic segmentation addresses the task of adapting a model to a novel target domain under limited or no additional supervision. While tackling the input domain gap, the standard domain adaptation settings assume no domain change in the output space. In semantic prediction tasks, different datasets are often labeled according to different semantic taxonomies. In many r…

    Submitted 28 July, 2022; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: Accepted by ECCV 2022

  13. arXiv:2106.03180  [pdf, other]

    cs.CV

    Vision Transformers with Hierarchical Attention

    Authors: Yun Liu, Yu-Huan Wu, Guolei Sun, Le Zhang, Ajad Chhatkuli, Luc Van Gool

    Abstract: This paper tackles the high computational/space complexity associated with Multi-Head Self-Attention (MHSA) in vanilla vision transformers. To this end, we propose Hierarchical MHSA (H-MHSA), a novel approach that computes self-attention in a hierarchical fashion. Specifically, we first divide the input image into patches as commonly done, and each patch is viewed as a token. Then, the proposed H-…

    Submitted 26 March, 2024; v1 submitted 6 June, 2021; originally announced June 2021.

    Comments: Machine Intelligence Research (MIR), DOI: 10.1007/s11633-024-1393-8

  14. arXiv:2102.06696  [pdf, other]

    cs.CV

    Efficient Conditional GAN Transfer with Knowledge Propagation across Classes

    Authors: Mohamad Shahbazi, Zhiwu Huang, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: Generative adversarial networks (GANs) have shown impressive results in both unconditional and conditional image generation. In recent literature, it is shown that pre-trained GANs, on a different dataset, can be transferred to improve the image generation from small target data. The same, however, has not been well-studied in the case of conditional GANs (cGANs), which provides new opportunitie…

    Submitted 31 March, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: The code is available at: https://github.com/mshahbazi72/cGANTransfer

  15. arXiv:2012.15680  [pdf, other]

    cs.CV

    Unsupervised Monocular Depth Reconstruction of Non-Rigid Scenes

    Authors: Ayça Takmaz, Danda Pani Paudel, Thomas Probst, Ajad Chhatkuli, Martin R. Oswald, Luc Van Gool

    Abstract: Monocular depth reconstruction of complex and dynamic scenes is a highly challenging problem. While for rigid scenes learning-based methods have been offering promising results even in unsupervised cases, there exists little to no literature addressing the same for dynamic and deformable scenes. In this work, we present an unsupervised monocular framework for dense depth estimation of dynamic scen…

    Submitted 28 October, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

  16. arXiv:2012.08278  [pdf, other]

    cs.CV

    Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation

    Authors: Rui Gong, Yuhua Chen, Danda Pani Paudel, Yawei Li, Ajad Chhatkuli, Wen Li, Dengxin Dai, Luc Van Gool

    Abstract: Open compound domain adaptation (OCDA) is a domain adaptation setting, where the target domain is modeled as a compound of multiple unknown homogeneous domains, which brings the advantage of improved generalization to unseen domains. In this work, we propose a principled meta-learning based approach to OCDA for semantic segmentation, MOCDA, by modeling the unlabeled target domain continuously. Our app…

    Submitted 15 December, 2020; originally announced December 2020.

    Comments: 18 pages, 8 figures, 8 tables

  17. arXiv:2008.12165  [pdf, other]

    cs.CV

    Learning Condition Invariant Features for Retrieval-Based Localization from 1M Images

    Authors: Janine Thoma, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: Image features for retrieval-based localization must be invariant to dynamic objects (e.g. cars) as well as seasonal and daytime changes. Such invariances are, up to some extent, learnable with existing methods using triplet-like losses, given a large number of diverse training images. However, due to the high algorithmic training complexity, there exists insufficient comparison between different…

    Submitted 8 December, 2020; v1 submitted 27 August, 2020; originally announced August 2020.

  18. arXiv:2007.02045  [pdf, other]

    cs.CV

    Self-Calibration Supported Robust Projective Structure-from-Motion

    Authors: Rui Gong, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: Typical Structure-from-Motion (SfM) pipelines rely on finding correspondences across images, recovering the projective structure of the observed scene and upgrading it to a metric frame using camera self-calibration constraints. Solving each problem is mainly carried out independently from the others. For instance, camera self-calibration generally assumes correct matches and a good projective rec…

    Submitted 4 July, 2020; originally announced July 2020.

    Comments: 21 pages, 5 figures, 2 tables

  19. Geometrically Mappable Image Features

    Authors: Janine Thoma, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: Vision-based localization of an agent in a map is an important problem in robotics and computer vision. In that context, localization by learning matchable image features is gaining popularity due to recent advances in machine learning. Features that uniquely describe the visual contents of images have a wide range of applications, including image retrieval and understanding. In this work, we prop…

    Submitted 21 March, 2020; originally announced March 2020.

    Comments: Implementation available at https://github.com/janinethoma/geometrically_mappable

    Journal ref: IEEE Robotics and Automation Letters 5, no. 2 (2020): 2062-2069

  20. arXiv:2003.07619  [pdf, other]

    cs.CV

    Unsupervised Learning of Category-Specific Symmetric 3D Keypoints from Point Sets

    Authors: Clara Fernandez-Labrador, Ajad Chhatkuli, Danda Pani Paudel, Jose J. Guerrero, Cédric Demonceaux, Luc Van Gool

    Abstract: Automatic discovery of category-specific 3D keypoints from a collection of objects of some category is a challenging problem. One reason is that not all objects in a category necessarily have the same semantic parts. The level of difficulty adds up further when objects are represented by 3D point clouds, with variations in shape and unknown coordinate frames. We define keypoints to be category-spe…

    Submitted 6 January, 2021; v1 submitted 17 March, 2020; originally announced March 2020.

  21. arXiv:1909.12034  [pdf, other]

    cs.CV

    Convex Relaxations for Consensus and Non-Minimal Problems in 3D Vision

    Authors: Thomas Probst, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: In this paper, we formulate a generic non-minimal solver using the existing tools of Polynomial Optimization Problems (POP) from computational algebraic geometry. The proposed method exploits the well-known Shor's or Lasserre's relaxations, whose theoretical aspects are also discussed. Notably, we further exploit the POP formulation of non-minimal solver also for the generic consensus maximizatio…

    Submitted 26 September, 2019; originally announced September 2019.

    Comments: Accepted to ICCV'19

  22. arXiv:1812.03795  [pdf, other]

    cs.CV

    Mapping, Localization and Path Planning for Image-based Navigation using Visual Features and Map

    Authors: Janine Thoma, Danda Pani Paudel, Ajad Chhatkuli, Thomas Probst, Luc Van Gool

    Abstract: Building on progress in feature representations for image retrieval, image-based localization has seen a surge of research interest. Image-based localization has the advantage of being inexpensive and efficient, often avoiding the use of 3D metric maps altogether. That said, the need to maintain a large number of reference images as an effective support of localization in a scene, nonetheless call…

    Submitted 11 July, 2019; v1 submitted 10 December, 2018; originally announced December 2018.

    Comments: CVPR 2019, for implementation see https://github.com/janinethoma

  23. arXiv:1808.04181  [pdf, other]

    cs.CV

    Incremental Non-Rigid Structure-from-Motion with Unknown Focal Length

    Authors: Thomas Probst, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: The perspective camera and the isometric surface prior have recently gathered increased attention for Non-Rigid Structure-from-Motion (NRSfM). Despite the recent progress, several challenges remain, particularly the computational complexity and the unknown camera focal length. In this paper we present a method for incremental Non-Rigid Structure-from-Motion (NRSfM) with the perspective camera mode…

    Submitted 13 August, 2018; originally announced August 2018.

    Comments: ECCV 2018

  24. arXiv:1807.01963  [pdf, other]

    cs.CV

    Model-free Consensus Maximization for Non-Rigid Shapes

    Authors: Thomas Probst, Ajad Chhatkuli, Danda Pani Paudel, Luc Van Gool

    Abstract: Many computer vision methods use consensus maximization to relate measurements containing outliers with the correct transformation model. In the context of rigid shapes, this is typically done using Random Sampling and Consensus (RANSAC) by estimating an analytical model that agrees with the largest number of measurements (inliers). However, small parameter models may not be always available. In t…

    Submitted 13 August, 2018; v1 submitted 5 July, 2018; originally announced July 2018.

    Comments: ECCV18

  25. arXiv:1709.05665  [pdf, other]

    cs.CV cs.RO

    Automatic Tool Landmark Detection for Stereo Vision in Robot-Assisted Retinal Surgery

    Authors: Thomas Probst, Kevis-Kokitsi Maninis, Ajad Chhatkuli, Mouloud Ourak, Emmanuel Vander Poorten, Luc Van Gool

    Abstract: Computer vision and robotics are being increasingly applied in medical interventions. Especially in interventions where extreme precision is required they could make a difference. One such application is robot-assisted retinal microsurgery. In recent works, such interventions are conducted under a stereo-microscope, and with a robot-controlled surgical tool. The complementarity of computer vision…

    Submitted 20 November, 2017; v1 submitted 17 September, 2017; originally announced September 2017.

    Comments: Accepted in Robotics and Automation Letters (RA-L). Project page: http://www.vision.ee.ethz.ch/~kmaninis/keypoints2stereo/index.html