Showing 1–1 of 1 results for author: Lanka, V

  1. arXiv:2412.18775 [pdf, other]

    cs.CV cs.AI cs.LG

    ObitoNet: Multimodal High-Resolution Point Cloud Reconstruction

    Authors: Apoorv Thapliyal, Vinay Lanka, Swathi Baskaran

    Abstract: ObitoNet employs a Cross Attention mechanism to integrate multimodal inputs, where Vision Transformers (ViT) extract semantic features from images and a point cloud tokenizer processes geometric information using Farthest Point Sampling (FPS) and K Nearest Neighbors (KNN) for spatial structure capture. The learned multimodal features are fed into a transformer-based decoder for high-resolution poi…

    Submitted 24 December, 2024; originally announced December 2024.
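
The abstract names Farthest Point Sampling (FPS) and K Nearest Neighbors (KNN) as the building blocks of the point cloud tokenizer but gives no implementation details in this listing. The following is a minimal sketch of how those two steps are commonly combined, not the authors' code: the function names (`farthest_point_sampling`, `knn_group`, `tokenize`), the fixed starting index, and the center-relative patch coordinates are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def farthest_point_sampling(points: Sequence[Point], num_centers: int,
                            start: int = 0) -> List[int]:
    """Greedily pick indices of `num_centers` points that are mutually far apart.

    `start` is a hypothetical parameter: the index of the first center.
    """
    if not 0 < num_centers <= len(points):
        raise ValueError("num_centers must be between 1 and len(points)")
    chosen = [start]
    # min_dist[i] is the distance from point i to its nearest chosen center.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < num_centers:
        # The next center is the point farthest from all centers chosen so far.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def knn_group(points: Sequence[Point], center_idx: Sequence[int],
              k: int) -> List[List[int]]:
    """For each center, return the indices of its k nearest points (center included)."""
    groups = []
    for c in center_idx:
        order = sorted(range(len(points)),
                       key=lambda i: math.dist(points[i], points[c]))
        groups.append(order[:k])
    return groups


def tokenize(points: Sequence[Point], num_centers: int,
             k: int) -> Tuple[List[int], List[List[Point]]]:
    """Split a point cloud into local patches: FPS picks centers, KNN fills each patch.

    Assumption: each patch is expressed relative to its center, a common
    normalization step that the abstract does not confirm.
    """
    centers = farthest_point_sampling(points, num_centers)
    groups = knn_group(points, centers, k)
    patches = [
        [tuple(a - b for a, b in zip(points[i], points[c])) for i in group]
        for c, group in zip(centers, groups)
    ]
    return centers, patches


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]
    centers, patches = tokenize(cloud, num_centers=2, k=2)
    print(centers)
    print(patches)
```

This pure-Python version runs in O(N · num_centers) for sampling and O(N log N) per center for grouping, which is fine for illustration; a practical tokenizer would typically use a vectorized or GPU implementation.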