Showing 1–6 of 6 results for author: Mathur, A N

  1. arXiv:2506.17608  [pdf, ps, other]

    cs.CV

    HIRE: Lightweight High-Resolution Image Feature Enrichment for Multimodal LLMs

    Authors: Nikitha SR, Aradhya Neeraj Mathur, Tarun Ram Menta, Rishabh Jain, Mausoom Sarkar

    Abstract: The integration of high-resolution image features in modern multimodal large language models has demonstrated significant improvements in fine-grained visual understanding tasks, achieving high performance across multiple benchmarks. Since these features are obtained from large image encoders like ViT, they come with a significant increase in computational costs due to multiple calls to these enco…

    Submitted 21 June, 2025; originally announced June 2025.

    Comments: Accepted in CVPR 2025 Workshop on What's Next in Multimodal Foundational Models

  2. arXiv:2409.06620  [pdf, other]

    cs.CV cs.GR

    MVGaussian: High-Fidelity text-to-3D Content Generation with Multi-View Guidance and Surface Densification

    Authors: Phu Pham, Aradhya N. Mathur, Ojaswa Sharma, Aniket Bera

    Abstract: The field of text-to-3D content generation has made significant progress in generating realistic 3D objects, with existing methodologies like Score Distillation Sampling (SDS) offering promising guidance. However, these methods often encounter the "Janus" problem (multi-face ambiguities) due to imprecise guidance. Additionally, while recent advancements in 3D Gaussian splatting have shown its effica…

    Submitted 10 September, 2024; originally announced September 2024.

    Comments: 13 pages, 10 figures

  3. arXiv:2409.00829  [pdf, other]

    cs.CV cs.CG cs.GR

    Curvy: A Parametric Cross-section based Surface Reconstruction

    Authors: Aradhya N. Mathur, Apoorv Khattar, Ojaswa Sharma

    Abstract: In this work, we present a novel approach for reconstructing shape point clouds using planar sparse cross-sections with the help of generative modeling. We present unique challenges pertaining to the representation and reconstruction in this problem setting. Most methods in the classical literature lack the ability to generalize based on object class and employ complex mathematical machinery to re…

    Submitted 1 September, 2024; originally announced September 2024.

  4. arXiv:2312.04806  [pdf, other]

    cs.CV

    RL Dreams: Policy Gradient Optimization for Score Distillation based 3D Generation

    Authors: Aradhya N. Mathur, Phu Pham, Aniket Bera, Ojaswa Sharma

    Abstract: 3D generation has rapidly accelerated in the past decade owing to the progress in the field of generative modeling. Score Distillation Sampling (SDS) based rendering has improved 3D asset generation to a great extent. Further, the recent work of Denoising Diffusion Policy Optimization (DDPO) demonstrates that the diffusion process is compatible with policy gradient methods and has been demonstrate…

    Submitted 7 December, 2023; originally announced December 2023.

  5. arXiv:2010.16078  [pdf, other]

    cs.CV eess.IV

    LIFI: Towards Linguistically Informed Frame Interpolation

    Authors: Aradhya Neeraj Mathur, Devansh Batra, Yaman Kumar, Rajiv Ratn Shah, Roger Zimmermann

    Abstract: In this work, we explore a new problem of frame interpolation for speech videos. Such content today constitutes the major form of online communication. We try to solve this problem by using several deep learning video generation algorithms to generate the missing frames. We also provide examples where computer vision models, despite showing high performance on conventional non-linguistic metrics, fail to…

    Submitted 2 December, 2020; v1 submitted 30 October, 2020; originally announced October 2020.

    Comments: 9 pages, 7 tables, 4 figures

  6. arXiv:2004.11702  [pdf, other]

    eess.IV cs.GR

    Multimodal Medical Volume Colorization from 2D Style

    Authors: Aradhya Neeraj Mathur, Apoorv Khattar, Ojaswa Sharma

    Abstract: Colorization involves the synthesis of colors on a target image while preserving structural content as well as the semantics of the target image. This is a well-explored problem in 2D with many state-of-the-art solutions. We propose a novel deep learning-based approach for the colorization of 3D medical volumes. Our system is capable of directly mapping the colors of a 2D photograph to a 3D MRI vo…

    Submitted 6 April, 2020; originally announced April 2020.