Showing 1–6 of 6 results for author: Tomar, S S

Searching in archive cs.
  1. arXiv:2504.14582  [pdf, other]

    cs.CV

    NTIRE 2025 Challenge on Image Super-Resolution ($\times$4): Methods and Results

    Authors: Zheng Chen, Kai Liu, Jue Gong, Jingkai Wang, Lei Sun, Zongwei Wu, Radu Timofte, Yulun Zhang, Xiangyu Kong, Xiaoxuan Yu, Hyunhee Park, Suejin Han, Hakjae Jeon, Dafeng Zhang, Hyung-Ju Chun, Donghun Ryou, Inju Ha, Bohyung Han, Lu Zhao, Yuyi Zhang, Pengyu Yan, Jiawei Hu, Pengwei Liu, Fengjun Guo, Hongyuan Yu, et al. (86 additional authors not shown)

    Abstract: This paper presents the NTIRE 2025 image super-resolution ($\times$4) challenge, one of the associated competitions of the 10th NTIRE Workshop at CVPR 2025. The challenge aims to recover high-resolution (HR) images from low-resolution (LR) counterparts generated through bicubic downsampling with a $\times$4 scaling factor. The objective is to develop effective network designs or solutions that ach…

    Submitted 28 April, 2025; v1 submitted 20 April, 2025; originally announced April 2025.

    Comments: NTIRE 2025 webpage: https://www.cvlai.net/ntire/2025. Code: https://github.com/zhengchen1999/NTIRE2025_ImageSR_x4

  2. arXiv:2406.19299  [pdf, other]

    cs.CV

    PNeRV: A Polynomial Neural Representation for Videos

    Authors: Sonam Gupta, Snehal Singh Tomar, Grigorios G Chrysos, Sukhendu Das, A. N. Rajagopalan

    Abstract: Extracting Implicit Neural Representations (INRs) on video data poses unique challenges due to the additional temporal dimension. In the context of videos, INRs have predominantly relied on a frame-only parameterization, which sacrifices the spatiotemporal continuity observed in pixel-level (spatial) representations. To mitigate this, we introduce Polynomial Neural Representation for Videos (PNeRV…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 25 pages, 17 figures, published at TMLR, Feb 2024

  3. arXiv:2312.15037  [pdf, other]

    cs.CV

    Latents2Semantics: Leveraging the Latent Space of Generative Models for Localized Style Manipulation of Face Images

    Authors: Snehal Singh Tomar, A. N. Rajagopalan

    Abstract: With the metaverse slowly becoming a reality and given the rapid pace of developments toward the creation of digital humans, the need for a principled style editing pipeline for human faces is bound to increase manifold. We cater to this need by introducing the Latents2Semantics Autoencoder (L2SAE), a Generative Autoencoder model that facilitates highly localized editing of style attributes of sev…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: Accepted as an oral paper at the AAAI-24 Workshop on AI for Digital Human

  4. arXiv:2211.11224  [pdf, other]

    cs.CV

    Exploring the Effectiveness of Mask-Guided Feature Modulation as a Mechanism for Localized Style Editing of Real Images

    Authors: Snehal Singh Tomar, Maitreya Suin, A. N. Rajagopalan

    Abstract: The success of Deep Generative Models at high-resolution image generation has led to their extensive utilization for style editing of real images. Most existing methods work on the principle of inverting real images onto their latent space, followed by determining controllable directions. Both inversion of real images and determination of controllable latent directions are computationally expensiv…

    Submitted 10 December, 2022; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: To appear as a Student Abstract Track paper in Proceedings of the AAAI Conference (2023)

  5. arXiv:2211.11066  [pdf, other]

    cs.CV

    Hybrid Transformer Based Feature Fusion for Self-Supervised Monocular Depth Estimation

    Authors: Snehal Singh Tomar, Maitreya Suin, A. N. Rajagopalan

    Abstract: With an unprecedented increase in the number of agents and systems that aim to navigate the real world using visual cues and the rising impetus for 3D Vision Models, the importance of depth estimation is hard to understate. While supervised methods remain the gold standard in the domain, the copious amount of paired stereo data required to train such models makes them impractical. Most State of th…

    Submitted 20 November, 2022; originally announced November 2022.

    Comments: Presented at the Advances in Image Manipulation Workshop at ECCV 2022

  6. arXiv:2207.01871  [pdf, other]

    cs.CV

    Latents2Segments: Disentangling the Latent Space of Generative Models for Semantic Segmentation of Face Images

    Authors: Snehal Singh Tomar, A. N. Rajagopalan

    Abstract: With the advent of an increasing number of Augmented and Virtual Reality applications that aim to perform meaningful and controlled style edits on images of human faces, the impetus for the task of parsing face images to produce accurate and fine-grained semantic segmentation maps is more than ever before. Few State of the Art (SOTA) methods which solve this problem, do so by incorporating priors…

    Submitted 6 July, 2022; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: 5 pages, 4 figures, 2 tables. The paper has already been accepted to and presented at CVPR Workshop on Computer Vision for Augmented and Virtual Reality, New Orleans, LA, 2022

    MSC Class: 68T45

    ACM Class: I.2; I.4