
Showing 1–50 of 65 results for author: Veeraraghavan, A

  1. Fit Pixels, Get Labels: Meta-learned Implicit Networks for Image Segmentation

    Authors: Kushal Vyas, Ashok Veeraraghavan, Guha Balakrishnan

    Abstract: Implicit neural representations (INRs) have achieved remarkable successes in learning expressive yet compact signal representations. However, they are not naturally amenable to predictive tasks such as segmentation, where they must learn semantic structures over a distribution of signals. In this study, we introduce MetaSeg, a meta-learning framework to train INRs for medical image segmentation. M…

    Submitted 5 October, 2025; originally announced October 2025.

    Comments: MICCAI 2025 (oral). Final peer-reviewed copy accessible at publisher DOI https://link.springer.com/chapter/10.1007/978-3-032-04947-6_19 . Project page, https://kushalvyas.github.io/metaseg.html

  2. arXiv:2509.22240  [pdf, ps, other]

    eess.IV cs.CV cs.LG stat.AP stat.ML

    COMPASS: Robust Feature Conformal Prediction for Medical Segmentation Metrics

    Authors: Matt Y. Cheung, Ashok Veeraraghavan, Guha Balakrishnan

    Abstract: In clinical applications, the utility of segmentation models is often based on the accuracy of derived downstream metrics such as organ size, rather than by the pixel-level accuracy of the segmentation masks themselves. Thus, uncertainty quantification for such metrics is crucial for decision-making. Conformal prediction (CP) is a popular framework to derive such principled uncertainty guarantees,…

    Submitted 26 September, 2025; originally announced September 2025.

  3. arXiv:2508.18389  [pdf, ps, other]

    cs.CV

    FastAvatar: Instant 3D Gaussian Splatting for Faces from Single Unconstrained Poses

    Authors: Hao Liang, Zhixuan Ge, Ashish Tiwari, Soumendu Majee, G. M. Dilshan Godaliyadda, Ashok Veeraraghavan, Guha Balakrishnan

    Abstract: We present FastAvatar, a pose-invariant, feed-forward framework that can generate a 3D Gaussian Splatting (3DGS) model from a single face image from an arbitrary pose in near-instant time (<10ms). FastAvatar uses a novel encoder-decoder neural network design to achieve both fast fitting and identity preservation regardless of input pose. First, FastAvatar constructs a 3DGS face "template" model…

    Submitted 25 August, 2025; originally announced August 2025.

    Comments: 11 pages, 5 figures

  4. Post-Hurricane Debris Segmentation Using Fine-Tuned Foundational Vision Models

    Authors: Kooshan Amini, Yuhao Liu, Jamie Ellen Padgett, Guha Balakrishnan, Ashok Veeraraghavan

    Abstract: Timely and accurate detection of hurricane debris is critical for effective disaster response and community resilience. While post-disaster aerial imagery is readily available, robust debris segmentation solutions applicable across multiple disaster regions remain limited. Developing a generalized solution is challenging due to varying environmental and imaging conditions that alter debris' visual…

    Submitted 18 April, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

    Comments: 12 pages, 8 figures

  5. arXiv:2503.22676  [pdf, other]

    cs.CV

    TranSplat: Lighting-Consistent Cross-Scene Object Transfer with 3D Gaussian Splatting

    Authors: Tony Yu, Yanlin Jin, Ashok Veeraraghavan, Akshat Dave, Guha Balakrishnan

    Abstract: We present TranSplat, a 3D scene rendering algorithm that enables realistic cross-scene object transfer (from a source to a target scene) based on the Gaussian Splatting framework. Our approach addresses two critical challenges: (1) precise 3D object extraction from the source scene, and (2) faithful relighting of the transferred object in the target scene without explicit material property estima…

    Submitted 7 May, 2025; v1 submitted 28 March, 2025; originally announced March 2025.

  6. arXiv:2502.02771  [pdf, other]

    physics.med-ph cs.CV cs.LG eess.IV stat.AP

    When are Diffusion Priors Helpful in Sparse Reconstruction? A Study with Sparse-view CT

    Authors: Matt Y. Cheung, Sophia Zorek, Tucker J. Netherton, Laurence E. Court, Sadeer Al-Kindi, Ashok Veeraraghavan, Guha Balakrishnan

    Abstract: Diffusion models demonstrate state-of-the-art performance on image generation, and are gaining traction for sparse medical image reconstruction tasks. However, compared to classical reconstruction algorithms relying on simple analytical priors, diffusion models have the dangerous property of producing realistic looking results even when incorrect, particularly with few observations. We inve…

    Submitted 4 February, 2025; originally announced February 2025.

    Comments: Accepted at IEEE ISBI 2025, 5 pages, 2 figures, 1 table

  7. arXiv:2501.18361  [pdf, other]

    cs.CV

    Video-based Surgical Tool-tip and Keypoint Tracking using Multi-frame Context-driven Deep Learning Models

    Authors: Bhargav Ghanekar, Lianne R. Johnson, Jacob L. Laughlin, Marcia K. O'Malley, Ashok Veeraraghavan

    Abstract: Automated tracking of surgical tool keypoints in robotic surgery videos is an essential task for various downstream use cases such as skill assessment, expertise assessment, and the delineation of safety zones. In recent years, the explosion of deep learning for vision applications has led to many works in surgical instrument segmentation, while lesser focus has been on tracking specific tool keyp…

    Submitted 30 January, 2025; originally announced January 2025.

  8. arXiv:2410.05263  [pdf, other]

    stat.ML cs.AI cs.LG math.ST stat.ME

    Regression Conformal Prediction under Bias

    Authors: Matt Y. Cheung, Tucker J. Netherton, Laurence E. Court, Ashok Veeraraghavan, Guha Balakrishnan

    Abstract: Uncertainty quantification is crucial to account for the imperfect predictions of machine learning algorithms for high-impact applications. Conformal prediction (CP) is a powerful framework for uncertainty quantification that generates calibrated prediction intervals with valid coverage. In this work, we study how CP intervals are affected by bias - the systematic deviation of a prediction from gr…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: 17 pages, 6 figures, code available at: https://github.com/matthewyccheung/conformal-metric

  9. arXiv:2410.00381  [pdf, ps, other]

    cs.LG cs.AI

    Downscaling Extreme Precipitation with Wasserstein Regularized Diffusion

    Authors: Yuhao Liu, James Doss-Gollin, Qiushi Dai, Ashok Veeraraghavan, Guha Balakrishnan

    Abstract: Understanding the risks posed by extreme rainfall events requires analysis of precipitation fields with high resolution (to assess localized hazards) and extensive historical coverage (to capture sufficient examples of rare occurrences). Radar and mesonet networks provide precipitation fields at 1 km resolution but with limited historical and geographical coverage, while gauge-based records and re…

    Submitted 12 August, 2025; v1 submitted 1 October, 2024; originally announced October 2024.

    Comments: 18 pages, 10 figures, 3 tables

  10. arXiv:2409.09566  [pdf, other]

    cs.CV cs.AI

    Learning Transferable Features for Implicit Neural Representations

    Authors: Kushal Vyas, Ahmed Imtiaz Humayun, Aniket Dashpute, Richard G. Baraniuk, Ashok Veeraraghavan, Guha Balakrishnan

    Abstract: Implicit neural representations (INRs) have demonstrated success in a variety of applications, including inverse problems and neural rendering. An INR is typically trained to capture one signal of interest, resulting in learned neural features that are highly attuned to that signal. Assumed to be less generalizable, we explore the aspect of transferability of such learned neural features for fitti…

    Submitted 9 January, 2025; v1 submitted 14 September, 2024; originally announced September 2024.

    Comments: Project Website: https://kushalvyas.github.io/strainer.html

  11. arXiv:2408.15118  [pdf, other]

    eess.IV cs.CV

    DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays

    Authors: Yiran Sun, Hana Baroudi, Tucker Netherton, Laurence Court, Osama Mawlawi, Ashok Veeraraghavan, Guha Balakrishnan

    Abstract: Computed Tomography (CT) scans are the standard-of-care for the visualization and diagnosis of many clinical ailments, and are needed for the treatment planning of external beam radiotherapy. Unfortunately, the availability of CT scanners in low- and mid-resource settings is highly variable. Planar x-ray radiography units, in comparison, are far more prevalent, but can only provide limited 2D obse…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: 11 pages, 9 figures

  12. arXiv:2406.10212  [pdf, other]

    cs.CV cs.GR

    NeST: Neural Stress Tensor Tomography by leveraging 3D Photoelasticity

    Authors: Akshat Dave, Tianyi Zhang, Aaron Young, Ramesh Raskar, Wolfgang Heidrich, Ashok Veeraraghavan

    Abstract: Photoelasticity enables full-field stress analysis in transparent objects through stress-induced birefringence. Existing techniques are limited to 2D slices and require destructively slicing the object. Recovering the internal 3D stress distribution of the entire object is challenging as it involves solving a tensor tomography problem and handling phase wrapping ambiguities. We introduce NeST, an…

    Submitted 24 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: Project webpage: https://akshatdave.github.io/nest

  13. arXiv:2406.00859  [pdf, other]

    eess.IV cs.CV

    Streaming quanta sensors for online, high-performance imaging and vision

    Authors: Tianyi Zhang, Matthew Dutson, Vivek Boominathan, Mohit Gupta, Ashok Veeraraghavan

    Abstract: Recently quanta image sensors (QIS) -- ultra-fast, zero-read-noise binary image sensors -- have demonstrated remarkable imaging capabilities in many challenging scenarios. Despite their potential, the adoption of these sensors is severely hampered by (a) high data rates and (b) the need for new computational pipelines to handle the unconventional raw data. We introduce a simple, low-bandwidth comp…

    Submitted 2 June, 2024; originally announced June 2024.

  14. arXiv:2404.15274  [pdf, ps, other]

    cs.LG cs.CV eess.IV physics.med-ph

    Metric-Guided Conformal Bounds for Probabilistic Image Reconstruction

    Authors: Matt Y Cheung, Tucker J Netherton, Laurence E Court, Ashok Veeraraghavan, Guha Balakrishnan

    Abstract: Modern deep learning reconstruction algorithms generate impressively realistic scans from sparse inputs, but can often produce significant inaccuracies. This makes it difficult to provide statistically guaranteed claims about the true state of a subject from scans reconstructed by these algorithms. In this study, we propose a framework for computing provably valid prediction bounds on claims deriv…

    Submitted 26 September, 2025; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted as Long Oral at UNSURE @ MICCAI 2025. 11 pages, 4 figures, 1 table, 2 algorithms. Code available at https://github.com/matthewyccheung/conformal-metric. Previously titled "Metric-guided Image Reconstruction Bounds via Conformal Prediction"

  15. arXiv:2404.07985  [pdf, other]

    cs.CV eess.IV

    WaveMo: Learning Wavefront Modulations to See Through Scattering

    Authors: Mingyang Xie, Haiyun Guo, Brandon Y. Feng, Lingbo Jin, Ashok Veeraraghavan, Christopher A. Metzler

    Abstract: Imaging through scattering media is a fundamental and pervasive challenge in fields ranging from medical diagnostics to astronomy. A promising strategy to overcome this challenge is wavefront modulation, which induces measurement diversity during image acquisition. Despite its importance, designing optimal wavefront modulations to image through scattering remains under-explored. This paper introdu…

    Submitted 11 April, 2024; originally announced April 2024.

  16. arXiv:2403.13199  [pdf, other]

    cs.CV cs.DC

    DecentNeRFs: Decentralized Neural Radiance Fields from Crowdsourced Images

    Authors: Zaid Tasneem, Akshat Dave, Abhishek Singh, Kushagra Tiwary, Praneeth Vepakomma, Ashok Veeraraghavan, Ramesh Raskar

    Abstract: Neural radiance fields (NeRFs) show potential for transforming images captured worldwide into immersive 3D visual experiences. However, most of this captured visual data remains siloed in our camera rolls as these images contain personal details. Even if made public, the problem of learning 3D representations of billions of scenes captured daily in a centralized manner is computationally intractab…

    Submitted 28 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  17. arXiv:2402.18102  [pdf, other]

    eess.IV cs.CV

    Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging

    Authors: Bhargav Ghanekar, Salman Siddique Khan, Pranav Sharma, Shreyas Singh, Vivek Boominathan, Kaushik Mitra, Ashok Veeraraghavan

    Abstract: Passive, compact, single-shot 3D sensing is useful in many application areas such as microscopy, medical imaging, surgical navigation, and autonomous driving where form factor, time, and power constraints can exist. Obtaining RGB-D scene information over a short imaging distance, in an ultra-compact form factor, and in a passive, snapshot manner is challenging. Dual-pixel (DP) sensors are a potent…

    Submitted 30 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

  18. arXiv:2312.04679  [pdf, other]

    eess.IV cs.CV

    ConVRT: Consistent Video Restoration Through Turbulence with Test-time Optimization of Neural Video Representations

    Authors: Haoming Cai, Jingxi Chen, Brandon Y. Feng, Weiyun Jiang, Mingyang Xie, Kevin Zhang, Ashok Veeraraghavan, Christopher Metzler

    Abstract: Atmospheric turbulence presents a significant challenge in long-range imaging. Current restoration algorithms often struggle with temporal inconsistency, as well as limited generalization ability across varying turbulence levels and scene content different than the training data. To tackle these issues, we introduce a self-supervised method, Consistent Video Restoration through Turbulence (ConVRT)…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: https://convrt-2024.github.io/

  19. arXiv:2311.09652  [pdf, ps, other]

    cs.CV cs.GR

    Event-based Motion-Robust Accurate Shape Estimation for Mixed Reflectance Scenes

    Authors: Aniket Dashpute, Jiazhang Wang, James Taylor, Oliver Cossairt, Ashok Veeraraghavan, Florian Willomitzer

    Abstract: Event-based structured light systems have recently been introduced as an exciting alternative to conventional frame-based triangulation systems for the 3D measurements of diffuse surfaces. Important benefits include the fast capture speed and the high dynamic range provided by the event camera - albeit at the cost of lower data quality. So far, both low-accuracy event-based and high-accuracy frame…

    Submitted 10 June, 2025; v1 submitted 16 November, 2023; originally announced November 2023.

  20. ISLAND: Interpolating Land Surface Temperature using land cover

    Authors: Yuhao Liu, Pranavesh Panakkal, Sylvia Dee, Guha Balakrishnan, Jamie Padgett, Ashok Veeraraghavan

    Abstract: Cloud occlusion is a common problem in the field of remote sensing, particularly for retrieving Land Surface Temperature (LST). Remote sensing thermal instruments onboard operational satellites are supposed to enable frequent and high-resolution observations over land; unfortunately, clouds adversely affect thermal signals by blocking outgoing longwave radiation emission from the Earth's surface,…

    Submitted 29 August, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: 21 pages, 11 figures

    Journal ref: Remote Sensing Applications: Society and Environment, Volume 36, 2024, 101332, ISSN 2352-9385

  21. arXiv:2308.02100  [pdf, other]

    eess.IV cs.CV

    CT Reconstruction from Few Planar X-rays with Application towards Low-resource Radiotherapy

    Authors: Yiran Sun, Tucker Netherton, Laurence Court, Ashok Veeraraghavan, Guha Balakrishnan

    Abstract: CT scans are the standard-of-care for many clinical ailments, and are needed for treatments like external beam radiotherapy. Unfortunately, CT scanners are rare in low and mid-resource settings due to their costs. Planar X-ray radiography units, in comparison, are far more prevalent, but can only provide limited 2D observations of the 3D anatomy. In this work, we propose a method to generate CT vo…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: 10 pages, 5 figures

  22. arXiv:2308.00622  [pdf, other]

    cs.CV

    NeRT: Implicit Neural Representations for General Unsupervised Turbulence Mitigation

    Authors: Weiyun Jiang, Yuhao Liu, Vivek Boominathan, Ashok Veeraraghavan

    Abstract: The atmospheric and water turbulence mitigation problems have emerged as challenging inverse problems in computer vision and optics communities over the years. However, current methods either rely heavily on the quality of the training dataset or fail to generalize over various scenarios, such as static scenes, dynamic scenes, and text reconstructions. We propose a general implicit neural represen…

    Submitted 1 April, 2024; v1 submitted 1 August, 2023; originally announced August 2023.

  23. arXiv:2307.11385  [pdf, other]

    physics.optics physics.app-ph

    Broadband Thermal Imaging using Meta-Optics

    Authors: Luocheng Huang, Zheyi Han, Anna Wirth-Singh, Vishwanath Saragadam, Saswata Mukherjee, Johannes E. Fröch, Quentin A. A. Tanguy, Joshua Rollag, Ricky Gibson, Joshua R. Hendrickson, Phillip W. C. Hon, Orrin Kigner, Zachary Coppens, Karl F. Böhringer, Ashok Veeraraghavan, Arka Majumdar

    Abstract: Subwavelength diffractive optics known as meta-optics have demonstrated the potential to significantly miniaturize imaging systems. However, despite impressive demonstrations, most meta-optical imaging systems suffer from strong chromatic aberrations, limiting their utilities. Here, we employ inverse-design to create broadband meta-optics operating in the long-wave infrared (LWIR) regime (8 - 12…

    Submitted 5 September, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

    Comments: 28 pages, 12 figures

    MSC Class: 78A10

  24. arXiv:2304.01308  [pdf, other]

    eess.IV cs.CV

    Role of Transients in Two-Bounce Non-Line-of-Sight Imaging

    Authors: Siddharth Somasundaram, Akshat Dave, Connor Henley, Ashok Veeraraghavan, Ramesh Raskar

    Abstract: The goal of non-line-of-sight (NLOS) imaging is to image objects occluded from the camera's field of view using multiply scattered light. Recent works have demonstrated the feasibility of two-bounce (2B) NLOS imaging by scanning a laser and measuring cast shadows of occluded objects in scenes with two relay surfaces. In this work, we study the role of time-of-flight (ToF) measurements, i.e. transie…

    Submitted 3 April, 2023; originally announced April 2023.

  25. arXiv:2304.00696  [pdf, other]

    cs.CV

    Thermal Spread Functions (TSF): Physics-guided Material Classification

    Authors: Aniket Dashpute, Vishwanath Saragadam, Emma Alexander, Florian Willomitzer, Aggelos Katsaggelos, Ashok Veeraraghavan, Oliver Cossairt

    Abstract: Robust and non-destructive material classification is a challenging but crucial first-step in numerous vision applications. We propose a physics-guided material classification framework that relies on thermal properties of the object. Our key observation is that the rate of heating and cooling of an object depends on the unique intrinsic properties of the material, namely the emissivity and diffus…

    Submitted 2 April, 2023; originally announced April 2023.

  26. arXiv:2301.05187  [pdf, other]

    cs.CV cs.GR eess.IV

    WIRE: Wavelet Implicit Neural Representations

    Authors: Vishwanath Saragadam, Daniel LeJeune, Jasper Tan, Guha Balakrishnan, Ashok Veeraraghavan, Richard G. Baraniuk

    Abstract: Implicit neural representations (INRs) have recently advanced numerous vision-related areas. INR performance depends strongly on the choice of the nonlinear activation function employed in its multilayer perceptron (MLP) network. A wide range of nonlinearities have been explored, but, unfortunately, current INRs designed to have high accuracy also suffer from poor robustness (to signal noise, para…

    Submitted 5 January, 2023; originally announced January 2023.

  27. arXiv:2212.06345  [pdf, other]

    physics.optics cs.CV

    Foveated Thermal Computational Imaging in the Wild Using All-Silicon Meta-Optics

    Authors: Vishwanath Saragadam, Zheyi Han, Vivek Boominathan, Luocheng Huang, Shiyu Tan, Johannes E. Fröch, Karl F. Böhringer, Richard G. Baraniuk, Arka Majumdar, Ashok Veeraraghavan

    Abstract: Foveated imaging provides a better tradeoff between situational awareness (field of view) and resolution and is critical in long-wavelength infrared regimes because of the size, weight, power, and cost of thermal sensors. We demonstrate computational foveated imaging by exploiting the ability of a meta-optical frontend to discriminate between different polarization states and a computational backe…

    Submitted 12 December, 2022; originally announced December 2022.

  28. arXiv:2212.04531  [pdf, other]

    cs.CV cs.AI

    ORCa: Glossy Objects as Radiance Field Cameras

    Authors: Kushagra Tiwary, Akshat Dave, Nikhil Behari, Tzofi Klinghoffer, Ashok Veeraraghavan, Ramesh Raskar

    Abstract: Reflections on glossy objects contain valuable and hidden information about the surrounding environment. By converting these objects into cameras, we can unlock exciting applications, including imaging beyond the camera's field-of-view and from seemingly impossible vantage points, e.g. from reflections on the human eye. However, this task is challenging because reflections depend jointly on object…

    Submitted 12 December, 2022; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: for more information, see https://ktiwary2.github.io/objectsascam/

  29. arXiv:2207.00945  [pdf, other]

    eess.IV cs.CV

    PS$^2$F: Polarized Spiral Point Spread Function for Single-Shot 3D Sensing

    Authors: Bhargav Ghanekar, Vishwanath Saragadam, Dushyant Mehra, Anna-Karin Gustavsson, Aswin Sankaranarayanan, Ashok Veeraraghavan

    Abstract: We propose a compact snapshot monocular depth estimation technique that relies on an engineered point spread function (PSF). Traditional approaches used in microscopic super-resolution imaging such as the Double-Helix PSF (DHPSF) are ill-suited for scenes that are more complex than a sparse set of point light sources. We show, using the Cramér-Rao lower bound, that separating the two lobes of the…

    Submitted 4 August, 2022; v1 submitted 2 July, 2022; originally announced July 2022.

    Comments: 12 pages, 12 figures

  30. arXiv:2206.08141  [pdf]

    cs.AR

    i-FlatCam: A 253 FPS, 91.49 $μ$J/Frame Ultra-Compact Intelligent Lensless Camera for Real-Time and Efficient Eye Tracking in VR/AR

    Authors: Yang Zhao, Ziyun Li, Yonggan Fu, Yongan Zhang, Chaojian Li, Cheng Wan, Haoran You, Shang Wu, Xu Ouyang, Vivek Boominathan, Ashok Veeraraghavan, Yingyan Celine Lin

    Abstract: We present a first-of-its-kind ultra-compact intelligent camera system, dubbed i-FlatCam, including a lensless camera with a computational (Comp.) chip. It highlights (1) a predict-then-focus eye tracking pipeline for boosted efficiency without compromising the accuracy, (2) a unified compression scheme for single-chip processing and improved frame rate per second (FPS), and (3) dedicated intra-ch…

    Submitted 28 March, 2025; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: Accepted by VLSI 2022

  31. arXiv:2206.03984  [pdf, other]

    math.OC

    Distributed Generalized Wirtinger Flow for Interferometric Imaging on Networks

    Authors: Sean M. Farrell, Ashok Veeraraghavan, Ashutosh Sabharwal, César A. Uribe

    Abstract: We study the problem of decentralized interferometric imaging over networks, where agents have access to a subset of local radar measurements and can compute pair-wise correlations with their neighbors. We propose a primal-dual distributed algorithm named Distributed Generalized Wirtinger Flow (DGWF). We use the theory of low rank matrix recovery to show when the interferometric imaging problem sa…

    Submitted 8 June, 2022; originally announced June 2022.

    Comments: 6 pages, 3 figures, accepted to IFAC 2022 Conference on Networked Systems (NecSys22)

  32. EyeCoD: Eye Tracking System Acceleration via FlatCam-based Algorithm & Accelerator Co-Design

    Authors: Haoran You, Cheng Wan, Yang Zhao, Zhongzhi Yu, Yonggan Fu, Jiayi Yuan, Shang Wu, Shunyao Zhang, Yongan Zhang, Chaojian Li, Vivek Boominathan, Ashok Veeraraghavan, Ziyun Li, Yingyan Celine Lin

    Abstract: Eye tracking has become an essential human-machine interaction modality for providing immersive experience in numerous virtual and augmented reality (VR/AR) applications desiring high throughput (e.g., 240 FPS), small-form, and enhanced visual privacy. However, existing eye tracking systems are still limited by their: (1) large form-factor largely due to the adopted bulky lens-based cameras; and (…

    Submitted 2 March, 2025; v1 submitted 2 June, 2022; originally announced June 2022.

    Comments: Accepted by ISCA 2022; Also selected as an IEEE Micro's Top Pick of 2023

  33. arXiv:2204.03145  [pdf, other]

    stat.AP cs.LG stat.ML

    DeepTensor: Low-Rank Tensor Decomposition with Deep Network Priors

    Authors: Vishwanath Saragadam, Randall Balestriero, Ashok Veeraraghavan, Richard G. Baraniuk

    Abstract: DeepTensor is a computationally efficient framework for low-rank decomposition of matrices and tensors using deep generative networks. We decompose a tensor as the product of low-rank tensor factors (e.g., a matrix as the outer product of two vectors), where each low-rank tensor is generated by a deep network (DN) that is trained in a self-supervised manner to minimize the mean-squared approximati…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: 14 pages

  34. arXiv:2203.13458  [pdf, other]

    cs.CV cs.GR

    PANDORA: Polarization-Aided Neural Decomposition Of Radiance

    Authors: Akshat Dave, Yongyi Zhao, Ashok Veeraraghavan

    Abstract: Reconstructing an object's geometry and appearance from multiple images, also known as inverse rendering, is a fundamental problem in computer graphics and vision. Inverse rendering is inherently ill-posed because the captured image is an intricate function of unknown lighting conditions, material properties and scene geometry. Recent progress in representing scene properties as coordinate-based n…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: Project webpage: https://akshatdave.github.io/pandora

  35. arXiv:2202.03532  [pdf, other]

    cs.CV

    MINER: Multiscale Implicit Neural Representations

    Authors: Vishwanath Saragadam, Jasper Tan, Guha Balakrishnan, Richard G. Baraniuk, Ashok Veeraraghavan

    Abstract: We introduce a new neural signal model designed for efficient high-resolution representation of large-scale signals. The key innovation in our multiscale implicit neural representation (MINER) is an internal representation via a Laplacian pyramid, which provides a sparse multiscale decomposition of the signal that captures orthogonal parts of the signal across scales. We leverage the advantages of…

    Submitted 17 July, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: 14 pages, accepted to ECCV 2022

  36. arXiv:2110.07218  [pdf, other]

    physics.optics physics.bio-ph

    Deep-3D Microscope: 3D volumetric microscopy of thick scattering samples using a wide-field microscope and machine learning

    Authors: Bowen Li, Shiyu Tan, Jiuyang Dong, Xiaocong Lian, Yongbing Zhang, Xiangyang Ji, Ashok Veeraraghavan

    Abstract: Confocal microscopy is the standard approach for obtaining volumetric images of a sample with high axial and lateral resolution, especially when dealing with scattering samples. Unfortunately, a confocal microscope is quite expensive compared to traditional microscopes. In addition, the point scanning in a confocal leads to slow imaging speed and photobleaching due to the high dose of laser energy…

    Submitted 14 October, 2021; originally announced October 2021.

  37. arXiv:2108.07973  [pdf, other]

    eess.IV cs.CV

    Thermal Image Processing via Physics-Inspired Deep Networks

    Authors: Vishwanath Saragadam, Akshat Dave, Ashok Veeraraghavan, Richard Baraniuk

    Abstract: We introduce DeepIR, a new thermal image processing framework that combines physically accurate sensor modeling with deep network-based image representation. Our key enabling observations are that the images captured by thermal sensors can be factored into slowly changing, scene-independent sensor non-uniformities (that can be accurately modeled using physics) and a scene-specific radiance flux (t…

    Submitted 25 August, 2021; v1 submitted 18 August, 2021; originally announced August 2021.

    Comments: Accepted to 2nd ICCV workshop on Learning for Computational Imaging (LCI)

  38. arXiv:2104.04641  [pdf, other]

    cs.CV eess.IV physics.optics

    CodedStereo: Learned Phase Masks for Large Depth-of-field Stereo

    Authors: Shiyu Tan, Yicheng Wu, Shoou-I Yu, Ashok Veeraraghavan

    Abstract: Conventional stereo suffers from a fundamental trade-off between imaging volume and signal-to-noise ratio (SNR) -- due to the conflicting impact of aperture size on both these variables. Inspired by the extended depth of field cameras, we propose a novel end-to-end learning-based technique to overcome this limitation, by introducing a phase mask at the aperture plane of the cameras in a stereo ima…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: Accepted to CVPR 2021 as an oral presentation

  39. arXiv:2101.11680  [pdf, other]

    eess.IV physics.optics

    High Resolution, Deep Imaging Using Confocal Time-of-flight Diffuse Optical Tomography

    Authors: Yongyi Zhao, Ankit Raghuram, Hyun K. Kim, Andreas H. Hielscher, Jacob T. Robinson, Ashok Veeraraghavan

    Abstract: Light scattering by tissue severely limits how deep beneath the surface one can image, and the spatial resolution one can obtain from these images. Diffuse optical tomography (DOT) is one of the most powerful techniques for imaging deep within tissue -- well beyond the conventional $\sim$10-15 mean scattering lengths tolerated by ballistic imaging techniques such as confocal and two-photon microsc…

    Submitted 27 May, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

    Comments: The updated version includes edits made to our paper in response to suggestions from reviewers. These changes include: updated 3D image reconstruction results, additional comments on prior work, and further explanations of the linear model. In addition, we made a correction to figure 9, relabeling the x-axis to the correct scale. Finally, we also updated our acknowledgements

  40. arXiv:2012.14495  [pdf, other]

    eess.IV cs.CV cs.GR cs.LG

    SASSI -- Super-Pixelated Adaptive Spatio-Spectral Imaging

    Authors: Vishwanath Saragadam, Michael DeZeeuw, Richard Baraniuk, Ashok Veeraraghavan, Aswin Sankaranarayanan

    Abstract: We introduce a novel video-rate hyperspectral imager with high spatial and temporal resolutions. Our key hypothesis is that spectral profiles of pixels in a super-pixel of an oversegmented image tend to be very similar. Hence, a scene-adaptive spatial sampling of a hyperspectral scene, guided by its super-pixel segmented image, is capable of obtaining high-quality reconstructions. To achieve thi…

    Submitted 28 December, 2020; originally announced December 2020.

  41. arXiv:2011.12485  [pdf, other]

    eess.IV cs.CV

    How to Train Neural Networks for Flare Removal

    Authors: Yicheng Wu, Qiurui He, Tianfan Xue, Rahul Garg, Jiawen Chen, Ashok Veeraraghavan, Jonathan T. Barron

    Abstract: When a camera is pointed at a strong light source, the resulting photograph may contain lens flare artifacts. Flares appear in a wide variety of patterns (halos, streaks, color bleeding, haze, etc.) and this diversity in appearance makes flare removal challenging. Existing analytical solutions make strong assumptions about the artifact's geometry or brightness, and therefore only work well on a sm…

    Submitted 7 October, 2021; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: A new version of the paper has been uploaded

  42. arXiv:2010.15440  [pdf, other]

    eess.IV cs.CV cs.LG

    FlatNet: Towards Photorealistic Scene Reconstruction from Lensless Measurements

    Authors: Salman S. Khan, Varun Sundar, Vivek Boominathan, Ashok Veeraraghavan, Kaushik Mitra

    Abstract: Lensless imaging has emerged as a potential solution towards realizing ultra-miniature cameras by eschewing the bulky lens in a traditional camera. Without a focusing lens, the lensless cameras rely on computational algorithms to recover the scenes from multiplexed measurements. However, the current iterative-optimization-based reconstruction algorithms produce noisier and perceptually poorer imag…

    Submitted 29 October, 2020; originally announced October 2020.

    Comments: Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2020. Supplementary material attached. For project website, see https://siddiquesalman.github.io/flatnet/

  43. arXiv:2010.07770  [pdf, other]

    eess.IV cs.LG

    The Benefit of Distraction: Denoising Remote Vitals Measurements using Inverse Attention

    Authors: Ewa Nowara, Daniel McDuff, Ashok Veeraraghavan

    Abstract: Attention is a powerful concept in computer vision. End-to-end networks that learn to focus selectively on regions of an image or video often perform strongly. However, other image regions, while not necessarily containing the signal of interest, may contain useful context. We present an approach that exploits the idea that statistics of noise may be shared between the regions that contain the sig…

    Submitted 14 October, 2020; originally announced October 2020.

  44. arXiv:1811.07567  [pdf, other]

    cs.CV

    Fine-grained Classification using Heterogeneous Web Data and Auxiliary Categories

    Authors: Li Niu, Ashok Veeraraghavan, Ashu Sabharwal

    Abstract: Fine-grained classification remains a very challenging problem, because of the absence of well-labeled training data caused by the high cost of annotating a large number of fine-grained categories. In the extreme case, given a set of test categories without any well-labeled training data, the majority of existing works can be grouped into the following two research directions: 1) crawl noisy label…

    Submitted 19 November, 2018; originally announced November 2018.

  45. arXiv:1806.09228  [pdf, other]

    cs.LG cs.CV stat.ML

    Deep $k$-Means: Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions

    Authors: Junru Wu, Yue Wang, Zhenyu Wu, Zhangyang Wang, Ashok Veeraraghavan, Yingyan Lin

    Abstract: The current trend of pushing CNNs deeper with convolutions has created a pressing demand to achieve higher compression gains on CNNs where convolutions dominate the computation and parameter amount (e.g., GoogLeNet, ResNet and Wide ResNet). Further, the high energy consumption of convolutions limits their deployment on mobile devices. To this end, we proposed a simple yet effective scheme for compre…

    Submitted 24 June, 2018; originally announced June 2018.

    Comments: Accepted by ICML 2018

  46. arXiv:1806.07437  [pdf, other]

    physics.ins-det

    Signal Processing Based Pile-up Compensation for Gated Single-Photon Avalanche Diodes

    Authors: Adithya K. Pediredla, Aswin C. Sankaranarayanan, Mauro Buttafava, Alberto Tosi, Ashok Veeraraghavan

    Abstract: Single-photon avalanche diode (SPAD) based transient imaging suffers from an aberration called pile-up. When multiple photons arrive within a single repetition period of the illuminating laser, the SPAD records only the arrival of the first photon; this leads to a bias in the recorded light transient wherein the transient response at later time-instants is under-estimated. An unfortunate conseque…

    Submitted 14 June, 2018; originally announced June 2018.

    Comments: 17 pages, 11 figures

  47. arXiv:1805.06374  [pdf, other]

    cs.CV

    Fast Retinomorphic Event Stream for Video Recognition and Reinforcement Learning

    Authors: Wanjia Liu, Huaijin Chen, Rishab Goel, Yuzhong Huang, Ashok Veeraraghavan, Ankit Patel

    Abstract: Good temporal representations are crucial for video understanding, and the state-of-the-art video recognition framework is based on two-stream networks. In such a framework, besides the regular ConvNets responsible for RGB frame inputs, a second network is introduced to handle the temporal representation, usually the optical flow (OF). However, OF or other task-oriented flow is computationally costl…

    Submitted 19 May, 2018; v1 submitted 16 May, 2018; originally announced May 2018.

  48. arXiv:1803.03857  [pdf, other]

    cs.CV

    Learning from Noisy Web Data with Category-level Supervision

    Authors: Li Niu, Qingtao Tang, Ashok Veeraraghavan, Ashu Sabharwal

    Abstract: As tons of photos are being uploaded to public websites (e.g., Flickr, Bing, and Google) every day, learning from web data has become an increasingly popular research direction because of freely available web resources, which is also referred to as webly supervised learning. Nevertheless, the performance gap between webly supervised learning and traditional supervised learning is still very large,…

    Submitted 24 May, 2018; v1 submitted 10 March, 2018; originally announced March 2018.

  49. arXiv:1803.00212  [pdf, other]

    stat.ML cs.LG

    prDeep: Robust Phase Retrieval with a Flexible Deep Network

    Authors: Christopher A. Metzler, Philip Schniter, Ashok Veeraraghavan, Richard G. Baraniuk

    Abstract: Phase retrieval algorithms have become an important component in many modern computational imaging systems. For instance, in the context of ptychography and speckle correlation imaging, they enable imaging past the diffraction limit and through scattering media, respectively. Unfortunately, traditional phase retrieval algorithms struggle in the presence of noise. Progress has been made recently on…

    Submitted 29 June, 2018; v1 submitted 28 February, 2018; originally announced March 2018.

  50. arXiv:1801.05117  [pdf, other]

    cs.CV

    Reblur2Deblur: Deblurring Videos via Self-Supervised Learning

    Authors: Huaijin Chen, Jinwei Gu, Orazio Gallo, Ming-Yu Liu, Ashok Veeraraghavan, Jan Kautz

    Abstract: Motion blur is a fundamental problem in computer vision as it impacts image quality and hinders inference. Traditional deblurring algorithms leverage the physics of the image formation model and use hand-crafted priors: they usually produce results that better reflect the underlying scene, but present artifacts. Recent learning-based methods implicitly extract the distribution of natural images di…

    Submitted 16 January, 2018; originally announced January 2018.