
Showing 1–28 of 28 results for author: Talebi, H

Searching in archive cs.
  1. arXiv:2506.05599

    cs.CV

    UniRes: Universal Image Restoration for Complex Degradations

    Authors: Mo Zhou, Keren Ye, Mauricio Delbracio, Peyman Milanfar, Vishal M. Patel, Hossein Talebi

    Abstract: Real-world image restoration is hampered by diverse degradations stemming from varying capture conditions, capture devices and post-processing pipelines. Existing works make improvements by simulating those degradations and leveraging image generative priors; however, generalization to in-the-wild data remains an unresolved problem. In this paper, we focus on complex degradations, i.e., arbitr…

    Submitted 5 June, 2025; originally announced June 2025.

  2. arXiv:2505.23119

    cs.CV

    TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance

    Authors: Keren Ye, Ignacio Garcia Dorado, Michalis Raptis, Mauricio Delbracio, Irene Zhu, Peyman Milanfar, Hossein Talebi

    Abstract: While recent advancements in Image Super-Resolution (SR) using diffusion models have shown promise in improving overall image quality, their application to scene text images has revealed limitations. These models often struggle with accurate text region localization and fail to effectively model image and multilingual character-to-shape priors. This leads to inconsistencies, the generation of hall…

    Submitted 29 May, 2025; originally announced May 2025.

  3. arXiv:2505.21905

    cs.CV cs.MM

    Reference-Guided Identity Preserving Face Restoration

    Authors: Mo Zhou, Keren Ye, Viraj Shah, Kangfu Mei, Mauricio Delbracio, Peyman Milanfar, Vishal M. Patel, Hossein Talebi

    Abstract: Preserving face identity is a critical yet persistent challenge in diffusion-based image restoration. While reference faces offer a path forward, existing reference-based methods often fail to fully exploit their potential. This paper introduces a novel approach that maximizes reference face utility for improved face restoration and identity preservation. Our method makes three key contributions:…

    Submitted 27 May, 2025; originally announced May 2025.

  4. arXiv:2503.14503

    cs.CV cs.AI cs.LG

    The Power of Context: How Multimodality Improves Image Super-Resolution

    Authors: Kangfu Mei, Hossein Talebi, Mojtaba Ardakani, Vishal M. Patel, Peyman Milanfar, Mauricio Delbracio

    Abstract: Single-image super-resolution (SISR) remains challenging due to the inherent difficulty of recovering fine-grained details and preserving perceptual quality from low-resolution inputs. Existing methods often rely on limited image priors, leading to suboptimal results. We propose a novel approach that leverages the rich contextual information available in multiple modalities -- including depth, seg…

    Submitted 18 March, 2025; originally announced March 2025.

    Comments: accepted by CVPR2025

  5. arXiv:2404.01367

    cs.CV cs.LG

    Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

    Authors: Kangfu Mei, Zhengzhong Tu, Mauricio Delbracio, Hossein Talebi, Vishal M. Patel, Peyman Milanfar

    Abstract: We study the scaling properties of latent diffusion models (LDMs) with an emphasis on their sampling efficiency. While improved network architecture and inference algorithms have been shown to effectively boost sampling efficiency of diffusion models, the role of model size -- a critical determinant of sampling efficiency -- has not been thoroughly examined. Through empirical analysis of established te…

    Submitted 10 December, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted to TMLR. Camera-ready version

  6. arXiv:2312.11595

    cs.CV

    SPIRE: Semantic Prompt-Driven Image Restoration

    Authors: Chenyang Qi, Zhengzhong Tu, Keren Ye, Mauricio Delbracio, Peyman Milanfar, Qifeng Chen, Hossein Talebi

    Abstract: Text-driven diffusion models have become increasingly popular for various image editing tasks, including inpainting, stylization, and object replacement. However, it still remains an open research problem to adopt this language-vision paradigm for more fine-level image processing tasks, such as denoising, super-resolution, deblurring, and compression artifact removal. In this paper, we develop SPI…

    Submitted 16 July, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted by ECCV 2024; Webpage: https://chenyangqiqi.github.io/tip

  7. arXiv:2310.01407

    cs.CV cs.AI cs.LG

    CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation

    Authors: Kangfu Mei, Mauricio Delbracio, Hossein Talebi, Zhengzhong Tu, Vishal M. Patel, Peyman Milanfar

    Abstract: Large generative diffusion models have revolutionized text-to-image generation and offer immense potential for conditional generation tasks such as image enhancement, restoration, editing, and compositing. However, their widespread adoption is hindered by the high computational cost, which limits their real-time application. To address this challenge, we introduce a novel method dubbed CoDi, that…

    Submitted 17 February, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

  8. arXiv:2304.02859

    cs.CV

    MULLER: Multilayer Laplacian Resizer for Vision

    Authors: Zhengzhong Tu, Peyman Milanfar, Hossein Talebi

    Abstract: Image resizing operation is a fundamental preprocessing module in modern computer vision. Throughout the deep learning revolution, researchers have overlooked the potential of alternative resizing methods beyond the commonly used resizers that are readily available, such as nearest-neighbors, bilinear, and bicubic. The key question of our interest is whether the front-end resizer affects the perfo…

    Submitted 6 April, 2023; originally announced April 2023.

  9. arXiv:2302.14781

    cs.LG

    Time Series Anomaly Detection in Smart Homes: A Deep Learning Approach

    Authors: Somayeh Zamani, Hamed Talebi, Gunnar Stevens

    Abstract: Fixing energy leakage caused by different anomalies can result in significant energy savings and extended appliance life. Further, it assists grid operators in scheduling their resources to meet the actual needs of end users, while helping end users reduce their energy costs. In this paper, we analyze the patterns pertaining to the power consumption of dishwashers used in two houses of the REFIT d…

    Submitted 28 February, 2023; originally announced February 2023.

  10. arXiv:2212.01789

    cs.CV

    Multiscale Structure Guided Diffusion for Image Deblurring

    Authors: Mengwei Ren, Mauricio Delbracio, Hossein Talebi, Guido Gerig, Peyman Milanfar

    Abstract: Diffusion Probabilistic Models (DPMs) have recently been employed for image deblurring, formulated as an image-conditioned generation process that maps Gaussian noise to the high-quality image, conditioned on the blurry input. Image-conditioned DPMs (icDPMs) have shown more realistic results than regression-based methods when trained on pairwise in-domain data. However, their robustness in restori…

    Submitted 12 December, 2023; v1 submitted 4 December, 2022; originally announced December 2022.

    Comments: Camera ready for ICCV2023

  11. arXiv:2209.05442

    cs.CV cs.AI cs.LG

    Soft Diffusion: Score Matching for General Corruptions

    Authors: Giannis Daras, Mauricio Delbracio, Hossein Talebi, Alexandros G. Dimakis, Peyman Milanfar

    Abstract: We define a broader family of corruption processes that generalizes previously known diffusion models. To reverse these general diffusions, we propose a new objective called Soft Score Matching that provably learns the score function for any linear corruption process and yields state of the art results for CelebA. Soft Score Matching incorporates the degradation process in the network. Our new los…

    Submitted 4 October, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

    Comments: 21 pages, 12 figures, work in progress

  12. arXiv:2204.01697

    cs.CV cs.AI cs.LG

    MaxViT: Multi-Axis Vision Transformer

    Authors: Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

    Abstract: Transformers have recently gained significant attention in the computer vision community. However, the lack of scalability of self-attention mechanisms with respect to image size has limited their wide adoption in state-of-the-art vision backbones. In this paper we introduce an efficient and scalable attention model we call multi-axis attention, which consists of two aspects: blocked local and dil…

    Submitted 9 September, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

    Comments: ECCV 2022; code: https://github.com/google-research/maxvit; v1: initials; v2: added GAN visuals; v3: fixed ImageNet-1k acc typos for MaxViT @ 384

  13. arXiv:2201.02973

    eess.IV cs.CV

    MAXIM: Multi-Axis MLP for Image Processing

    Authors: Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

    Abstract: Recent progress on Transformers and multi-layer perceptron (MLP) models provides new network architectural designs for computer vision tasks. Although these models proved to be effective in many vision tasks such as image recognition, there remain challenges in adapting them for low-level vision. The inflexibility to support high-resolution images and limitations of local attention are perhaps the…

    Submitted 1 April, 2022; v1 submitted 9 January, 2022; originally announced January 2022.

    Comments: CVPR 2022 Oral; Code: https://github.com/google-research/maxim

  14. arXiv:2112.02475

    cs.CV eess.IV

    Deblurring via Stochastic Refinement

    Authors: Jay Whang, Mauricio Delbracio, Hossein Talebi, Chitwan Saharia, Alexandros G. Dimakis, Peyman Milanfar

    Abstract: Image deblurring is an ill-posed problem with multiple plausible solutions for a given input image. However, most existing methods produce a deterministic estimate of the clean image and are trained to minimize pixel-level distortion. These metrics are known to be poorly correlated with human perception, and often lead to unrealistic reconstructions. We present an alternative framework for blind d…

    Submitted 28 December, 2021; v1 submitted 4 December, 2021; originally announced December 2021.

  15. arXiv:2104.01493

    cs.LG stat.ML

    Exponentiated Gradient Reweighting for Robust Training Under Label Noise and Beyond

    Authors: Negin Majidi, Ehsan Amid, Hossein Talebi, Manfred K. Warmuth

    Abstract: Many learning tasks in machine learning can be viewed as taking a gradient step towards minimizing the average loss of a batch of examples in each training iteration. When noise is prevalent in the data, this uniform treatment of examples can lead to overfitting to noisy examples with larger loss values and result in poor generalization. Inspired by the expert setting in on-line learning, we prese…

    Submitted 3 April, 2021; originally announced April 2021.

  16. arXiv:2103.09950

    cs.CV cs.LG

    Learning to Resize Images for Computer Vision Tasks

    Authors: Hossein Talebi, Peyman Milanfar

    Abstract: For all the ways convolutional neural nets have revolutionized computer vision in recent years, one important aspect has received surprisingly little attention: the effect of image size on the accuracy of tasks being trained for. Typically, to be efficient, the input images are resized to a relatively small spatial resolution (e.g. 224x224), and both training and inference are carried out at this…

    Submitted 17 August, 2021; v1 submitted 17 March, 2021; originally announced March 2021.

    Comments: Accepted to ICCV 2021

  17. arXiv:2103.01114

    cs.CV eess.IV

    Deep Perceptual Image Quality Assessment for Compression

    Authors: Juan Carlos Mier, Eddie Huang, Hossein Talebi, Feng Yang, Peyman Milanfar

    Abstract: Lossy image compression is necessary for efficient storage and transfer of data. Typically, the trade-off between bit-rate and quality determines the optimal compression level. This makes the image quality metric an integral part of any imaging system. While the existing full-reference metrics such as PSNR and SSIM may be less sensitive to perceptual quality, the recently introduced learning method…

    Submitted 15 July, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

  18. arXiv:2012.09289

    cs.CV eess.IV

    Projected Distribution Loss for Image Enhancement

    Authors: Mauricio Delbracio, Hossein Talebi, Peyman Milanfar

    Abstract: Features obtained from object recognition CNNs have been widely used for measuring perceptual similarities between images. Such differentiable metrics can be used as perceptual learning losses to train image enhancement models. However, the choice of the distance function between input and target features may have a consequential impact on the performance of the trained model. While using the norm…

    Submitted 17 May, 2021; v1 submitted 16 December, 2020; originally announced December 2020.

  19. arXiv:2011.10893

    cs.CV cs.LG

    Rank-smoothed Pairwise Learning In Perceptual Quality Assessment

    Authors: Hossein Talebi, Ehsan Amid, Peyman Milanfar, Manfred K. Warmuth

    Abstract: Conducting pairwise comparisons is a widely used approach in curating human perceptual preference data. Typically raters are instructed to make their choices according to a specific set of rules that address certain dimensions of image quality and aesthetics. The outcome of this process is a dataset of sampled image pairs with their associated empirical preference probabilities. Training a model o…

    Submitted 21 November, 2020; originally announced November 2020.

    Journal ref: IEEE International Conference on Image Processing (ICIP) 2020

  20. arXiv:2011.10822

    cs.RO

    Control and implementation of fluid-driven soft gripper with dynamic uncertainty of object

    Authors: Amirhosein Alian, Mohammad Zareinejad, Heidar Ali Talebi

    Abstract: Soft grippers, which offer high compliance for stable grasping of objects, could be considered suitable candidates to replace conventional rigid grippers, and in recent years they have been emerging rapidly in industry. Not only are these highly adaptable grippers capable of static grasping of an object, but they can also be utilized for performing object manipulation tasks. Plenty of c…

    Submitted 21 November, 2020; originally announced November 2020.

    Comments: 52 pages

  21. arXiv:2008.00605

    eess.IV cs.CV

    The Rate-Distortion-Accuracy Tradeoff: JPEG Case Study

    Authors: Xiyang Luo, Hossein Talebi, Feng Yang, Michael Elad, Peyman Milanfar

    Abstract: Handling digital images is almost always accompanied by lossy compression in order to facilitate efficient transmission and storage. This introduces an unavoidable tension between the allocated bit-budget (rate) and the faithfulness of the resulting image to the original one (distortion). An additional complicating consideration is the effect of the compression on recognition performance by give…

    Submitted 2 August, 2020; originally announced August 2020.

    ACM Class: I.4.2; I.5.1

  22. arXiv:2002.11248

    cs.CV eess.IV

    Super-Resolving Commercial Satellite Imagery Using Realistic Training Data

    Authors: Xiang Zhu, Hossein Talebi, Xinwei Shi, Feng Yang, Peyman Milanfar

    Abstract: In machine learning based single image super-resolution, the degradation model is embedded in training data generation. However, most existing satellite image super-resolution methods use a simple down-sampling model with a fixed kernel to create training images. These methods work fine on synthetic data, but do not perform well on real satellite images. We propose a realistic training data genera…

    Submitted 25 February, 2020; originally announced February 2020.

  23. Better Compression with Deep Pre-Editing

    Authors: Hossein Talebi, Damien Kelly, Xiyang Luo, Ignacio Garcia Dorado, Feng Yang, Peyman Milanfar, Michael Elad

    Abstract: Could we compress images via standard codecs while avoiding visible artifacts? The answer is obvious -- this is doable as long as the bit budget is generous enough. What if the allocated bit-rate for compression is insufficient? Then unfortunately, artifacts are a fact of life. Many attempts were made over the years to fight this phenomenon, with various degrees of success. In this work we aim to…

    Submitted 23 July, 2021; v1 submitted 31 January, 2020; originally announced February 2020.

  24. arXiv:1906.05131

    cs.CV

    High Accuracy Classification of White Blood Cells using TSLDA Classifier and Covariance Features

    Authors: Hamed Talebi, Amin Ranjbar, Alireza Davoudi, Hamed Gholami, Mohammad Bagher Menhaj

    Abstract: Creating automated processes in different areas of medical science through the application of engineering tools has been a rapidly growing field over recent decades. In this context, many researchers in medical image processing and analysis use artificial intelligence methods, which can reduce the human effort required while increasing the accuracy of results. Among various medical images, blood microsco…

    Submitted 16 July, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

    Comments: 7 pages, 4 tables, 3 figures, journal

  25. arXiv:1712.02864

    cs.CV

    Learned Perceptual Image Enhancement

    Authors: Hossein Talebi, Peyman Milanfar

    Abstract: Learning a typical image enhancement pipeline involves minimization of a loss function between enhanced and reference images. While L1 and L2 losses are perhaps the most widely used functions for this purpose, they do not necessarily lead to perceptually compelling results. In this paper, we show that adding a learned no-reference image quality metric to the loss can significantly improve enhancem…

    Submitted 7 December, 2017; originally announced December 2017.

  26. arXiv:1711.03605

    eess.SY cs.RO

    Stability and Transparency Analysis of a Bilateral Teleoperation in Presence of Data Loss

    Authors: A. Bakhshi, H. A. Talebi, A. A. Suratgar, M. Abdeetedal

    Abstract: This paper presents a novel approach for stability and transparency analysis for bilateral teleoperation in the presence of data loss in communication media. A new model for data loss is proposed based on a set of periodic continuous pulses and its finite series representation. The passivity of the overall system is shown using the wave variable approach, including the newly defined model for data loss…

    Submitted 9 November, 2017; originally announced November 2017.

  27. NIMA: Neural Image Assessment

    Authors: Hossein Talebi, Peyman Milanfar

    Abstract: Automatically learned quality assessment for images has recently become a hot topic due to its usefulness in a wide variety of applications such as evaluating image capture pipelines, storage techniques and sharing media. Despite the subjective nature of this problem, most existing methods only predict the mean opinion score provided by datasets such as AVA [1] and TID2013 [2]. Our approach differ…

    Submitted 26 April, 2018; v1 submitted 15 September, 2017; originally announced September 2017.

    Comments: IEEE Transactions on Image Processing 2018

  28. arXiv:1606.07396

    cs.CV

    Fast Multi-Layer Laplacian Enhancement

    Authors: Hossein Talebi, Peyman Milanfar

    Abstract: A novel, fast and practical way of enhancing images is introduced in this paper. Our approach builds on Laplacian operators of well-known edge-aware kernels, such as bilateral and nonlocal means, and extends these filters' capabilities to perform more effective and fast image smoothing, sharpening and tone manipulation. We propose an approximation of the Laplacian, which does not require normaliza…

    Submitted 23 June, 2016; originally announced June 2016.