Showing 1–14 of 14 results for author: Ghanem, B

Searching in archive eess.
  1. arXiv:2505.15822  [pdf, other]

    eess.IV cs.CV cs.LG

    MambaStyle: Efficient StyleGAN Inversion for Real Image Editing with State-Space Models

    Authors: Jhon Lopez, Carlos Hinojosa, Henry Arguello, Bernard Ghanem

    Abstract: The task of inverting real images into StyleGAN's latent space to manipulate their attributes has been extensively studied. However, existing GAN inversion methods struggle to balance high reconstruction quality, effective editability, and computational efficiency. In this paper, we introduce MambaStyle, an efficient single-stage encoder-based approach for GAN inversion and editing that leverages…

    Submitted 6 May, 2025; originally announced May 2025.

  2. arXiv:2408.10827  [pdf, other]

    eess.IV cs.CV

    CO2Wounds-V2: Extended Chronic Wounds Dataset From Leprosy Patients

    Authors: Karen Sanchez, Carlos Hinojosa, Olinto Mieles, Chen Zhao, Bernard Ghanem, Henry Arguello

    Abstract: Chronic wounds pose an ongoing health concern globally, largely due to the prevalence of conditions such as diabetes and leprosy. The standard method of monitoring these wounds involves visual inspection by healthcare professionals, a practice that could present challenges for patients in remote areas with inadequate transportation and healthcare infrastructure. This has led to the devel…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 2024 IEEE International Conference on Image Processing (ICIP 2024)

  3. arXiv:2407.13036  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders

    Authors: Carlos Hinojosa, Shuming Liu, Bernard Ghanem

    Abstract: Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework, offering remarkable performance across a wide range of downstream tasks. To increase the difficulty of the pretext task and learn richer visual representations, existing works have focused on replacing standard random masking with more sophisticated strategies, such as adversarial-guided and teacher-guided masking. Howev…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Work Accepted for Publication at ECCV 2024

  4. arXiv:2407.08822  [pdf, other]

    eess.IV cs.AI cs.CV

    FedMedICL: Towards Holistic Evaluation of Distribution Shifts in Federated Medical Imaging

    Authors: Kumail Alhamoud, Yasir Ghunaim, Motasem Alfarra, Thomas Hartvigsen, Philip Torr, Bernard Ghanem, Adel Bibi, Marzyeh Ghassemi

    Abstract: For medical imaging AI models to be clinically impactful, they must generalize. However, this goal is hindered by (i) diverse types of distribution shifts, such as temporal, demographic, and label shifts, and (ii) limited diversity in datasets that are siloed within single medical institutions. While these limitations have spurred interest in federated learning, current evaluation benchmarks fail…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted at MICCAI 2024. Code is available at: https://github.com/m1k2zoo/FedMedICL

  5. Multi-Stream Cellular Test-Time Adaptation of Real-Time Models Evolving in Dynamic Environments

    Authors: Benoît Gérin, Anaïs Halin, Anthony Cioppa, Maxim Henry, Bernard Ghanem, Benoît Macq, Christophe De Vleeschouwer, Marc Van Droogenbroeck

    Abstract: In the era of the Internet of Things (IoT), objects connect through a dynamic network, empowered by technologies like 5G, enabling real-time data sharing. However, smart objects, notably autonomous vehicles, face challenges in critical local computations due to limited resources. Lightweight AI models offer a solution but struggle with diverse data distributions. To address this limitation, we pro…

    Submitted 27 April, 2024; originally announced April 2024.

  6. arXiv:2404.00777  [pdf, other]

    cs.CV cs.AI cs.CR cs.LG eess.IV

    Privacy-preserving Optics for Enhancing Protection in Face De-identification

    Authors: Jhon Lopez, Carlos Hinojosa, Henry Arguello, Bernard Ghanem

    Abstract: The modern surge in camera usage alongside widespread computer vision technology applications poses significant privacy and security concerns. Current artificial intelligence (AI) technologies aid in recognizing relevant events and assisting in daily tasks in homes, offices, hospitals, etc. The need to access or process personal information for these purposes raises privacy concerns. While softwar…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024. Project Website and Code coming soon

  7. arXiv:2204.02084  [pdf, other]

    cs.CV eess.IV

    Real-time Hyperspectral Imaging in Hardware via Trained Metasurface Encoders

    Authors: Maksim Makarenko, Arturo Burguete-Lopez, Qizhou Wang, Fedor Getman, Silvio Giancola, Bernard Ghanem, Andrea Fratalocchi

    Abstract: Hyperspectral imaging has attracted significant attention to identify spectral signatures for image classification and automated pattern recognition in computer vision. State-of-the-art implementations of snapshot hyperspectral imaging rely on bulky, non-integrated, and expensive optical elements, including lenses, spectrometers, and filters. These macroscopic components do not allow fast data pro…

    Submitted 5 April, 2022; originally announced April 2022.

  8. arXiv:2203.14250  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    End-to-End Active Speaker Detection

    Authors: Juan Leon Alcazar, Moritz Cordes, Chen Zhao, Bernard Ghanem

    Abstract: Recent advances in the Active Speaker Detection (ASD) problem build upon a two-stage process: feature extraction and spatio-temporal context aggregation. In this paper, we propose an end-to-end ASD workflow where feature learning and contextual predictions are jointly learned. Our end-to-end trainable network simultaneously learns multi-modal embeddings and aggregates spatio-temporal context. This…

    Submitted 25 July, 2022; v1 submitted 27 March, 2022; originally announced March 2022.

  9. arXiv:2202.04947  [pdf, other]

    cs.CV cs.SD eess.AS

    OWL (Observe, Watch, Listen): Audiovisual Temporal Context for Localizing Actions in Egocentric Videos

    Authors: Merey Ramazanova, Victor Escorcia, Fabian Caba Heilbron, Chen Zhao, Bernard Ghanem

    Abstract: Egocentric videos capture sequences of human activities from a first-person perspective and can provide rich multimodal signals. However, most current localization methods use third-person videos and only incorporate visual information. In this work, we take a deep look into the effectiveness of audiovisual context in detecting actions in egocentric videos and introduce a simple-yet-effective appr…

    Submitted 26 October, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

  10. arXiv:2005.09812  [pdf, other]

    cs.CV cs.SD eess.AS

    Active Speakers in Context

    Authors: Juan Leon Alcazar, Fabian Caba Heilbron, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem

    Abstract: Current methods for active speaker detection focus on modeling short-term audiovisual information from a single speaker. Although this strategy can be enough for addressing single-speaker scenarios, it prevents accurate detection when the task is to identify who of many candidate speakers are talking. This paper introduces the Active Speaker Context, a novel representation that models relationshi…

    Submitted 19 May, 2020; originally announced May 2020.

  11. arXiv:1912.01326  [pdf, other]

    cs.CV cs.LG eess.IV

    A Context-Aware Loss Function for Action Spotting in Soccer Videos

    Authors: Anthony Cioppa, Adrien Deliège, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck, Rikke Gade, Thomas B. Moeslund

    Abstract: In video understanding, action spotting consists in temporally localizing human-induced events annotated with single timestamps. In this paper, we propose a novel loss function that specifically considers the temporal context naturally present around each action, rather than focusing on the single annotated frame to spot. We benchmark our loss on a large dataset of soccer videos, SoccerNet, and ac…

    Submitted 30 March, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: Accepted for CVPR2020 main conference. This document contains 8 pages + references + supplementary material

  12. arXiv:1910.06849  [pdf, other]

    cs.CV cs.LG eess.IV

    DeepGCNs: Making GCNs Go as Deep as CNNs

    Authors: Guohao Li, Matthias Müller, Guocheng Qian, Itzel C. Delgadillo, Abdulellah Abualshour, Ali Thabet, Bernard Ghanem

    Abstract: Convolutional Neural Networks (CNNs) have been very successful at solving a variety of computer vision tasks such as object classification and detection, semantic segmentation, activity understanding, to name just a few. One key enabling factor for their great performance has been the ability to train very deep networks. Despite their huge success in many tasks, CNNs do not work well with non-Eucl…

    Submitted 14 May, 2021; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: Accepted at TPAMI. This work is a journal extension of our ICCV'19 paper arXiv:1904.03751. The first three authors contributed equally

  13. arXiv:1905.02538  [pdf, other]

    eess.IV cs.CV

    Rethinking Learning-based Demosaicing, Denoising, and Super-Resolution Pipeline

    Authors: Guocheng Qian, Yuanhao Wang, Jinjin Gu, Chao Dong, Wolfgang Heidrich, Bernard Ghanem, Jimmy S. Ren

    Abstract: Imaging is usually a mixture problem of incomplete color sampling, noise degradation, and limited resolution. This mixture problem is typically solved by a sequential solution that applies demosaicing (DM), denoising (DN), and super-resolution (SR) sequentially in a fixed and predefined pipeline (execution order of tasks), DM$\to$DN$\to$SR. The most recent work on image processing focuses on devel…

    Submitted 24 March, 2023; v1 submitted 7 May, 2019; originally announced May 2019.

    Comments: Accepted at ICCP'22. Code is available at: https://github.com/guochengqian/TENet

  14. arXiv:1802.09879  [pdf, other]

    math.NA eess.IV math.OC

    L0TV: A Sparse Optimization Method for Impulse Noise Image Restoration

    Authors: Ganzhao Yuan, Bernard Ghanem

    Abstract: Total Variation (TV) is an effective and popular prior model in the field of regularization-based image processing. This paper focuses on total variation for removing impulse noise in image restoration. This type of noise frequently arises in data acquisition and transmission due to many reasons, e.g. a faulty sensor or analog-to-digital converter errors. Removing this noise is an important task i…

    Submitted 28 December, 2018; v1 submitted 27 February, 2018; originally announced February 2018.

    Comments: to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)