Showing 1–18 of 18 results for author: Escudero-Viñolo, M

  1. arXiv:2503.04325  [pdf, other]

    eess.IV cs.CV

    GBT-SAM: Adapting a Foundational Deep Learning Model for Generalizable Brain Tumor Segmentation via Efficient Integration of Multi-Parametric MRI Data

    Authors: Cecilia Diana-Albelda, Roberto Alcover-Couso, Álvaro García-Martín, Jesus Bescos, Marcos Escudero-Viñolo

    Abstract: Gliomas are aggressive brain tumors that require accurate imaging-based diagnosis, with segmentation playing a critical role in evaluating morphology and treatment decisions. Manual delineation of gliomas is time-consuming and prone to variability, motivating the use of deep learning to improve consistency and alleviate clinical workload. However, existing methods often fail to fully exploit the i…

    Submitted 13 May, 2025; v1 submitted 6 March, 2025; originally announced March 2025.

  2. arXiv:2412.16592  [pdf, other]

    cs.CV

    Leveraging Contrastive Learning for Semantic Segmentation with Consistent Labels Across Varying Appearances

    Authors: Javier Montalvo, Roberto Alcover-Couso, Pablo Carballeira, Álvaro García-Martín, Juan C. SanMiguel, Marcos Escudero-Viñolo

    Abstract: This paper introduces a novel synthetic dataset that captures urban scenes under a variety of weather conditions, providing pixel-perfect, ground-truth-aligned images to facilitate effective feature alignment across domains. Additionally, we propose a method for domain adaptation and generalization that takes advantage of the multiple versions of each scene, enforcing feature consistency across di…

    Submitted 21 December, 2024; originally announced December 2024.

  3. arXiv:2412.09240  [pdf, other]

    cs.CV cs.AI

    VLMs meet UDA: Boosting Transferability of Open Vocabulary Segmentation with Unsupervised Domain Adaptation

    Authors: Roberto Alcover-Couso, Marcos Escudero-Viñolo, Juan C. SanMiguel, Jesus Bescos

    Abstract: Segmentation models are typically constrained by the categories defined during training. To address this, researchers have explored two independent approaches: adapting Vision-Language Models (VLMs) and leveraging synthetic data. However, VLMs often struggle with granularity, failing to disentangle fine-grained concepts, while synthetic data-based methods remain limited by the scope of available d…

    Submitted 12 December, 2024; originally announced December 2024.

  4. arXiv:2412.09160  [pdf, other]

    cs.CV

    Pinpoint Counterfactuals: Reducing social bias in foundation models via localized counterfactual generation

    Authors: Kirill Sirotkin, Marcos Escudero-Viñolo, Pablo Carballeira, Mayug Maniparambil, Catarina Barata, Noel E. O'Connor

    Abstract: Foundation models trained on web-scraped datasets propagate societal biases to downstream tasks. While counterfactual generation enables bias analysis, existing methods introduce artifacts by modifying contextual elements like clothing and background. We present a localized counterfactual generation method that preserves image context by constraining counterfactual modifications to specific attrib…

    Submitted 12 December, 2024; originally announced December 2024.

  5. arXiv:2409.15813  [pdf, other]

    cs.CV cs.AI cs.MM

    Layer-wise Model Merging for Unsupervised Domain Adaptation in Segmentation Tasks

    Authors: Roberto Alcover-Couso, Juan C. SanMiguel, Marcos Escudero-Viñolo, Jose M Martínez

    Abstract: Merging parameters of multiple models has resurfaced as an effective strategy to enhance task performance and robustness, but prior work is limited by the high costs of ensemble creation and inference. In this paper, we leverage the abundance of freely accessible trained models to introduce a cost-free approach to model merging. It focuses on a layer-wise integration of merged models, aiming to ma…

    Submitted 24 September, 2024; originally announced September 2024.

  6. arXiv:2407.01327  [pdf, other]

    cs.CV cs.LG

    Gradient-based Class Weighting for Unsupervised Domain Adaptation in Dense Prediction Visual Tasks

    Authors: Roberto Alcover-Couso, Marcos Escudero-Viñolo, Juan C. SanMiguel, Jesus Bescós

    Abstract: In unsupervised domain adaptation (UDA), where models are trained on source data (e.g., synthetic) and adapted to target data (e.g., real-world) without target annotations, addressing the challenge of significant class imbalance remains an open issue. Despite considerable progress in bridging the domain gap, existing methods often experience performance degradation when confronted with highly imba…

    Submitted 1 July, 2024; originally announced July 2024.

  7. arXiv:2309.15478  [pdf, other]

    cs.CV cs.LG

    The Robust Semantic Segmentation UNCV2023 Challenge Results

    Authors: Xuanlong Yu, Yi Zuo, Zitao Wang, Xiaowen Zhang, Jiaxuan Zhao, Yuting Yang, Licheng Jiao, Rui Peng, Xinyi Wang, Junpei Zhang, Kexin Zhang, Fang Liu, Roberto Alcover-Couso, Juan C. SanMiguel, Marcos Escudero-Viñolo, Hanlin Tian, Kenta Matsui, Tianhao Wang, Fahmy Adan, Zhitong Gao, Xuming He, Quentin Bouniot, Hossein Moghaddam, Shyam Nandan Rai, Fabio Cermelli, et al. (12 additional authors not shown)

    Abstract: This paper outlines the winning solutions employed in addressing the MUAD uncertainty quantification challenge held at ICCV 2023. The challenge was centered around semantic segmentation in urban environments, with a particular focus on natural adversarial scenarios. The report presents the results of 19 submitted entries, with numerous techniques drawing inspiration from cutting-edge uncertainty q…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: 11 pages, 4 figures, accepted at ICCV 2023 UNCV workshop

  8. arXiv:2302.13961  [pdf, other]

    cs.CV

    Soft labelling for semantic segmentation: Bringing coherence to label down-sampling

    Authors: Roberto Alcover-Couso, Marcos Escudero-Vinolo, Juan C. SanMiguel, Jose M. Martinez

    Abstract: In semantic segmentation, training data down-sampling is commonly performed due to limited resources, the need to adapt image size to the model input, or improve data augmentation. This down-sampling typically employs different strategies for the image data and the annotated labels. Such discrepancy leads to mismatches between the down-sampled color and label images. Hence, the training performanc…

    Submitted 19 February, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

  9. arXiv:2301.10687  [pdf, other]

    eess.IV cs.CV

    Self-Supervised Curricular Deep Learning for Chest X-Ray Image Classification

    Authors: Iván de Andrés Tamé, Kirill Sirotkin, Pablo Carballeira, Marcos Escudero-Viñolo

    Abstract: Deep learning technologies have already demonstrated a high potential to build diagnosis support systems from medical imaging data, such as Chest X-Ray images. However, the shortage of labeled data in the medical field represents one key obstacle to narrow down the performance gap with respect to applications in other image domains. In this work, we investigate the benefits of a curricular Self-Su…

    Submitted 25 January, 2023; originally announced January 2023.

  10. arXiv:2211.15538  [pdf, other]

    cs.CV

    Graph Convolutional Network for Multi-Target Multi-Camera Vehicle Tracking

    Authors: Elena Luna, Juan Carlos San Miguel, José María Martínez, Marcos Escudero-Viñolo

    Abstract: This letter focuses on the task of Multi-Target Multi-Camera vehicle tracking. We propose to associate single-camera trajectories into multi-camera global trajectories by training a Graph Convolutional Network. Our approach simultaneously processes all cameras providing a global solution, and it is also robust to large cameras unsynchronizations. Furthermore, we design a new loss function to deal…

    Submitted 28 November, 2022; originally announced November 2022.

  11. arXiv:2205.01997  [pdf, other]

    cs.CV

    Attention-based Knowledge Distillation in Multi-attention Tasks: The Impact of a DCT-driven Loss

    Authors: Alejandro López-Cifuentes, Marcos Escudero-Viñolo, Jesús Bescós, Juan C. SanMiguel

    Abstract: Knowledge Distillation (KD) is a strategy for the definition of a set of transferability gangways to improve the efficiency of Convolutional Neural Networks. Feature-based Knowledge Distillation is a subfield of KD that relies on intermediate network representations, either unaltered or depth-reduced via maximum activation maps, as the source knowledge. In this paper, we propose and analyse the us…

    Submitted 6 June, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: Preprint under review in TCSVT Journal

  12. arXiv:2203.01854  [pdf, other]

    cs.CV

    A study on the distribution of social biases in self-supervised learning visual models

    Authors: Kirill Sirotkin, Pablo Carballeira, Marcos Escudero-Viñolo

    Abstract: Deep neural networks are efficient at learning the data distribution if it is sufficiently sampled. However, they can be strongly biased by non-relevant factors implicitly incorporated in the training data. These include operational biases, such as ineffective or uneven data sampling, but also ethical concerns, as the social biases are implicitly present, even inadvertently, in the train…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: 10 pages, 6 figures, accepted in CVPR2022

  13. arXiv:2112.12086  [pdf, other]

    cs.CV

    Improved skin lesion recognition by a Self-Supervised Curricular Deep Learning approach

    Authors: Kirill Sirotkin, Marcos Escudero-Viñolo, Pablo Carballeira, Juan Carlos SanMiguel

    Abstract: State-of-the-art deep learning approaches for skin lesion recognition often require pretraining on larger and more varied datasets, to overcome the generalization limitations derived from the reduced size of the skin lesion imaging datasets. ImageNet is often used as the pretraining dataset, but its transferring potential is hindered by the domain gap between the source dataset and the target derm…

    Submitted 22 December, 2021; originally announced December 2021.

    Comments: 11 pages, 8 figures, submitted to the Journal of Biomedical and Health Informatics (Special Issue on Skin Image Analysis in the Age of Deep Learning)

  14. arXiv:2102.04091  [pdf, other]

    cs.CV

    Online Clustering-based Multi-Camera Vehicle Tracking in Scenarios with overlapping FOVs

    Authors: Elena Luna, Juan C. SanMiguel, Jose M. Martínez, Marcos Escudero-Viñolo

    Abstract: Multi-Target Multi-Camera (MTMC) vehicle tracking is an essential task of visual traffic monitoring, one of the main research fields of Intelligent Transportation Systems. Several offline approaches have been proposed to address this task; however, they are not compatible with real-world applications due to their high latency and post-processing requirements. In this paper, we present a new low-la…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: 10 pages

  15. arXiv:2008.11588  [pdf, other]

    cs.CV

    A Prospective Study on Sequence-Driven Temporal Sampling and Ego-Motion Compensation for Action Recognition in the EPIC-Kitchens Dataset

    Authors: Alejandro López-Cifuentes, Marcos Escudero-Viñolo, Jesús Bescós

    Abstract: Action recognition is currently one of the top-challenging research fields in computer vision. Convolutional Neural Networks (CNNs) have significantly boosted its performance but rely on fixed-size spatio-temporal windows of analysis, reducing CNNs temporal receptive fields. Among action recognition datasets, egocentric recorded sequences have become of important relevance while entailing an addit…

    Submitted 26 August, 2020; originally announced August 2020.

    Comments: Paper accepted at CVPR 2020 EPIC Kitchens Workshop

  16. arXiv:2005.12074  [pdf, other]

    cs.CV

    Egocentric Human Segmentation for Mixed Reality

    Authors: Andrija Gajic, Ester Gonzalez-Sosa, Diego Gonzalez-Morin, Marcos Escudero-Viñolo, Alvaro Villegas

    Abstract: The objective of this work is to segment human body parts from egocentric video using semantic segmentation networks. Our contribution is two-fold: i) we create a semi-synthetic dataset composed of more than 15,000 realistic images and associated pixel-wise labels of egocentric human body parts, such as arms or legs including different demographic factors; ii) building upon the ThunderNet archite…

    Submitted 8 June, 2020; v1 submitted 25 May, 2020; originally announced May 2020.

    Comments: Accepted for presentation at EPIC@CVPR2020 workshop

  17. Semantic-Aware Scene Recognition

    Authors: Alejandro López-Cifuentes, Marcos Escudero-Viñolo, Jesús Bescós, Álvaro García-Martín

    Abstract: Scene recognition is currently one of the top-challenging research fields in computer vision. This may be due to the ambiguity between classes: images of several scene classes may share similar objects, which causes confusion among them. The problem is aggravated when images of a particular scene class are notably different. Convolutional Neural Networks (CNNs) have significantly boosted performan…

    Submitted 22 January, 2020; v1 submitted 5 September, 2019; originally announced September 2019.

    Comments: Paper submitted for publication to Elsevier Pattern Recognition journal

    Journal ref: Pattern Recognition Volume 102, June 2020, 107256

  18. arXiv:1812.10779  [pdf, other]

    cs.CV

    Semantic Driven Multi-Camera Pedestrian Detection

    Authors: Alejandro López-Cifuentes, Marcos Escudero-Viñolo, Jesús Bescós, Pablo Carballeira

    Abstract: In the current worldwide situation, pedestrian detection has reemerged as a pivotal tool for intelligent video-based systems aiming to solve tasks such as pedestrian tracking, social distancing monitoring or pedestrian mass counting. Pedestrian detection methods, even the top performing ones, are highly sensitive to occlusions among pedestrians, which dramatically degrades their performance in cro…

    Submitted 7 April, 2022; v1 submitted 27 December, 2018; originally announced December 2018.

    Comments: Preprint accepted in Springer Knowledge and Information Systems (KAIS)