Showing 1–50 of 94 results for author: García, N

  1. Advances on Affordable Hardware Platforms for Human Demonstration Acquisition in Agricultural Applications

    Authors: Alberto San-Miguel-Tello, Gennaro Scarati, Alejandro Hernández, Mario Cavero-Vidal, Aakash Maroti, Néstor García

    Abstract: This paper presents advances on the Universal Manipulation Interface (UMI), a low-cost hand-held gripper for robot Learning from Demonstration (LfD), for complex in-the-wild scenarios found in agricultural settings. The focus is on improving the acquisition of suitable samples with minimal additional setup. Firstly, idle times and user's cognitive load are reduced through the extraction of individ…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 7 pages, 2 figures

    Journal ref: European Robotics Forum 2025. ERF 2025. Springer Proceedings in Advanced Robotics, vol 36. Springer

  2. arXiv:2506.04693  [pdf, ps, other]

    cs.CL

    Cracking the Code: Enhancing Implicit Hate Speech Detection through Coding Classification

    Authors: Lu Wei, Liangzhi Li, Tong Xiang, Xiao Liu, Noa Garcia

    Abstract: The internet has become a hotspot for hate speech (HS), threatening societal harmony and individual well-being. While automatic detection methods perform well in identifying explicit hate speech (ex-HS), they struggle with more subtle forms, such as implicit hate speech (im-HS). We tackle this problem by introducing a new taxonomy for im-HS detection, defining six encoding strategies named codetyp…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025), 112-126

  3. arXiv:2505.00569  [pdf, other]

    cs.CV

    AnimalMotionCLIP: Embedding motion in CLIP for Animal Behavior Analysis

    Authors: Enmin Zhong, Carlos R. del-Blanco, Daniel Berjón, Fernando Jaureguizar, Narciso García

    Abstract: Recently, there has been a surge of interest in applying deep learning techniques to animal behavior recognition, particularly leveraging pre-trained visual language models, such as CLIP, due to their remarkable generalization capacity across various downstream tasks. However, adapting these models to the specific domain of animal behavior recognition presents two significant challenges: integrati…

    Submitted 30 April, 2025; originally announced May 2025.

    Comments: 6 pages, 3 figures. Accepted for the poster session at the CV4Animals workshop (Computer Vision for Animal Behavior Tracking and Modeling), held in conjunction with Computer Vision and Pattern Recognition 2024

  4. arXiv:2503.19361  [pdf, other]

    cs.CV

    ImageSet2Text: Describing Sets of Images through Text

    Authors: Piera Riccio, Francesco Galati, Kajetan Schweighofer, Noa Garcia, Nuria Oliver

    Abstract: We introduce ImageSet2Text, a novel approach that leverages vision-language foundation models to automatically create natural language descriptions of image sets. Inspired by concept bottleneck models (CBMs) and based on visual-question answering (VQA) chains, ImageSet2Text iteratively extracts key concepts from image subsets, encodes them into a structured graph, and refines insights using an ext…

    Submitted 25 March, 2025; originally announced March 2025.

  5. arXiv:2503.08188  [pdf, other]

    cs.CL cs.AI

    RigoChat 2: an adapted language model to Spanish using a bounded dataset and reduced hardware

    Authors: Gonzalo Santamaría Gómez, Guillem García Subies, Pablo Gutiérrez Ruiz, Mario González Valero, Natàlia Fuertes, Helena Montoro Zamorano, Carmen Muñoz Sanz, Leire Rosado Plaza, Nuria Aldama García, David Betancur Sánchez, Kateryna Sushkova, Marta Guerrero Nieto, Álvaro Barbero Jiménez

    Abstract: Large Language Models (LLMs) have become a key element of modern artificial intelligence, demonstrating the ability to address a wide range of language processing tasks at unprecedented levels of accuracy without the need of collecting problem-specific data. However, these versatile models face a significant challenge: both their training and inference processes require substantial computational r…

    Submitted 11 March, 2025; originally announced March 2025.

  6. arXiv:2502.09135  [pdf, other]

    cs.LG q-bio.BM

    Interpreting and Steering Protein Language Models through Sparse Autoencoders

    Authors: Edith Natalia Villegas Garcia, Alessio Ansuini

    Abstract: The rapid advancements in transformer-based language models have revolutionized natural language processing, yet understanding the internal mechanisms of these models remains a significant challenge. This paper explores the application of sparse autoencoders (SAE) to interpret the internal representations of protein language models, specifically focusing on the ESM-2 8M parameter model. By perform…

    Submitted 13 February, 2025; originally announced February 2025.

    Comments: 11 pages, 6 figures

  7. arXiv:2501.17198  [pdf, other]

    cs.SD eess.AS physics.data-an

    6KSFx Synth Dataset

    Authors: Nelly Garcia, Joshua Reiss

    Abstract: Procedural audio, often referred to as "digital Foley", generates sound from scratch using computational processes. It represents an innovative approach to sound-effects creation. However, the development and adoption of procedural audio has been constrained by a lack of publicly available datasets and models, which hinders evaluation and optimization. To address this important gap, this paper pre…

    Submitted 27 January, 2025; originally announced January 2025.

    Comments: 7 pages, 2 tables and 1 figure

    Journal ref: CHIME poster 2024

  8. arXiv:2501.16011  [pdf, other]

    cs.CL

    MEL: Legal Spanish Language Model

    Authors: David Betancur Sánchez, Nuria Aldama García, Álvaro Barbero Jiménez, Marta Guerrero Nieto, Patricia Marsà Morales, Nicolás Serrano Salas, Carlos García Hernán, Pablo Haya Coll, Elena Montiel Ponsoda, Pablo Calleja Ibáñez

    Abstract: Legal texts, characterized by complex and specialized terminology, present a significant challenge for Language Models. Adding an underrepresented language, such as Spanish, to the mix makes it even more challenging. While pre-trained models like XLM-RoBERTa have shown capabilities in handling multilingual corpora, their performance on domain-specific documents remains underexplored. This paper pr…

    Submitted 27 January, 2025; originally announced January 2025.

    Comments: 8 pages, 6 figures, 3 tables

  9. arXiv:2501.15990  [pdf, other]

    cs.CL

    3CEL: A corpus of legal Spanish contract clauses

    Authors: Nuria Aldama García, Patricia Marsà Morales, David Betancur Sánchez, Álvaro Barbero Jiménez, Marta Guerrero Nieto, Pablo Haya Coll, Patricia Martín Chozas, Elena Montiel Ponsoda

    Abstract: Legal corpora for Natural Language Processing (NLP) are valuable and scarce resources in languages like Spanish due to two main reasons: data accessibility and legal expert knowledge availability. INESData 2024 is a European Union funded project led by the Universidad Politécnica de Madrid (UPM) and developed by Instituto de Ingeniería del Conocimiento (IIC) to create a series of state-of-the-art…

    Submitted 27 January, 2025; originally announced January 2025.

    Comments: 12 pages, 13 figures, 6 tables

  10. arXiv:2412.06286  [pdf, other]

    cs.CV

    No Annotations for Object Detection in Art through Stable Diffusion

    Authors: Patrick Ramos, Nicolas Gonthier, Selina Khan, Yuta Nakashima, Noa Garcia

    Abstract: Object detection in art is a valuable tool for the digital humanities, as it allows for faster identification of objects in artistic and historical images compared to humans. However, annotating such images poses significant challenges due to the need for specialized domain expertise. We present NADA (no annotations for detection in art), a pipeline that leverages diffusion models' art-related kno…

    Submitted 17 December, 2024; v1 submitted 9 December, 2024; originally announced December 2024.

    Comments: 8 pages, 6 figures, to be published in WACV 2025

  11. arXiv:2409.20536  [pdf, other]

    cs.LG cs.CY

    Best Practices for Responsible Machine Learning in Credit Scoring

    Authors: Giovani Valdrighi, Athyrson M. Ribeiro, Jansen S. B. Pereira, Vitoria Guardieiro, Arthur Hendricks, Décio Miranda Filho, Juan David Nieto Garcia, Felipe F. Bocca, Thalita B. Veronese, Lucas Wanner, Marcos Medeiros Raimundo

    Abstract: The widespread use of machine learning in credit scoring has brought significant advancements in risk assessment and decision-making. However, it has also raised concerns about potential biases, discrimination, and lack of transparency in these automated systems. This tutorial paper performed a non-systematic literature review to guide best practices for developing responsible machine learning mod…

    Submitted 30 September, 2024; originally announced September 2024.

  12. arXiv:2408.11358  [pdf, ps, other]

    cs.CY

    Gender Bias Evaluation in Text-to-image Generation: A Survey

    Authors: Yankun Wu, Yuta Nakashima, Noa Garcia

    Abstract: The rapid development of text-to-image generation has brought rising ethical considerations, especially regarding gender bias. Given a text prompt as input, text-to-image models generate images according to the prompt. Pioneering models such as Stable Diffusion and DALL-E 2 have demonstrated remarkable capabilities in producing high-fidelity images from natural language prompts. However, these mod…

    Submitted 21 August, 2024; originally announced August 2024.

  13. arXiv:2407.15399  [pdf, other]

    cs.CL cs.AI cs.CR

    Imposter.AI: Adversarial Attacks with Hidden Intentions towards Aligned Large Language Models

    Authors: Xiao Liu, Liangzhi Li, Tong Xiang, Fuying Ye, Lu Wei, Wangyue Li, Noa Garcia

    Abstract: With the development of large language models (LLMs) like ChatGPT, both their vast applications and potential vulnerabilities have come to the forefront. While developers have integrated multiple safety mechanisms to mitigate their misuse, a risk remains, particularly when models encounter adversarial inputs. This study unveils an attack mechanism that capitalizes on human conversation strategies…

    Submitted 22 July, 2024; originally announced July 2024.

  14. arXiv:2404.03242  [pdf, other]

    cs.CV

    Would Deep Generative Models Amplify Bias in Future Models?

    Authors: Tianwei Chen, Yusuke Hirota, Mayu Otani, Noa Garcia, Yuta Nakashima

    Abstract: We investigate the impact of deep generative models on potential social biases in upcoming computer vision models. As the internet witnesses an increasing influx of AI-generated images, concerns arise regarding inherent biases that may accompany them, potentially leading to the dissemination of harmful content. This paper explores whether a detrimental feedback loop, resulting in bias amplificatio…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted to CVPR 2024

  15. arXiv:2403.17752  [pdf, other]

    cs.CL

    Can multiple-choice questions really be useful in detecting the abilities of LLMs?

    Authors: Wangyue Li, Liangzhi Li, Tong Xiang, Xiao Liu, Wei Deng, Noa Garcia

    Abstract: Multiple-choice questions (MCQs) are widely used in the evaluation of large language models (LLMs) due to their simplicity and efficiency. However, there are concerns about whether MCQs can truly measure LLMs' capabilities, particularly in knowledge-intensive scenarios where long-form generation (LFG) answers are required. The misalignment between the task and the evaluation method demands a thoug…

    Submitted 23 May, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: LREC-COLING 2024

  16. arXiv:2403.16277  [pdf, ps, other]

    cs.RO

    Combined Task and Motion Planning Via Sketch Decompositions (Extended Version with Supplementary Material)

    Authors: Magí Dalmau-Moreno, Néstor García, Vicenç Gómez, Héctor Geffner

    Abstract: The challenge in combined task and motion planning (TAMP) is the effective integration of a search over a combinatorial space, usually carried out by a task planner, and a search over a continuous configuration space, carried out by a motion planner. Using motion planners for testing the feasibility of task plans and filling out the details is not effective because it makes the geometrical constra…

    Submitted 24 March, 2024; originally announced March 2024.

  17. arXiv:2403.07091  [pdf, other]

    cs.RO

    Sim-to-Real gap in RL: Use Case with TIAGo and Isaac Sim/Gym

    Authors: Jaume Albardaner, Alberto San Miguel, Néstor García, Magí Dalmau-Moreno

    Abstract: This paper explores policy-learning approaches in the context of sim-to-real transfer for robotic manipulation using a TIAGo mobile manipulator, focusing on two state-of-the-art simulators, Isaac Gym and Isaac Sim, both developed by Nvidia. Control architectures are discussed, with a particular emphasis on achieving collision-less movement in both simulation and the real environment. Presented results…

    Submitted 27 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted in ERF24 workshop "Towards Efficient and Portable Robot Learning for Real-World Settings". To be published in Springer Proceedings in Advanced Robotics

  18. arXiv:2403.00612  [pdf, other]

    eess.IV cs.CV

    Advancing dermatological diagnosis: Development of a hyperspectral dermatoscope for enhanced skin imaging

    Authors: Martin J. Hetz, Carina Nogueira Garcia, Sarah Haggenmüller, Titus J. Brinker

    Abstract: Clinical dermatology necessitates precision and innovation for efficient diagnosis and treatment of various skin conditions. This paper introduces the development of a cutting-edge hyperspectral dermatoscope (the Hyperscope) tailored for human skin analysis. We detail the requirements for such a device and the design considerations, from optical configurations to sensor selection, necessary to capt…

    Submitted 25 June, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: 12 pages, 11 figures

  19. arXiv:2402.14114  [pdf, other]

    cs.CV

    Multi-organ Self-supervised Contrastive Learning for Breast Lesion Segmentation

    Authors: Hugo Figueiras, Helena Aidos, Nuno Cruz Garcia

    Abstract: Self-supervised learning has proven to be an effective way to learn representations in domains where annotated labels are scarce, such as medical imaging. A widely adopted framework for this purpose is contrastive learning and it has been applied to different scenarios. This paper seeks to advance our understanding of the contrastive learning framework by exploring a novel perspective: employing m…

    Submitted 21 February, 2024; originally announced February 2024.

  20. arXiv:2312.03027  [pdf, other]

    cs.CV

    Stable Diffusion Exposed: Gender Bias from Prompt to Image

    Authors: Yankun Wu, Yuta Nakashima, Noa Garcia

    Abstract: Several studies have raised awareness about social biases in image generative models, demonstrating their predisposition towards stereotypes and imbalances. This paper contributes to this growing body of research by introducing an evaluation protocol that analyzes the impact of gender indicators at every step of the generation process on Stable Diffusion images. Leveraging insights from prior work…

    Submitted 11 August, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

  21. TOP-Former: A Multi-Agent Transformer Approach for the Team Orienteering Problem

    Authors: Daniel Fuertes, Carlos R. del-Blanco, Fernando Jaureguizar, Narciso García

    Abstract: Route planning for a fleet of vehicles is an important task in applications such as package delivery, surveillance, or transportation, often integrated within larger Intelligent Transportation Systems (ITS). This problem is commonly formulated as a Vehicle Routing Problem (VRP) known as the Team Orienteering Problem (TOP). Existing solvers for this problem primarily rely on either linear programmi…

    Submitted 20 May, 2025; v1 submitted 30 November, 2023; originally announced November 2023.

    Journal ref: IEEE Transactions on Intelligent Transportation Systems, 2025, pp. 1-12

  22. Situating the social issues of image generation models in the model life cycle: a sociotechnical approach

    Authors: Amelia Katirai, Noa Garcia, Kazuki Ide, Yuta Nakashima, Atsuo Kishimoto

    Abstract: The race to develop image generation models is intensifying, with a rapid increase in the number of text-to-image models available. This is coupled with growing public awareness of these technologies. Though other generative AI models--notably, large language models--have received recent critical attention for the social and other non-technical issues they raise, there has been relatively little c…

    Submitted 22 July, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

  23. arXiv:2307.01458  [pdf, other]

    cs.CL

    CARE-MI: Chinese Benchmark for Misinformation Evaluation in Maternity and Infant Care

    Authors: Tong Xiang, Liangzhi Li, Wangyue Li, Mingbai Bai, Lu Wei, Bowen Wang, Noa Garcia

    Abstract: The recent advances in natural language processing (NLP) have led to a new trend of applying large language models (LLMs) to real-world scenarios. While the latest LLMs are astonishingly fluent when interacting with humans, they suffer from the misinformation problem by unintentionally generating factually false statements. This can lead to harmful consequences, especially when produced within se…

    Submitted 26 October, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023 Datasets and Benchmarks Track

  24. arXiv:2305.02731  [pdf]

    cs.NE

    A Cluster-Based Opposition Differential Evolution Algorithm Boosted by a Local Search for ECG Signal Classification

    Authors: Mehran Pourvahab, Seyed Jalaleddin Mousavirad, Virginie Felizardo, Nuno Pombo, Henriques Zacarias, Hamzeh Mohammadigheymasi, Sebastião Pais, Seyed Nooreddin Jafari, Nuno M. Garcia

    Abstract: Electrocardiogram (ECG) signals, which capture the heart's electrical activity, are used to diagnose and monitor cardiac problems. The accurate classification of ECG signals, particularly for distinguishing among various types of arrhythmias and myocardial infarctions, is crucial for the early detection and treatment of heart-related diseases. This paper proposes a novel approach based on an impro…

    Submitted 6 October, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: 44 pages, 9 figures

  25. Not Only Generative Art: Stable Diffusion for Content-Style Disentanglement in Art Analysis

    Authors: Yankun Wu, Yuta Nakashima, Noa Garcia

    Abstract: The duality of content and style is inherent to the nature of art. For humans, these two elements are clearly different: content refers to the objects and concepts in the piece of art, and style to the way it is expressed. This duality poses an important challenge for computer vision. The visual appearance of objects and concepts is modulated by the style that may reflect the author's emotions, so…

    Submitted 20 April, 2023; originally announced April 2023.

  26. arXiv:2304.03693  [pdf, other]

    cs.CV

    Model-Agnostic Gender Debiased Image Captioning

    Authors: Yusuke Hirota, Yuta Nakashima, Noa Garcia

    Abstract: Image captioning models are known to perpetuate and amplify harmful societal bias in the training set. In this work, we aim to mitigate such gender bias in image captioning models. While prior work has addressed this problem by forcing models to focus on people to reduce gender misclassification, it conversely generates gender-stereotypical words at the expense of predicting the correct gender. Fr…

    Submitted 21 December, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  27. arXiv:2304.02828  [pdf, other]

    cs.CV cs.CY

    Uncurated Image-Text Datasets: Shedding Light on Demographic Bias

    Authors: Noa Garcia, Yusuke Hirota, Yankun Wu, Yuta Nakashima

    Abstract: The increasing tendency to collect large and uncurated datasets to train vision-and-language models has raised concerns about fair representations. It is known that even small but manually annotated datasets, such as MSCOCO, are affected by societal bias. This problem, far from being solved, may be getting worse with data crawled from the Internet without much control. In addition, the lack of too…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  28. arXiv:2303.12806  [pdf]

    q-bio.QM cs.CV cs.LG eess.IV

    Dermatologist-like explainable AI enhances trust and confidence in diagnosing melanoma

    Authors: Tirtha Chanda, Katja Hauser, Sarah Hobelsberger, Tabea-Clara Bucher, Carina Nogueira Garcia, Christoph Wies, Harald Kittler, Philipp Tschandl, Cristian Navarrete-Dechent, Sebastian Podlipnik, Emmanouil Chousakos, Iva Crnaric, Jovana Majstorovic, Linda Alhajwan, Tanya Foreman, Sandra Peternel, Sergei Sarap, İrem Özdemir, Raymond L. Barnhill, Mar Llamas Velasco, Gabriela Poch, Sören Korsing, Wiebke Sondermann, Frank Friedrich Gellrich, Markus V. Heppt, et al. (10 additional authors not shown)

    Abstract: Although artificial intelligence (AI) systems have been shown to improve the accuracy of initial melanoma diagnosis, the lack of transparency in how these systems identify melanoma poses severe obstacles to user acceptance. Explainable artificial intelligence (XAI) methods can help to increase transparency, but most XAI methods are unable to produce precisely located domain-specific explanations,…

    Submitted 17 March, 2023; originally announced March 2023.

  29. arXiv:2303.09227  [pdf, other]

    cs.RO

    MROS: A framework for robot self-adaptation

    Authors: Gustavo Rezende Silva, Darko Bozhinoski, Mario Garzon Oviedo, Mariano Ramírez Montero, Nadia Hammoudeh Garcia, Harshavardhan Deshpande, Andrzej Wasowski, Carlos Hernandez Corbato

    Abstract: Self-adaptation can be used in robotics to increase system robustness and reliability. This work describes the Metacontrol method for self-adaptation in robotics. In particular, it details how the MROS (Metacontrol for ROS Systems) framework implements and packages Metacontrol, and it demonstrates how MROS can be applied in a navigation scenario where a mobile robot navigates on a factory floor. Vid…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 5 pages, 4 figures, accepted at ICSE 2023 demo track

  30. A Comparative Analysis of Bias Amplification in Graph Neural Network Approaches for Recommender Systems

    Authors: Nikzad Chizari, Niloufar Shoeibi, María N. Moreno-García

    Abstract: Recommender Systems (RSs) are used to provide users with personalized item recommendations and help them overcome the problem of information overload. Currently, recommendation methods based on deep learning are gaining ground over traditional methods such as matrix factorization due to their ability to represent the complex relationships between users and items and to incorporate additional infor…

    Submitted 18 January, 2023; originally announced January 2023.

    ACM Class: I.2.1

    Journal ref: Chizari, N.; Shoeibi, N.; Moreno-García, M.N. A Comparative Analysis of Bias Amplification in Graph Neural Network Approaches for Recommender Systems. Electronics 2022, 11, 3301

  31. arXiv:2208.10758  [pdf, other]

    cs.CV cs.AI

    Learning More May Not Be Better: Knowledge Transferability in Vision and Language Tasks

    Authors: Tianwei Chen, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Hajime Nagahara

    Abstract: Is more data always better to train vision-and-language models? We study knowledge transferability in multi-modal tasks. The current tendency in machine learning is to assume that by joining multiple datasets from different tasks their overall performance will improve. However, we show that not all the knowledge transfers well or has a positive impact on related tasks, even when they share a commo…

    Submitted 23 August, 2022; originally announced August 2022.

  32. arXiv:2205.10233  [pdf, other]

    cs.CL

    RigoBERTa: A State-of-the-Art Language Model For Spanish

    Authors: Alejandro Vaca Serrano, Guillem Garcia Subies, Helena Montoro Zamorano, Nuria Aldama Garcia, Doaa Samy, David Betancur Sanchez, Antonio Moreno Sandoval, Marta Guerrero Nieto, Alvaro Barbero Jimenez

    Abstract: This paper presents RigoBERTa, a State-of-the-Art Language Model for Spanish. RigoBERTa is trained over a well-curated corpus built from different subcorpora with key features. It follows the DeBERTa architecture, which has several advantages over other architectures of similar size, such as BERT or RoBERTa. RigoBERTa's performance is assessed over 13 NLU tasks in comparison with other available Spani…

    Submitted 3 June, 2022; v1 submitted 27 April, 2022; originally announced May 2022.

  33. Gender and Racial Bias in Visual Question Answering Datasets

    Authors: Yusuke Hirota, Yuta Nakashima, Noa Garcia

    Abstract: Vision-and-language tasks have increasingly drawn more attention as a means to evaluate human-like reasoning in machine learning models. A popular task in the field is visual question answering (VQA), which aims to answer questions about images. However, VQA models have been shown to exploit language bias by learning the statistical correlations between questions and answers without looking into t…

    Submitted 3 June, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

    Comments: ACM Conference on Fairness, Accountability, and Transparency (FAccT 2022)

  34. Emerging Immersive Communication Systems: Overview, Taxonomy, and Good Practises for QoE Assessment

    Authors: Pablo Pérez, Ester Gonzalez-Sosa, Jesús Gutiérrez, Narciso García

    Abstract: Several technological and scientific advances have been achieved recently in the fields of immersive systems, which are offering new possibilities to applications and services in different communication domains, such as entertainment, virtual conferencing, working meetings, social relations, healthcare, and industry. Users of these immersive technologies can explore and experience the stimuli in a…

    Submitted 1 September, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: Frontiers in Signal Processing

    Journal ref: Front. Signal Process. (2022)

  35. arXiv:2203.15395  [pdf, other]

    cs.CV cs.MM

    Quantifying Societal Bias Amplification in Image Captioning

    Authors: Yusuke Hirota, Yuta Nakashima, Noa Garcia

    Abstract: We study societal bias amplification in image captioning. Image captioning models have been shown to perpetuate gender and racial biases; however, metrics to measure, quantify, and evaluate the societal bias in captions are not yet standardized. We provide a comprehensive study on the strengths and limitations of each metric, and propose LIC, a metric to study captioning bias amplification. We arg…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  36. arXiv:2202.01747  [pdf, other]

    cs.CV

    The Met Dataset: Instance-level Recognition for Artworks

    Authors: Nikolaos-Antonios Ypsilantis, Noa Garcia, Guangxing Han, Sarah Ibrahimi, Nanne Van Noord, Giorgos Tolias

    Abstract: This work introduces a dataset for large-scale instance-level recognition in the domain of artworks. The proposed benchmark exhibits a number of different challenges such as large inter-class similarity, long tail distribution, and many classes. We rely on the open access collection of The Met museum to form a large training set of about 224k classes, where each class corresponds to a museum exhib…

    Submitted 3 February, 2022; originally announced February 2022.

  37. Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties

    Authors: Mohamed S. Kraiem, Fernando Sánchez-Hernández, María N. Moreno-García

    Abstract: In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples. Thus, the prediction model is unreliable although the ov…

    Submitted 15 December, 2021; originally announced January 2022.

    Comments: Kraiem, M.S., Sánchez-Hernández, F., Moreno-García, M.N. Selecting the Suitable Resampling Strategy for Imbalanced Data Classification Regarding Dataset Properties. An Approach Based on Association Models. Appl. Sci. 2021, 11(18), 8546, 2021

    ACM Class: I.2.1

    Journal ref: Appl. Sci. 2021, 11(18), 8546, 2021

  38. arXiv:2110.13395  [pdf, other]

    cs.CV cs.AI

    Transferring Domain-Agnostic Knowledge in Video Question Answering

    Authors: Tianran Wu, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Haruo Takemura

    Abstract: Video question answering (VideoQA) is designed to answer a given question based on a relevant video clip. The currently available large-scale datasets have made it possible to formulate VideoQA as the joint understanding of visual and language information. However, this training procedure is costly and still falls short of human performance. In this paper, we investigate a transfer learning met…

    Submitted 25 October, 2021; originally announced October 2021.

  39. arXiv:2110.07430  [pdf, other]

    cs.LG stat.CO stat.ME

    Detecting Renewal States in Chains of Variable Length via Intrinsic Bayes Factors

    Authors: Victor Freguglia, Nancy Garcia

    Abstract: Markov chains with variable length are useful parsimonious stochastic models able to generate most stationary sequences of discrete symbols. The idea is to identify the suffixes of the past, called contexts, that are relevant to predict the future symbol. Sometimes a single state is a context, and looking at the past and finding this specific state makes the further past irrelevant. States with suc…

    Submitted 6 January, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 25 pages, 3 figures

  40. arXiv:2109.11231  [pdf]

    cs.IR

    Dynamic inference of user context through social tag embedding for music recommendation

    Authors: Diego Sánchez-Moreno, Álvaro Lozano Murciego, Vivian F. López Batista, María Dolores Muñoz Vicente, María N. Moreno-García

    Abstract: Music listening preferences at a given time depend on a wide range of contextual factors, such as user emotional state, location and activity at listening time, the day of the week, the time of the day, etc. It is therefore of great importance to take them into account when recommending music. However, it is very difficult to develop context-aware recommender systems that consider these factors, b…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: 15th ACM Conference on Recommender Systems-Workshop on Context-Aware Recommender Systems (RECSYS 2021-CARS)

  41. arXiv:2109.05743  [pdf, other]

    cs.CV cs.AI cs.CL

    Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

    Authors: Zechen Bai, Yuta Nakashima, Noa Garcia

    Abstract: Have you ever looked at a painting and wondered what is the story behind it? This work presents a framework to bring art closer to people by generating comprehensive descriptions of fine-art paintings. Generating informative descriptions for artworks, however, is extremely challenging, as it requires to 1) describe multiple aspects of the image such as its style, content, or composition, and 2) pr…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: ICCV 2021

  42. arXiv:2108.06432  [pdf, other]

    cs.CV

    Soccer line mark segmentation and classification with stochastic watershed transform

    Authors: Daniel Berjón, Carlos Cuevas, Narciso García

    Abstract: Augmented reality applications are beginning to change the way sports are broadcast, providing richer experiences and valuable insights to fans. The first step of augmented reality systems is camera calibration, possibly based on detecting the line markings of the playing field. Most existing proposals for line detection rely on edge detection and Hough transform, but radial distortion and extrane…

    Submitted 3 August, 2022; v1 submitted 13 August, 2021; originally announced August 2021.

    Comments: 18 pages, 11 figures

    ACM Class: I.4.6

  43. arXiv:2106.13445  [pdf, other]

    cs.CV

    A Picture May Be Worth a Hundred Words for Visual Question Answering

    Authors: Yusuke Hirota, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Ittetsu Taniguchi, Takao Onoye

    Abstract: How far can we go with textual representations for understanding pictures? In image understanding, it is essential to use concise but detailed image representations. Deep visual features extracted by vision models, such as Faster R-CNN, are prevailing used in multiple tasks, and especially in visual question answering (VQA). However, conventional deep visual features may struggle to convey all the…

    Submitted 25 June, 2021; originally announced June 2021.

  44. arXiv:2105.11852  [pdf]

    cs.LG cs.CV

    GCNBoost: Artwork Classification by Label Propagation through a Knowledge Graph

    Authors: Cheikh Brahim El Vaigh, Noa Garcia, Benjamin Renoust, Chenhui Chu, Yuta Nakashima, Hajime Nagahara

    Abstract: The rise of digitization of cultural documents offers large-scale contents, opening the road for development of AI systems in order to preserve, search, and deliver cultural heritage. To organize such cultural content also means to classify them, a task that is very familiar to modern computer science. Contextual information is often the key to structure such real world data, and we propose to use…

    Submitted 25 May, 2021; originally announced May 2021.

  45. Subjective Assessment Experiments That Recruit Few Observers With Repetitions (FOWR)

    Authors: Pablo Perez, Lucjan Janowski, Narciso Garcia, Margaret Pinson

    Abstract: Recent studies have shown that it is possible to characterize subject bias and variance in subjective assessment tests. Apparent differences among subjects can, for the most part, be explained by random factors. Building on that theory, we propose a subjective test design where three to four team members each rate the stimuli multiple times. The results are comparable to a high performing objectiv…

    Submitted 20 July, 2022; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: IEEE Transactions on Multimedia

  46. arXiv:2104.01406  [pdf]

    cs.NI

    The Internet Protocol -- Past, some current limitations and a glimpse of a possible future

    Authors: Nuno M. Garcia

    Abstract: The network layer is central to the networking scientific area. It is around the network layer that all the data communications develop, and one of its main tasks is to allow the identification of each single interface/machine between the potentially many interfaces in a network. This seminar addresses some of the issues that are usually presented to young Computer Science Engineering students in…

    Submitted 3 April, 2021; originally announced April 2021.

  47. Methodology to Assess Quality, Presence, Empathy, Attitude, and Attention in 360-degree Videos for Immersive Communications

    Authors: Marta Orduna, Pablo Pérez, Jesús Gutiérrez, Narciso García

    Abstract: This paper analyzes the joint assessment of quality, spatial and social presence, empathy, attitude, and attention in three conditions: (A) visualizing and rating the quality of contents in a Head-Mounted Display (HMD), (B) visualizing the contents in an HMD, and (C) visualizing the contents in an HMD where participants can see their hands and take notes. The experiment simulates an immersive communic…

    Submitted 9 February, 2022; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: IEEE Transactions on Affective Computing, Early Access

  48. arXiv:2101.05479  [pdf, other]

    cs.CV cs.LG

    Understanding the Role of Scene Graphs in Visual Question Answering

    Authors: Vinay Damodaran, Sharanya Chakravarthy, Akshay Kumar, Anjana Umapathy, Teruko Mitamura, Yuta Nakashima, Noa Garcia, Chenhui Chu

    Abstract: Visual Question Answering (VQA) is of tremendous interest to the research community with important applications such as aiding visually impaired users and image-based search. In this work, we explore the use of scene graphs for solving the VQA task. We conduct experiments on the GQA dataset which presents a challenging set of questions requiring counting, compositionality and advanced reasoning ca…

    Submitted 16 January, 2021; v1 submitted 14 January, 2021; originally announced January 2021.

  49. arXiv:2010.09145  [pdf, other]

    cs.RO cs.SE

    MROS: Runtime Adaptation For Robot Control Architectures

    Authors: Darko Bozhinoski, Carlos Hernandez Corbato, Mario Garzon Oviedo, Gijs van der Hoorn, Nadia Hammoudeh Garcia, Harshavardhan Deshpande, Jon Tjerngren, Andrzej Wasowski

    Abstract: Known attempts to build autonomous robots rely on complex control architectures, often implemented with the Robot Operating System platform (ROS). Runtime adaptation is needed in these systems, to cope with component failures and with contingencies arising from dynamic environments; otherwise, these affect the reliability and quality of the mission execution. Existing proposals on how to build self…

    Submitted 23 November, 2021; v1 submitted 18 October, 2020; originally announced October 2020.

  50. arXiv:2009.14545  [pdf, other]

    cs.CV cs.SI

    Demographic Influences on Contemporary Art with Unsupervised Style Embeddings

    Authors: Nikolai Huckle, Noa Garcia, Yuta Nakashima

    Abstract: Computational art analysis has, through its reliance on classification tasks, prioritised historical datasets in which the artworks are already well sorted with the necessary annotations. Art produced today, on the other hand, is numerous and easily accessible, through the internet and social networks that are used by professional and amateur artists alike to display their work. Although this art,…

    Submitted 1 December, 2020; v1 submitted 30 September, 2020; originally announced September 2020.

    Comments: To be published in Proceedings of the European Conference in Computer Vision Workshops 2020