Showing 1–50 of 79 results for author: Vazquez, D

Searching in archive cs.
  1. arXiv:2505.20793  [pdf, other]

    cs.CV cs.AI

    Rendering-Aware Reinforcement Learning for Vector Graphics Generation

    Authors: Juan A. Rodriguez, Haotian Zhang, Abhay Puri, Aarash Feizi, Rishav Pramanik, Pascal Wichmann, Arnab Mondal, Mohammad Reza Samsami, Rabiul Awal, Perouz Taslakian, Spandana Gella, Sai Rajeswar, David Vazquez, Christopher Pal, Marco Pedersoli

    Abstract: Scalable Vector Graphics (SVG) offer a powerful format for representing visual designs as interpretable code. Recent advances in vision-language models (VLMs) have enabled high-quality SVG generation by framing the problem as a code generation task and leveraging large-scale pretraining. VLMs are particularly suitable for this task as they capture both global semantics and fine-grained visual patt…

    Submitted 27 May, 2025; originally announced May 2025.

  2. arXiv:2504.17069  [pdf, other]

    cs.CV cs.AI

    Distilling semantically aware orders for autoregressive image generation

    Authors: Rishav Pramanik, Antoine Poupon, Juan A. Rodriguez, Masih Aminbeidokhti, David Vazquez, Christopher Pal, Zhaozheng Yin, Marco Pedersoli

    Abstract: Autoregressive patch-based image generation has recently shown competitive results in terms of image quality and scalability. It can also be easily integrated and scaled within Vision-Language models. Nevertheless, autoregressive models require a defined order for patch generation. While a natural order based on the dictation of the words makes sense for text generation, there is no inherent gener…

    Submitted 23 April, 2025; originally announced April 2025.

  3. arXiv:2504.07421  [pdf, other]

    cs.CL

    AgentAda: Skill-Adaptive Data Analytics for Tailored Insight Discovery

    Authors: Amirhossein Abaskohi, Amrutha Varshini Ramesh, Shailesh Nanisetty, Chirag Goel, David Vazquez, Christopher Pal, Spandana Gella, Giuseppe Carenini, Issam H. Laradji

    Abstract: We introduce AgentAda, the first LLM-powered analytics agent that can learn and use new analytics skills to extract more specialized insights. Unlike existing methods that require users to manually decide which data analytics method to apply, AgentAda automatically identifies the skill needed from a library of analytical skills to perform the analysis. This also allows AgentAda to use skills that…

    Submitted 9 April, 2025; originally announced April 2025.

  4. arXiv:2503.21889  [pdf, other]

    cs.CV cs.AI cs.LG

    StarFlow: Generating Structured Workflow Outputs From Sketch Images

    Authors: Patrice Bechard, Chao Wang, Amirhossein Abaskohi, Juan Rodriguez, Christopher Pal, David Vazquez, Spandana Gella, Sai Rajeswar, Perouz Taslakian

    Abstract: Workflows are a fundamental component of automation in enterprise platforms, enabling the orchestration of tasks, data processing, and system integrations. Despite being widely used, building workflows can be complex, often requiring manual configuration through low-code platforms or visual programming tools. To simplify this process, we explore the use of generative foundation models, particularl…

    Submitted 27 March, 2025; originally announced March 2025.

  5. arXiv:2503.15661  [pdf, other]

    cs.CV cs.AI cs.CL

    UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction

    Authors: Shravan Nayak, Xiangru Jian, Kevin Qinghong Lin, Juan A. Rodriguez, Montek Kalsi, Rabiul Awal, Nicolas Chapados, M. Tamer Özsu, Aishwarya Agrawal, David Vazquez, Christopher Pal, Perouz Taslakian, Spandana Gella, Sai Rajeswar

    Abstract: Autonomous agents that navigate Graphical User Interfaces (GUIs) to automate tasks like document editing and file management can greatly enhance computer workflows. While existing research focuses on online settings, desktop environments, critical for many professional and everyday tasks, remain underexplored due to data collection challenges and licensing issues. We introduce UI-Vision, the first…

    Submitted 6 May, 2025; v1 submitted 19 March, 2025; originally announced March 2025.

    Comments: This paper has been accepted to the 41st International Conference on Machine Learning (ICML 2025)

  6. arXiv:2502.01341  [pdf, other]

    cs.CL

    AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

    Authors: Ahmed Masry, Juan A. Rodriguez, Tianyu Zhang, Suyuchen Wang, Chao Wang, Aarash Feizi, Akshay Kalkunte Suresh, Abhay Puri, Xiangru Jian, Pierre-André Noël, Sathwik Tejaswi Madhusudhan, Marco Pedersoli, Bang Liu, Nicolas Chapados, Yoshua Bengio, Enamul Hoque, Christopher Pal, Issam H. Laradji, David Vazquez, Perouz Taslakian, Spandana Gella, Sai Rajeswar

    Abstract: Aligning visual features with language embeddings is a key challenge in vision-language models (VLMs). The performance of such models hinges on having a good connector that maps visual features generated by a vision encoder to a shared embedding space with the LLM while preserving semantic similarity. Existing connectors, such as multilayer perceptrons (MLPs), often produce out-of-distribution or…

    Submitted 3 February, 2025; originally announced February 2025.

  7. Der Effizienz- und Intelligenzbegriff in der Lexikographie und kuenstlichen Intelligenz: kann ChatGPT die lexikographische Textsorte nachbilden?

    Authors: Ivan Arias-Arias, Maria Jose Dominguez Vazquez, Carlos Valcarcel Riveiro

    Abstract: By means of pilot experiments for the language pair German and Galician, this paper examines the concept of efficiency and intelligence in lexicography and artificial intelligence, AI. The aim of the experiments is to gain empirically and statistically based insights into the lexicographical text type, dictionary article, in the responses of ChatGPT 3.5, as well as into the lexicographical data on…

    Submitted 11 December, 2024; originally announced December 2024.

    Comments: 25 pages, in German language

    Journal ref: Lexikos 34 (2024): 51-76

  8. arXiv:2412.04626  [pdf, other]

    cs.LG cs.CL

    BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks

    Authors: Juan Rodriguez, Xiangru Jian, Siba Smarak Panigrahi, Tianyu Zhang, Aarash Feizi, Abhay Puri, Akshay Kalkunte, François Savard, Ahmed Masry, Shravan Nayak, Rabiul Awal, Mahsa Massoud, Amirhossein Abaskohi, Zichao Li, Suyuchen Wang, Pierre-André Noël, Mats Leon Richter, Saverio Vadacchino, Shubham Agarwal, Sanket Biswas, Sara Shanian, Ying Zhang, Noah Bolger, Kurt MacDonald, Simon Fauvel, et al. (18 additional authors not shown)

    Abstract: Multimodal AI has the potential to significantly enhance document-understanding tasks, such as processing receipts, understanding workflows, extracting data from documents, and summarizing reports. Code generation tasks that require long-structured outputs can also be enhanced by multimodality. Despite this, their use in commercial applications is often limited due to limited access to training da…

    Submitted 17 March, 2025; v1 submitted 5 December, 2024; originally announced December 2024.

    Comments: The project is hosted at https://bigdocs.github.io

    Journal ref: ICLR 2025 https://openreview.net/forum?id=UTgNFcpk0j

  9. arXiv:2411.10670  [pdf, other]

    cs.CL

    IntentGPT: Few-shot Intent Discovery with Large Language Models

    Authors: Juan A. Rodriguez, Nicholas Botzer, David Vazquez, Christopher Pal, Marco Pedersoli, Issam Laradji

    Abstract: In today's digitally driven world, dialogue systems play a pivotal role in enhancing user interactions, from customer service to virtual assistants. In these dialogues, it is important to identify user's goals automatically to resolve their needs promptly. This has necessitated the integration of models that perform Intent Detection. However, users' intents are diverse and dynamic, making it chall…

    Submitted 15 November, 2024; originally announced November 2024.

    Comments: ICLR 2024 Workshop on LLM Agents

  10. Libros en abierto de las editoriales universitarias españolas

    Authors: Rosana Lopez-Carreño, Angel-Maria Delgado Vazquez, Francisco-Javier Martinez-Mendez

    Abstract: This paper analyses the set of scientific publications in open access, other than journals (monographs, conferences proceedings, teaching materials and grey literature), published by Spanish public universities, studying their volume, documentary typology, level of description and open access policies with the aim of measuring their degree of incorporation and compliance with the principles of Ope…

    Submitted 27 September, 2024; originally announced September 2024.

    Comments: 10 pages, in Spanish language, 3 figures, 4 tables

    Journal ref: El Profesional de la informacion, v. 30, n. 1, 2021

  11. arXiv:2407.06423  [pdf, other]

    cs.AI

    InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation

    Authors: Gaurav Sahu, Abhay Puri, Juan Rodriguez, Amirhossein Abaskohi, Mohammad Chegini, Alexandre Drouin, Perouz Taslakian, Valentina Zantedeschi, Alexandre Lacoste, David Vazquez, Nicolas Chapados, Christopher Pal, Sai Rajeswar Mudumba, Issam Hadj Laradji

    Abstract: Data analytics is essential for extracting valuable insights from data that can assist organizations in making effective decisions. We introduce InsightBench, a benchmark dataset with three key features. First, it consists of 100 datasets representing diverse business use cases such as finance and incident management, each accompanied by a carefully curated set of insights planted in the datasets.…

    Submitted 27 February, 2025; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted to ICLR 2025

  12. arXiv:2407.01556  [pdf, other]

    cs.CY

    A Taxonomy of the Biases of the Images created by Generative Artificial Intelligence

    Authors: Adriana Fernández de Caleya Vázquez, Eduardo C. Garrido-Merchán

    Abstract: Generative artificial intelligence models show an amazing performance creating unique content automatically just by being given a prompt by the user, which is revolutionizing several fields such as marketing and design. Not only are there models whose generated output belongs to the text format but we also find models that are able to automatically generate high quality genuine images and videos g…

    Submitted 2 May, 2024; originally announced July 2024.

  13. arXiv:2406.11811  [pdf, other]

    cs.CL cs.AI

    RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content

    Authors: Joao Monteiro, Pierre-Andre Noel, Etienne Marcotte, Sai Rajeswar, Valentina Zantedeschi, David Vazquez, Nicolas Chapados, Christopher Pal, Perouz Taslakian

    Abstract: Large Language Models (LLMs) are trained on vast amounts of data, most of which is automatically scraped from the internet. This data includes encyclopedic documents that harbor a vast amount of general knowledge (e.g., Wikipedia) but also potentially overlap with benchmark datasets used for evaluating LLMs. Consequently, evaluating models on test splits that might have leaked into the training se…

    Submitted 5 November, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  14. arXiv:2404.15420  [pdf, other]

    cs.CL cs.AI

    XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference

    Authors: João Monteiro, Étienne Marcotte, Pierre-André Noël, Valentina Zantedeschi, David Vázquez, Nicolas Chapados, Christopher Pal, Perouz Taslakian

    Abstract: In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference information. Just-in-time processing of a context is inefficient due to the quadratic cost of self-attention operations, and caching is desirable. However, caching transformer states can easily require almost as much space as the model parameters. When the right contex…

    Submitted 1 November, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  15. arXiv:2404.12503  [pdf, other]

    cs.AR

    STRELA: STReaming ELAstic CGRA Accelerator for Embedded Systems

    Authors: Daniel Vazquez, Jose Miranda, Alfonso Rodriguez, Andres Otero, Pascuale Davide Schiavone, David Atienza

    Abstract: Reconfigurable computing offers a good balance between flexibility and energy efficiency. When combined with software-programmable devices such as CPUs, it is possible to obtain higher performance by spatially distributing the parallelizable sections of an application throughout the reconfigurable device while the CPU is in charge of control-intensive sections. This work introduces an elastic Coar…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 14 pages, 11 figures

  16. arXiv:2403.07718  [pdf, other]

    cs.LG cs.AI

    WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?

    Authors: Alexandre Drouin, Maxime Gasse, Massimo Caccia, Issam H. Laradji, Manuel Del Verme, Tom Marty, Léo Boisvert, Megh Thakkar, Quentin Cappart, David Vazquez, Nicolas Chapados, Alexandre Lacoste

    Abstract: We study the use of large language model-based agents for interacting with software via web browsers. Unlike prior work, we focus on measuring the agents' ability to perform tasks that span the typical daily work of knowledge workers utilizing enterprise software systems. To this end, we propose WorkArena, a remote-hosted benchmark of 33 tasks based on the widely-used ServiceNow platform. We also…

    Submitted 23 July, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: 21 pages, 11 figures, preprint

  17. Recursos lexicográficos electrónicos multilingües y plurilingües: definición y clasificación tipológico-descriptiva

    Authors: María José Domínguez Vázquez

    Abstract: The aim of this paper is to provide a classification of multilingual and plurilingual electronic lexicographic resources which would enable, on the one hand, the implementation of quantitative and qualitative criteria to produce a typological taxonomy of lexicographical tools, such as dictionaries, as opposed to platforms and websites and, on the other, the distinction of multilingual and plurili…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: 25 pages, in Spanish language

    Journal ref: Revista Internacional de Lenguas Extranjeras, 10, 2019

  18. Zur Darstellung eines mehrstufigen Prototypbegriffs in der multilingualen automatischen Sprachgenerierung: vom Korpus über word embeddings bis hin zum automatischen Wörterbuch

    Authors: María José Domínguez Vázquez

    Abstract: The multilingual dictionary of noun valency Portlex is considered to be the trigger for the creation of the automatic language generators Xera and Combinatoria, whose development and use is presented in this paper. Both prototypes are used for the automatic generation of nominal phrases with their mono- and bi-argumental valence slots, which could be used, among others, as dictionary examples or a…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: 31 pages, in German language

    Journal ref: Lexikos 31 (AFRILEX-reeks/series 31: 2021):

  19. Contribución de la semántica combinatoria al desarrollo de herramientas digitales multilingües

    Authors: María José Domínguez Vázquez

    Abstract: This paper describes how the field of Combinatorial Semantics has contributed to the design of three prototypes for the automatic generation of argument patterns in nominal phrases in Spanish, French and German (Xera, Combinatoria and CombiContext). It also shows the importance of knowing about the argument syntactic-semantic interface in a production situation in the context of foreign languages.…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: 18 pages, in Spanish language. Círculo de lingüística aplicada a la comunicación, 2022

  20. arXiv:2312.13876  [pdf, other]

    cs.LG cs.CL stat.ML

    Capture the Flag: Uncovering Data Insights with Large Language Models

    Authors: Issam Laradji, Perouz Taslakian, Sai Rajeswar, Valentina Zantedeschi, Alexandre Lacoste, Nicolas Chapados, David Vazquez, Christopher Pal, Alexandre Drouin

    Abstract: The extraction of a small number of relevant insights from vast amounts of data is a crucial component of data-driven decision-making. However, accomplishing this task requires considerable technical skills, domain expertise, and human labor. This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data, leveraging recent advances in reasonin…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: 14 pages, 1 figure, Foundation Models for Decision Making Workshop at NeurIPS 2023

  21. arXiv:2312.11556  [pdf, ps, other]

    cs.CV cs.AI cs.CL

    StarVector: Generating Scalable Vector Graphics Code from Images and Text

    Authors: Juan A. Rodriguez, Abhay Puri, Shubham Agarwal, Issam H. Laradji, Pau Rodriguez, Sai Rajeswar, David Vazquez, Christopher Pal, Marco Pedersoli

    Abstract: Scalable Vector Graphics (SVGs) are vital for modern image rendering due to their scalability and versatility. Previous SVG generation methods have focused on curve-based vectorization, lacking semantic understanding, often producing artifacts, and struggling with SVG primitives beyond path curves. To address these issues, we introduce StarVector, a multimodal large language model for SVG generati…

    Submitted 31 May, 2025; v1 submitted 17 December, 2023; originally announced December 2023.

  22. arXiv:2310.18807  [pdf, other]

    cs.AI cs.CV

    OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning

    Authors: Rim Assouel, Pau Rodriguez, Perouz Taslakian, David Vazquez, Yoshua Bengio

    Abstract: A key aspect of human intelligence is the ability to imagine -- composing learned concepts in novel ways -- to make sense of new scenarios. Such capacity is not yet attained for machine learning systems. In this work, in the context of visual reasoning, we show how modularity can be leveraged to derive a compositional data augmentation framework inspired by imagination. Our method, denoted Object-…

    Submitted 28 October, 2023; originally announced October 2023.

  23. arXiv:2310.18555  [pdf, other]

    cs.LG

    Group Robust Classification Without Any Group Information

    Authors: Christos Tsirigotis, Joao Monteiro, Pau Rodriguez, David Vazquez, Aaron Courville

    Abstract: Empirical risk minimization (ERM) is sensitive to spurious correlations in the training data, which poses a significant risk when deploying systems trained under this paradigm in high-stake applications. While the existing literature focuses on maximizing group-balanced or worst-group accuracy, estimating these accuracies is hindered by costly bias annotations. This study contends that current bia…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023). Code is available at https://github.com/tsirif/uLA

  24. arXiv:2308.11480  [pdf, other]

    cs.LG cs.AI cs.CV

    Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection

    Authors: Charles Guille-Escuret, Pierre-André Noël, Ioannis Mitliagkas, David Vazquez, Joao Monteiro

    Abstract: Improving the reliability of deployed machine learning systems often involves developing methods to detect out-of-distribution (OOD) inputs. However, existing research often narrowly focuses on samples from classes that are absent from the training set, neglecting other types of plausible distribution shifts. This limitation reduces the applicability of these methods in real-world scenarios, where…

    Submitted 9 December, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

  25. arXiv:2307.11808  [pdf, other]

    cs.CV

    Automatic Data Augmentation Learning using Bilevel Optimization for Histopathological Images

    Authors: Saypraseuth Mounsaveng, Issam Laradji, David Vázquez, Marco Perdersoli, Ismail Ben Ayed

    Abstract: Training a deep learning model to classify histopathological images is challenging, because of the color and shape variability of the cells and tissues, and the reduced amount of available data, which does not allow proper learning of those variations. Variations can come from the image acquisition process, for example, due to different cell staining protocols or tissue deformation. To tackle this…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: arXiv admin note: text overlap with arXiv:2006.14699

  26. arXiv:2306.03831  [pdf, other]

    cs.LG cs.CV

    GEO-Bench: Toward Foundation Models for Earth Monitoring

    Authors: Alexandre Lacoste, Nils Lehmann, Pau Rodriguez, Evan David Sherwin, Hannah Kerner, Björn Lütjens, Jeremy Andrew Irvin, David Dao, Hamed Alemohammad, Alexandre Drouin, Mehmet Gunturkun, Gabriel Huang, David Vazquez, Dava Newman, Yoshua Bengio, Stefano Ermon, Xiao Xiang Zhu

    Abstract: Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to substantial increases in generalization to downstream tasks. Such models, recently coined foundation models, have been transformational to the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote s…

    Submitted 23 December, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: text overlap with arXiv:2112.00570

  27. arXiv:2306.01729  [pdf, other]

    cs.CL cs.AI

    Improving Generalization in Task-oriented Dialogues with Workflows and Action Plans

    Authors: Stefania Raimondo, Christopher Pal, Xiaotian Liu, David Vazquez, Hector Palacios

    Abstract: Task-oriented dialogue is difficult in part because it involves understanding user intent, collecting information from the user, executing API calls, and generating helpful and fluent responses. However, for complex tasks one must also correctly do all of these things over multiple steps, and in a specific order. While large pre-trained language models can be fine-tuned end-to-end to create multi-…

    Submitted 2 June, 2023; originally announced June 2023.

  28. arXiv:2306.00800  [pdf, other]

    cs.CV cs.AI

    FigGen: Text to Scientific Figure Generation

    Authors: Juan A Rodriguez, David Vazquez, Issam Laradji, Marco Pedersoli, Pau Rodriguez

    Abstract: The generative modeling landscape has experienced tremendous growth in recent years, particularly in generating natural images and art. Recent techniques have shown impressive potential in creating complex visual compositions while delivering impressive realism and quality. However, state-of-the-art methods have been focusing on the narrow domain of natural images, while other distributions remain…

    Submitted 17 December, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Published at ICLR 2023 as a Tiny Paper

  29. arXiv:2302.05507  [pdf, other]

    cs.CL cs.AI cs.LG

    Language Decision Transformers with Exponential Tilt for Interactive Text Environments

    Authors: Nicolas Gontier, Pau Rodriguez, Issam Laradji, David Vazquez, Christopher Pal

    Abstract: Text-based game environments are challenging because agents must deal with long sequences of text, execute compositional actions using text and learn from sparse rewards. We address these challenges by proposing Language Decision Transformers (LDTs), a framework that is based on transformer language models and decision transformers (DTs). Our LDTs extend DTs with 3 components: (1) exponential tilt…

    Submitted 17 November, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

    Comments: 19 pages, 6 figures, 5 tables

  30. arXiv:2212.06833  [pdf, other]

    cs.CV cs.AI cs.LG

    3rd Continual Learning Workshop Challenge on Egocentric Category and Instance Level Object Understanding

    Authors: Lorenzo Pellegrini, Chenchen Zhu, Fanyi Xiao, Zhicheng Yan, Antonio Carta, Matthias De Lange, Vincenzo Lomonaco, Roshan Sumbaly, Pau Rodriguez, David Vazquez

    Abstract: Continual Learning, also known as Lifelong or Incremental Learning, has recently gained renewed interest among the Artificial Intelligence research community. Recent research efforts have quickly led to the design of novel algorithms able to reduce the impact of the catastrophic forgetting phenomenon in deep neural networks. Due to this surge of interest in the field, many competitions have been h…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: 21 pages, 12 figures, 5 tables

  31. arXiv:2211.10747  [pdf, other]

    stat.ML cs.LG

    Exploring validation metrics for offline model-based optimisation with diffusion models

    Authors: Christopher Beckham, Alexandre Piche, David Vazquez, Christopher Pal

    Abstract: In model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of reward with respect to a black box function called the (ground truth) oracle, which is expensive to compute since it involves executing a real world process. In offline MBO we wish to do so without assuming access to such an oracle during training or validation, with mak…

    Submitted 13 January, 2024; v1 submitted 19 November, 2022; originally announced November 2022.

  32. arXiv:2211.05213  [pdf, other]

    cs.LG stat.ML

    Flaky Performances when Pretraining on Relational Databases

    Authors: Shengchao Liu, David Vazquez, Jian Tang, Pierre-André Noël

    Abstract: We explore the downstream task performances for graph neural network (GNN) self-supervised learning (SSL) methods trained on subgraphs extracted from relational databases (RDBs). Intuitively, this joint use of SSL and GNNs should allow to leverage more of the available data, which could translate to better results. However, we found that naively porting contrastive SSL techniques can cause ``negat…

    Submitted 9 November, 2022; originally announced November 2022.

  33. arXiv:2210.12272  [pdf, other]

    stat.ML cs.LG cs.RO

    Implicit Offline Reinforcement Learning via Supervised Learning

    Authors: Alexandre Piche, Rafael Pardinas, David Vazquez, Igor Mordatch, Chris Pal

    Abstract: Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels. It is as simple as supervised learning and Behavior Cloning (BC), but takes advantage of return information. On datasets collected by policies of similar expertise, implicit BC has been shown to match or outperform exp…

    Submitted 21 October, 2022; originally announced October 2022.

  34. arXiv:2210.11248  [pdf, other]

    cs.CV

    OCR-VQGAN: Taming Text-within-Image Generation

    Authors: Juan A. Rodriguez, David Vazquez, Issam Laradji, Marco Pedersoli, Pau Rodriguez

    Abstract: Synthetic image generation has recently experienced significant improvements in domains such as natural image or art generation. However, the problem of figure and diagram generation remains unexplored. A challenging aspect of generating figures and diagrams is effectively rendering readable texts within the images. To alleviate this problem, we present OCR-VQGAN, an image encoder, and decoder tha…

    Submitted 21 October, 2022; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: Paper accepted at WACV 2023

  35. arXiv:2210.01742  [pdf, other]

    cs.LG cs.CV

    CADet: Fully Self-Supervised Out-Of-Distribution Detection With Contrastive Learning

    Authors: Charles Guille-Escuret, Pau Rodriguez, David Vazquez, Ioannis Mitliagkas, Joao Monteiro

    Abstract: Handling out-of-distribution (OOD) samples has become a major stake in the real-world deployment of machine learning systems. This work explores the use of self-supervised contrastive learning to the simultaneous detection of two types of OOD samples: unseen classes and adversarial perturbations. First, we pair self-supervised contrastive learning with the maximum mean discrepancy (MMD) two-sample…

    Submitted 9 December, 2024; v1 submitted 4 October, 2022; originally announced October 2022.

    Journal ref: Advances in Neural Information Processing Systems 36 (2024)

  36. arXiv:2208.14488  [pdf, other]

    cs.LG cs.AI cs.CV

    Constraining Representations Yields Models That Know What They Don't Know

    Authors: Joao Monteiro, Pau Rodriguez, Pierre-Andre Noel, Issam Laradji, David Vazquez

    Abstract: A well-known failure mode of neural networks is that they may confidently return erroneous predictions. Such unsafe behaviour is particularly frequent when the use case slightly differs from the training context, and/or in the presence of an adversary. This work presents a novel direction to address these issues in a broad, general manner: imposing class-aware constraints on a model's internal act…

    Submitted 19 April, 2023; v1 submitted 30 August, 2022; originally announced August 2022.

    Comments: CR version published at ICLR 2023

  37. arXiv:2205.11690  [pdf, other]

    cs.CL

    Workflow Discovery from Dialogues in the Low Data Regime

    Authors: Amine El Hattami, Stefania Raimondo, Issam Laradji, David Vazquez, Pau Rodriguez, Chris Pal

    Abstract: Text-based dialogues are now widely used to solve real-world problems. In cases where solution strategies are already known, they can sometimes be codified into workflows and used to guide humans or artificial agents through the task of helping clients. We introduce a new problem formulation that we call Workflow Discovery (WD) in which we are interested in the situation where a formal workflow ma…

    Submitted 11 February, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

  38. arXiv:2204.01959  [pdf, other]

    cs.CL cs.AI

    Data Augmentation for Intent Classification with Off-the-shelf Large Language Models

    Authors: Gaurav Sahu, Pau Rodriguez, Issam H. Laradji, Parmida Atighehchian, David Vazquez, Dzmitry Bahdanau

    Abstract: Data augmentation is a widely employed technique to alleviate the problem of data scarcity. In this work, we propose a prompting-based approach to generate labelled training data for intent classification with off-the-shelf language models (LMs) such as GPT-3. An advantage of this method is that no task-specific LM-fine-tuning for data generation is required; hence the method requires no hyper-par…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: Accepted to 4th Workshop on NLP for Conversational AI, ACL 2022

  39. arXiv:2203.16662  [pdf, other]

    stat.ML cs.LG

    Overcoming challenges in leveraging GANs for few-shot data augmentation

    Authors: Christopher Beckham, Issam Laradji, Pau Rodriguez, David Vazquez, Derek Nowrouzezahrai, Christopher Pal

    Abstract: In this paper, we explore the use of GAN-based few-shot data augmentation as a method to improve few-shot classification performance. We perform an exploration into how a GAN can be fine-tuned for such a task (one of which is in a class-incremental manner), as well as a rigorous empirical investigation into how well these models can perform to improve few-shot classification. We identify issues re…

    Submitted 8 August, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: v3 of the paper, various changes including better figures, CIFAR-100 results, and precision-recall metrics

  40. arXiv:2112.00570  [pdf, other]

    cs.LG physics.geo-ph

    Toward Foundation Models for Earth Monitoring: Proposal for a Climate Change Benchmark

    Authors: Alexandre Lacoste, Evan David Sherwin, Hannah Kerner, Hamed Alemohammad, Björn Lütjens, Jeremy Irvin, David Dao, Alex Chang, Mehmet Gunturkun, Alexandre Drouin, Pau Rodriguez, David Vazquez

    Abstract: Recent progress in self-supervision shows that pre-training large neural networks on vast amounts of unsupervised data can lead to impressive increases in generalisation for downstream tasks. Such models, recently coined as foundation models, have been transformational to the field of natural language processing. While similar models have also been trained on large corpuses of images, they are not…

    Submitted 1 December, 2021; originally announced December 2021.

  41. arXiv:2111.12172  [pdf, other]

    cs.CV cs.AI cs.LG

    Multi-label Iterated Learning for Image Classification with Label Ambiguity

    Authors: Sai Rajeswar, Pau Rodriguez, Soumye Singhal, David Vazquez, Aaron Courville

    Abstract: Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks. Recent studies have shown that datasets like ImageNet are weakly labeled since images with multiple object classes present are assigned a single label. This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data.…

    Submitted 23 November, 2021; originally announced November 2021.

  42. arXiv:2110.14711  [pdf, other]

    cs.CV cs.AI cs.LG

    A Survey of Self-Supervised and Few-Shot Object Detection

    Authors: Gabriel Huang, Issam Laradji, David Vazquez, Simon Lacoste-Julien, Pau Rodriguez

    Abstract: Labeling data is often expensive and time-consuming, especially for tasks such as object detection and instance segmentation, which require dense labeling of the image. While few-shot object detection is about training a model on novel (unseen) object classes with little data, it still requires prior training on many labeled examples of base (seen) classes. On the other hand, self-supervised metho…

    Submitted 23 August, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence. Awesome Few-Shot Object Detection (Leaderboard) at https://github.com/gabrielhuang/awesome-few-shot-object-detection

  43. A Deep Learning Localization Method for Measuring Abdominal Muscle Dimensions in Ultrasound Images

    Authors: Alzayat Saleh, Issam H. Laradji, Corey Lammie, David Vazquez, Carol A Flavell, Mostafa Rahimi Azghadi

    Abstract: Health professionals extensively use Two-Dimensional (2D) Ultrasound (US) videos and images to visualize and measure internal organs for various purposes including evaluation of muscle architectural changes. US images can be used to measure abdominal muscle dimensions for the diagnosis and creation of customized treatment plans for patients with Low Back Pain (LBP); however, they are difficult t…

    Submitted 30 September, 2021; originally announced September 2021.

    Comments: 9 pages, 8 figures, 1 table, Accepted for Publication in the IEEE Journal of Biomedical and Health Informatics (J-BHI) 25-May-2021

  44. arXiv:2108.09593  [pdf, other]

    cs.CV

    SSR: Semi-supervised Soft Rasterizer for single-view 2D to 3D Reconstruction

    Authors: Issam Laradji, Pau Rodríguez, David Vazquez, Derek Nowrouzezahrai

    Abstract: Recent work has made significant progress in learning object meshes with weak supervision. Soft Rasterization methods have achieved accurate 3D reconstruction from 2D images with viewpoint supervision only. In this work, we further reduce the labeling effort by allowing such 3D reconstruction methods to leverage unlabeled images. In order to obtain the viewpoints for these unlabeled images, we propos…

    Submitted 21 August, 2021; originally announced August 2021.

  45. arXiv:2104.00442  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    Touch-based Curiosity for Sparse-Reward Tasks

    Authors: Sai Rajeswar, Cyril Ibrahim, Nitin Surya, Florian Golemo, David Vazquez, Aaron Courville, Pedro O. Pinheiro

    Abstract: Robots in many real-world settings have access to force/torque sensors in their gripper and tactile sensing is often necessary in tasks that involve contact-rich motion. In this work, we leverage surprise from mismatches in touch feedback to guide exploration in hard sparse-reward reinforcement learning tasks. Our approach, Touch-based Curiosity (ToC), learns what visible objects interactions are…

    Submitted 26 June, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

  46. arXiv:2103.16607  [pdf, other]

    cs.CV

    Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data

    Authors: Oscar Mañas, Alexandre Lacoste, Xavier Giro-i-Nieto, David Vazquez, Pau Rodriguez

    Abstract: Remote sensing and automatic earth monitoring are key to solving global-scale challenges such as disaster prevention, land use monitoring, or tackling climate change. Although there exist vast amounts of remote sensing data, most of it remains unlabeled and thus inaccessible for supervised learning algorithms. Transfer learning approaches can reduce the data requirements of deep learning algorithms.…

    Submitted 3 May, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

  47. arXiv:2103.10226  [pdf, other]

    cs.LG cs.CV

    Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations

    Authors: Pau Rodriguez, Massimo Caccia, Alexandre Lacoste, Lee Zamparo, Issam Laradji, Laurent Charlin, David Vazquez

    Abstract: Explainability for machine learning models has gained considerable attention within the research community given the importance of deploying more reliable machine-learning systems. In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction, providing details about the model's decision-making. Current methods tend to generate…

    Submitted 11 November, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

    Comments: ICCV 2021

  48. arXiv:2102.09557  [pdf, other]

    cs.LG

    Knowledge Hypergraph Embedding Meets Relational Algebra

    Authors: Bahare Fatemi, Perouz Taslakian, David Vazquez, David Poole

    Abstract: Embedding-based methods for reasoning in knowledge hypergraphs learn a representation for each entity and relation. Current methods do not capture the procedural rules underlying the relations in the graph. We propose a simple embedding-based model called ReAlE that performs link prediction in knowledge hypergraphs (generalized knowledge graphs) and can represent high-level abstractions in terms o…

    Submitted 18 February, 2021; originally announced February 2021.

  49. arXiv:2011.07369  [pdf, other]

    cs.CV

    Counting Cows: Tracking Illegal Cattle Ranching From High-Resolution Satellite Imagery

    Authors: Issam Laradji, Pau Rodriguez, Freddie Kalaitzis, David Vazquez, Ross Young, Ed Davey, Alexandre Lacoste

    Abstract: Cattle farming is responsible for 8.8% of greenhouse gas emissions worldwide. In addition to the methane emitted due to their digestive process, the growing need for grazing areas is an important driver of deforestation. While some regulations are in place for preserving the Amazon against deforestation, these are being flouted in various ways, hence the need to scale and automate the monitoring…

    Submitted 14 November, 2020; originally announced November 2020.

  50. arXiv:2011.03149  [pdf, other]

    cs.CV

    Affinity LCFCN: Learning to Segment Fish with Weak Supervision

    Authors: Issam Laradji, Alzayat Saleh, Pau Rodriguez, Derek Nowrouzezahrai, Mostafa Rahimi Azghadi, David Vazquez

    Abstract: Aquaculture industries rely on the availability of accurate fish body measurements, e.g., length, width and mass. Manual methods that rely on physical tools like rulers are time and labour intensive. Leading automatic approaches rely on fully-supervised segmentation models to acquire these measurements but these require collecting per-pixel labels -- also time consuming and laborious: i.e., it can…

    Submitted 5 November, 2020; originally announced November 2020.