Showing 1–11 of 11 results for author: Barriere, V

  1. arXiv:2505.18903

    cs.CL

    StandUp4AI: A New Multilingual Dataset for Humor Detection in Stand-up Comedy Videos

    Authors: Valentin Barriere, Nahuel Gomez, Leo Hemamou, Sofia Callejas, Brian Ravenet

    Abstract: Aiming towards improving current computational models of humor detection, we propose a new multimodal dataset of stand-up comedies, in seven languages: English, French, Spanish, Italian, Portuguese, Hungarian and Czech. Our dataset of more than 330 hours, is at the time of writing the biggest available for this type of task, and the most diverse. The whole dataset is automatically annotated in lau…

    Submitted 24 May, 2025; originally announced May 2025.

  2. arXiv:2411.15051

    cs.CL cs.CV cs.CY cs.LG

    Fantastic Biases (What are They) and Where to Find Them

    Authors: Valentin Barriere

    Abstract: Deep Learning models tend to learn correlations of patterns on huge datasets. The bigger these systems are, the more complex are the phenomena they can detect, and the more data they need for this. The use of Artificial Intelligence (AI) is becoming increasingly ubiquitous in our society, and its impact is growing everyday. The promises it holds strongly depend on their fair and universal use, suc…

    Submitted 22 November, 2024; originally announced November 2024.

    Comments: Publication in Spanish in the Journal Bits de Ciencias: https://www.dcc.uchile.cl/media/bits/pdfs/bits26.2-sesgos-fantasticos.pdf

    Journal ref: Bits de Ciencias 26 (2024), 02-13

  3. arXiv:2407.01834

    cs.CL

    A Study of Nationality Bias in Names and Perplexity using Off-the-Shelf Affect-related Tweet Classifiers

    Authors: Valentin Barriere, Sebastian Cifuentes

    Abstract: In this paper, we apply a method to quantify biases associated with named entities from various countries. We create counterfactual examples with small perturbations on target-domain data instead of relying on templates or specific datasets for bias detection. On widely used classifiers for subjectivity analysis, including sentiment, emotion, hate speech, and offensive text using Twitter data, our…

    Submitted 23 November, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: updated EMNLP camera ready version

    Journal ref: (2024), Proceedings of EMNLP, 569-579

  4. arXiv:2402.05349

    cs.CV

    Scrapping The Web For Early Wildfire Detection: A New Annotated Dataset of Images and Videos of Smoke Plumes In-the-wild

    Authors: Mateo Lostanlen, Nicolas Isla, Jose Guillen, Felix Veith, Cristian Buc, Valentin Barriere

    Abstract: Early wildfire detection is of the utmost importance to enable rapid response efforts, and thus minimize the negative impacts of wildfire spreads. To this end, we present PyroNear-2024, a new dataset composed of both images and videos, allowing for the training and evaluation of smoke plume detection models, including sequential models. The data is sourced from: (i) web-scraped videos of…

    Submitted 22 November, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: Preprint of ongoing work

  5. Deep Natural Language Feature Learning for Interpretable Prediction

    Authors: Felipe Urrutia, Cristian Buc, Valentin Barriere

    Abstract: We propose a general method to break down a main complex task into a set of intermediary easier sub-tasks, which are formulated in natural language as binary questions related to the final target task. Our method allows for representing each example by a vector consisting of the answers to these questions. We call this representation Natural Language Learned Features (NLLF). NLLF is generated by a…

    Submitted 9 November, 2023; originally announced November 2023.

  6. arXiv:2309.15991

    cs.CV cs.CL

    Targeted Image Data Augmentation Increases Basic Skills Captioning Robustness

    Authors: Valentin Barriere, Felipe del Rio, Andres Carvallo De Ferari, Carlos Aspillaga, Eugenio Herrera-Berg, Cristian Buc Calderon

    Abstract: Artificial neural networks typically struggle in generalizing to out-of-context examples. One reason for this limitation is caused by having datasets that incorporate only partial information regarding the potential correlational structure of the world. In this work, we propose TIDA (Targeted Image-editing Data Augmentation), a targeted data augmentation method focused on improving models' human-l…

    Submitted 17 November, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

  7. arXiv:2305.12011

    cs.CV eess.IV

    Boosting Crop Classification by Hierarchically Fusing Satellite, Rotational, and Contextual Data

    Authors: Valentin Barriere, Martin Claverie, Maja Schneider, Guido Lemoine, Raphaël d'Andrimont

    Abstract: Accurate in-season crop type classification is crucial for the crop production estimation and monitoring of agricultural parcels. However, the complexity of the plant growth patterns and their spatio-temporal variability present significant challenges. While current deep learning-based methods show promise in crop type classification from single- and multi-modal time series, most existing methods…

    Submitted 7 November, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: submitted to Remote Sensing of Environment, special issue Deep Learning for Time Series. Version 2

  8. arXiv:2208.10838

    cs.CV cs.CL

    Multimodal Crop Type Classification Fusing Multi-Spectral Satellite Time Series with Farmers Crop Rotations and Local Crop Distribution

    Authors: Valentin Barriere, Martin Claverie

    Abstract: Accurate, detailed, and timely crop type mapping is a very valuable information for the institutions in order to create more accurate policies according to the needs of the citizens. In the last decade, the amount of available data dramatically increased, whether it can come from Remote Sensing (using Copernicus Sentinel-2 data) or directly from the farmers (providing in-situ crop information thro…

    Submitted 23 August, 2022; originally announced August 2022.

    Comments: accepted to CECEO22@IJCAI

  9. arXiv:2010.03486

    cs.CL

    Improving Sentiment Analysis over non-English Tweets using Multilingual Transformers and Automatic Translation for Data-Augmentation

    Authors: Valentin Barriere, Alexandra Balahur

    Abstract: Tweets are specific text data when compared to general text. Although sentiment analysis over tweets has become very popular in the last decade for English, it is still difficult to find huge annotated corpora for non-English languages. The recent rise of the transformer models in Natural Language Processing allows to achieve unparalleled performances in many tasks, but these models need a consequ…

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: Accepted to COLING2020

  10. arXiv:1909.03453

    cs.CL

    May I Check Again? -- A simple but efficient way to generate and use contextual dictionaries for Named Entity Recognition. Application to French Legal Texts

    Authors: Valentin Barriere, Amaury Fouret

    Abstract: In this paper we present a new method to learn a model robust to typos for a Named Entity Recognition task. Our improvement over existing methods helps the model to take into account the context of the sentence inside a court decision in order to recognize an entity with a typo. We used state-of-the-art models and enriched the last layer of the neural network with high-level information linked wit…

    Submitted 8 September, 2019; originally announced September 2019.

    Comments: accepted at NoDaLiDa'19, Turku, Finland

  11. arXiv:1806.07787

    cs.CL

    Opinion Dynamics Modeling for Movie Review Transcripts Classification with Hidden Conditional Random Fields

    Authors: Valentin Barriere, Chloé Clavel, Slim Essid

    Abstract: In this paper, the main goal is to detect a movie reviewer's opinion using hidden conditional random fields. This model allows us to capture the dynamics of the reviewer's opinion in the transcripts of long unsegmented audio reviews that are analyzed by our system. High level linguistic features are computed at the level of inter-pausal segments. The features include syntactic features, a statisti…

    Submitted 20 June, 2018; originally announced June 2018.

    Comments: Oral Interspeech 2017