Showing 1–13 of 13 results for author: Clausel, M

  1. arXiv:2506.11478

    cs.CL

    ImmunoFOMO: Are Language Models missing what oncologists see?

    Authors: Aman Sinha, Bogdan-Valentin Popescu, Xavier Coubez, Marianne Clausel, Mathieu Constant

    Abstract: The capabilities of language models (LMs) have grown at a fast pace over the past decade, leading researchers in various disciplines, such as biomedical research, to increasingly explore the utility of LMs in their day-to-day applications. Domain-specific language models have already been in use for biomedical natural language processing (NLP) applications. Recently, however, the interest has grown towar…

    Submitted 13 June, 2025; originally announced June 2025.

  2. arXiv:2407.12626

    cs.CL

    Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification?

    Authors: Aman Sinha, Timothee Mickus, Marianne Clausel, Mathieu Constant, Xavier Coubez

    Abstract: The success of pretrained language models (PLMs) across a spate of use-cases has led to significant investment from the NLP community towards building domain-specific foundational models. On the other hand, in mission-critical settings such as biomedical applications, other aspects also factor in, chief of which is a model's ability to produce reasonable estimates of its own uncertainty. In the pre…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: BioNLP 2024

  3. arXiv:2406.14033

    stat.ML cs.LG

    Ensembles of Probabilistic Regression Trees

    Authors: Alexandre Seiller, Éric Gaussier, Emilie Devijver, Marianne Clausel, Sami Alkhoury

    Abstract: Tree-based ensemble methods such as random forests, gradient-boosted trees, and Bayesian additive regression trees have been successfully used for regression problems in many applications and research studies. In this paper, we study ensemble versions of probabilistic regression trees that provide smooth approximations of the objective function by assigning each observation to each region with respect…

    Submitted 20 June, 2024; originally announced June 2024.

  4. arXiv:2403.19358

    cs.CL

    Risk prediction of pathological gambling on social media

    Authors: Angelina Parfenova, Marianne Clausel

    Abstract: This paper addresses the problem of risk prediction on social media data, specifically focusing on the classification of Reddit users as having a pathological gambling disorder. To tackle this problem, this paper focuses on incorporating temporal and emotional features into the model. The preprocessing phase involves dealing with the time irregularity of posts by padding sequences. Two baseline ar…

    Submitted 28 March, 2024; originally announced March 2024.

  5. arXiv:2309.08698

    cs.AI cs.LG

    No Imputation Needed: A Switch Approach to Irregularly Sampled Time Series

    Authors: Rohit Agarwal, Aman Sinha, Ayan Vishwakarma, Xavier Coubez, Marianne Clausel, Mathieu Constant, Alexander Horsch, Dilip K. Prasad

    Abstract: Modeling irregularly-sampled time series (ISTS) is challenging because of missing values. Most existing methods focus on handling ISTS by converting irregularly sampled data into regularly sampled data via imputation. These models assume an underlying missing mechanism, which may lead to unwanted bias and sub-optimal performance. We present SLAN (Switch LSTM Aggregate Network), which utilizes a gr…

    Submitted 19 August, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

  6. arXiv:2204.09362

    cs.LG stat.AP stat.ML

    Wind power predictions from nowcasts to 4-hour forecasts: a learning approach with variable selection

    Authors: Dimitri Bouche, Rémi Flamary, Florence d'Alché-Buc, Riwal Plougonven, Marianne Clausel, Jordi Badosa, Philippe Drobinski

    Abstract: We study short-term prediction of wind speed and wind power (every 10 minutes up to 4 hours ahead). Accurate forecasts for these quantities are crucial to mitigate the negative effects of wind farms' intermittent production on energy systems and markets. We use machine learning to combine outputs from numerical weather prediction models with local observations. The former provide valuable informat…

    Submitted 13 December, 2022; v1 submitted 20 April, 2022; originally announced April 2022.

  7. arXiv:2202.13240

    cs.IR stat.ML

    Learning over No-Preferred and Preferred Sequence of Items for Robust Recommendation (Extended Abstract)

    Authors: Aleksandra Burashnikova, Yury Maximov, Marianne Clausel, Charlotte Laclau, Franck Iutzeler, Massih-Reza Amini

    Abstract: This paper is an extended version of [Burashnikova et al., 2021, arXiv: 2012.06910], where we proposed a theoretically supported sequential strategy for training a large-scale Recommender System (RS) over implicit feedback, mainly in the form of clicks. The proposed approach consists in minimizing pairwise ranking loss over blocks of consecutive items constituted by a sequence of non-clicked items…

    Submitted 26 February, 2022; originally announced February 2022.

    Comments: 7 pages, 2 tables; extended abstract accepted to IJCAI 2022. arXiv admin note: substantial text overlap with arXiv:2012.06910, arXiv:1902.08495

  8. arXiv:2112.02242

    cs.IR

    Recommender systems: when memory matters

    Authors: Aleksandra Burashnikova, Marianne Clausel, Massih-Reza Amini, Yury Maximov, Nicolas Dante

    Abstract: In this paper, we study the effect of long memory in the learnability of a sequential recommender system including users' implicit feedback. We propose an online algorithm, where model parameters are updated user per user over blocks of items constituted by a sequence of unclicked items followed by a clicked one. We illustrate through thorough empirical evaluations that filtering users with respec…

    Submitted 3 December, 2021; originally announced December 2021.

    Comments: Accepted to the 44th European Conference on Information Retrieval (ECIR), 2022. arXiv admin note: text overlap with arXiv:2012.06910

  9. arXiv:2111.15278

    cs.CL

    Bilingual Topic Models for Comparable Corpora

    Authors: Georgios Balikas, Massih-Reza Amini, Marianne Clausel

    Abstract: Probabilistic topic models like Latent Dirichlet Allocation (LDA) have been previously extended to the bilingual setting. A fundamental modeling assumption in several of these extensions is that the input corpora are in the form of document pairs whose constituent documents share a single topic distribution. However, this assumption is strong for comparable corpora that consist of documents themat…

    Submitted 30 November, 2021; originally announced November 2021.

    Comments: 32 pages, 2 figures

  10. arXiv:2012.06910

    cs.IR cs.LG

    Learning over no-Preferred and Preferred Sequence of items for Robust Recommendation

    Authors: Aleksandra Burashnikova, Marianne Clausel, Charlotte Laclau, Franck Iutzeler, Yury Maximov, Massih-Reza Amini

    Abstract: In this paper, we propose a theoretically founded sequential strategy for training large-scale Recommender Systems (RS) over implicit feedback, mainly in the form of clicks. The proposed approach consists in minimizing pairwise ranking loss over blocks of consecutive items constituted by a sequence of non-clicked items followed by a clicked one for each user. We present two variants of this strate…

    Submitted 12 December, 2020; originally announced December 2020.

    Comments: 21 pages, 9 figures. arXiv admin note: substantial text overlap with arXiv:1902.08495

  11. arXiv:2003.01432

    stat.ML cs.LG

    Nonlinear Functional Output Regression: a Dictionary Approach

    Authors: Dimitri Bouche, Marianne Clausel, François Roueff, Florence d'Alché-Buc

    Abstract: To address functional-output regression, we introduce projection learning (PL), a novel dictionary-based approach that learns to predict a function that is expanded on a dictionary while minimizing an empirical risk based on a functional loss. PL makes it possible to use non-orthogonal dictionaries and can then be combined with dictionary learning; it is thus much more flexible than expansion-base…

    Submitted 26 February, 2021; v1 submitted 3 March, 2020; originally announced March 2020.

  12. arXiv:1810.11698

    cs.LG stat.ML

    Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees

    Authors: Myriam Tami, Marianne Clausel, Emilie Devijver, Adrien Dulac, Eric Gaussier, Stefan Janaqi, Meriam Chebre

    Abstract: Tree-based ensemble methods, such as Random Forests and Gradient Boosted Trees, have been successfully used for regression in many applications and research studies. Furthermore, these methods have been extended in order to deal with uncertainty in the output variable, using for example a quantile loss in Random Forests (Meinshausen, 2006). To the best of our knowledge, no extension has been provided y…

    Submitted 18 November, 2018; v1 submitted 27 October, 2018; originally announced October 2018.

    Comments: 9 pages

  13. arXiv:1606.00253

    cs.CL cs.IR cs.LG

    On a Topic Model for Sentences

    Authors: Georgios Balikas, Massih-Reza Amini, Marianne Clausel

    Abstract: Probabilistic topic models are generative models that describe the content of documents by discovering the latent topics underlying them. However, the structure of the textual input, and for instance the grouping of words in coherent text spans such as sentences, contains much information which is generally lost with these models. In this paper, we propose sentenceLDA, an extension of LDA whose go…

    Submitted 1 June, 2016; originally announced June 2016.