Showing 1–50 of 74 results for author: Gallinari, P

  1. arXiv:2506.07731  [pdf, ps, other]

    cs.AI

    NeurIPS 2025 E2LM Competition : Early Training Evaluation of Language Models

    Authors: Mouadh Yagoubi, Yasser Dahou, Billel Mokeddem, Younes Belkada, Phuc H. Le-Khac, Basma El Amel Boussaha, Reda Alami, Jingwei Zuo, Damiano Marsili, Mugariya Farooq, Mounia Lalmas, Georgia Gkioxari, Patrick Gallinari, Philip Torr, Hakim Hacid

    Abstract: Existing benchmarks have proven effective for assessing the performance of fully trained large language models. However, we find striking differences in the early training stages of small models, where benchmarks often fail to provide meaningful or discriminative signals. To explore how these differences arise, this competition tackles the challenge of designing scientific knowledge evaluation tas…

    Submitted 9 June, 2025; originally announced June 2025.

  2. arXiv:2506.06158  [pdf, other]

    cs.LG

    ENMA: Tokenwise Autoregression for Generative Neural PDE Operators

    Authors: Armand Kassaï Koupaï, Lise Le Boudec, Louis Serrano, Patrick Gallinari

    Abstract: Solving time-dependent parametric partial differential equations (PDEs) remains a fundamental challenge for neural solvers, particularly when generalizing across a wide range of physical parameters and dynamics. When data is uncertain or incomplete, as is often the case, a natural approach is to turn to generative models. We introduce ENMA, a generative neural operator designed to model spatio-tempo…

    Submitted 6 June, 2025; originally announced June 2025.

  3. arXiv:2503.23236  [pdf, other]

    cs.LG physics.flu-dyn

    UP-dROM : Uncertainty-Aware and Parametrised dynamic Reduced-Order Model, application to unsteady flows

    Authors: Ismaël Zighed, Nicolas Thome, Patrick Gallinari, Taraneh Sayadi

    Abstract: Reduced order models (ROMs) play a critical role in fluid mechanics by providing low-cost predictions, making them an attractive tool for engineering applications. However, for ROMs to be widely applicable, they must not only generalise well across different regimes, but also provide a measure of confidence in their predictions. While recent data-driven approaches have begun to address nonlinear r…

    Submitted 3 May, 2025; v1 submitted 29 March, 2025; originally announced March 2025.

  4. arXiv:2502.13674  [pdf, other]

    cs.CL

    SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation

    Authors: Song Duong, Florian Le Bronnec, Alexandre Allauzen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, Patrick Gallinari

    Abstract: Large Language Models (LLMs), when used for conditional text generation, often produce hallucinations, i.e., information that is unfaithful or not grounded in the input context. This issue arises in typical conditional text generation tasks, such as text summarization and data-to-text generation, where the goal is to produce fluent text based on contextual input. When fine-tuned on specific domain…

    Submitted 19 February, 2025; originally announced February 2025.

    Comments: 10 pages, ICLR 2025 conference

  5. arXiv:2410.23889  [pdf, other]

    cs.LG cs.AI

    GEPS: Boosting Generalization in Parametric PDE Neural Solvers through Adaptive Conditioning

    Authors: Armand Kassaï Koupaï, Jorge Mifsut Benet, Yuan Yin, Jean-Noël Vittaut, Patrick Gallinari

    Abstract: Solving parametric partial differential equations (PDEs) presents significant challenges for data-driven methods due to the sensitivity of spatio-temporal dynamics to variations in PDE parameters. Machine learning approaches often struggle to capture this variability. To address this, data-driven approaches learn parametric PDEs by sampling a very large variety of trajectories with varying PDE par…

    Submitted 8 November, 2024; v1 submitted 31 October, 2024; originally announced October 2024.

    Journal ref: Conference on Neural Information Processing Systems (NeurIPS) 2024

  6. arXiv:2410.06820  [pdf, ps, other]

    cs.LG

    Learning a Neural Solver for Parametric PDE to Enhance Physics-Informed Methods

    Authors: Lise Le Boudec, Emmanuel de Bezenac, Louis Serrano, Ramon Daniel Regueiro-Espino, Yuan Yin, Patrick Gallinari

    Abstract: Physics-informed deep learning often faces optimization challenges due to the complexity of solving partial differential equations (PDEs), which involve exploring large solution spaces, require numerous iterations, and can lead to unstable training. These challenges arise particularly from the ill-conditioning of the optimization problem caused by the differential terms in the loss function. To ad…

    Submitted 2 June, 2025; v1 submitted 9 October, 2024; originally announced October 2024.

  7. arXiv:2410.05817  [pdf, other]

    cs.CL

    Probing Language Models on Their Knowledge Source

    Authors: Zineddine Tighidet, Andrea Mogini, Jiali Mei, Benjamin Piwowarski, Patrick Gallinari

    Abstract: Large Language Models (LLMs) often encounter conflicts between their learned, internal knowledge (parametric knowledge, PK) and external knowledge provided during inference (contextual knowledge, CK). Understanding how LLMs prioritize one knowledge source over the other remains a challenge. In this paper, we propose a novel probing framework to explore the mechanisms governing the selection between P…

    Submitted 9 November, 2024; v1 submitted 8 October, 2024; originally announced October 2024.

    Comments: Accepted at BlackBoxNLP@EMNLP2024

  8. arXiv:2410.03437  [pdf, ps, other]

    cs.LG

    Zebra: In-Context Generative Pretraining for Solving Parametric PDEs

    Authors: Louis Serrano, Armand Kassaï Koupaï, Thomas X Wang, Pierre Erbacher, Patrick Gallinari

    Abstract: Solving time-dependent parametric partial differential equations (PDEs) is challenging for data-driven methods, as these models must adapt to variations in parameters such as coefficients, forcing terms, and initial conditions. State-of-the-art neural surrogates perform adaptation through gradient-based optimization and meta-learning to implicitly encode the variety of dynamics from observations.…

    Submitted 26 June, 2025; v1 submitted 4 October, 2024; originally announced October 2024.

  9. arXiv:2409.12737  [pdf, other]

    cs.CL cs.AI

    MEXMA: Token-level objectives improve sentence representations

    Authors: João Maria Janeiro, Benjamin Piwowarski, Patrick Gallinari, Loïc Barrault

    Abstract: Current pre-trained cross-lingual sentence encoder approaches use sentence-level objectives only. This can lead to loss of information, especially for tokens, which then degrades the sentence representation. We propose MEXMA, a novel approach that integrates both sentence-level and token-level objectives. The sentence representation in one language is used to predict masked tokens in another lang…

    Submitted 19 September, 2024; originally announced September 2024.

    Comments: 11 pages, 12 figures

  10. arXiv:2407.01641  [pdf, other]

    physics.flu-dyn cs.CE cs.LG

    NeurIPS 2024 ML4CFD Competition: Harnessing Machine Learning for Computational Fluid Dynamics in Airfoil Design

    Authors: Mouadh Yagoubi, David Danan, Milad Leyli-abadi, Jean-Patrick Brunet, Jocelyn Ahmed Mazari, Florent Bonnet, Maroua Gmati, Asma Farjallah, Paola Cinnella, Patrick Gallinari, Marc Schoenauer

    Abstract: The integration of machine learning (ML) techniques for addressing intricate physics problems is increasingly recognized as a promising avenue for expediting simulations. However, assessing ML-derived physical models poses a significant challenge for their adoption within industrial contexts. This competition is designed to promote the development of innovative ML approaches for tackling physical…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2403.01623

  11. arXiv:2406.02176  [pdf, other]

    cs.LG

    AROMA: Preserving Spatial Structure for Latent PDE Modeling with Local Neural Fields

    Authors: Louis Serrano, Thomas X Wang, Etienne Le Naour, Jean-Noël Vittaut, Patrick Gallinari

    Abstract: We present AROMA (Attentive Reduced Order Model with Attention), a framework designed to enhance the modeling of partial differential equations (PDEs) using local neural fields. Our flexible encoder-decoder architecture can obtain smooth latent representations of spatial physical fields from a variety of data types, including irregular-grid inputs and point clouds. This versatility eliminates the…

    Submitted 21 October, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Journal ref: Conference on Neural Information Processing Systems (NeurIPS) 2024

  12. arXiv:2403.01623  [pdf, other]

    cs.LG cs.CE

    ML4PhySim : Machine Learning for Physical Simulations Challenge (The airfoil design)

    Authors: Mouadh Yagoubi, Milad Leyli-Abadi, David Danan, Jean-Patrick Brunet, Jocelyn Ahmed Mazari, Florent Bonnet, Asma Farjallah, Marc Schoenauer, Patrick Gallinari

    Abstract: The use of machine learning (ML) techniques to solve complex physical problems has been considered recently as a promising approach. However, the evaluation of such learned physical models remains an important issue for industrial use. The aim of this competition is to encourage the development of new ML techniques to solve physical problems using a unified evaluation framework proposed recently,…

    Submitted 3 March, 2024; originally announced March 2024.

  13. arXiv:2401.17919  [pdf, other]

    cs.CL cs.LG

    LOCOST: State-Space Models for Long Document Abstractive Summarization

    Authors: Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy F. Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, Patrick Gallinari

    Abstract: State-space models are a low-complexity alternative to transformers for encoding long sequences and capturing long-term dependencies. We propose LOCOST: an encoder-decoder architecture based on state-space models for conditional text generation with long context inputs. With a computational complexity of $O(L \log L)$, this architecture can handle significantly longer sequences than state-of-the-a…

    Submitted 25 March, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

    Comments: 9 pages, 5 figures, 7 tables, EACL 2024 conference

  14. arXiv:2309.17357  [pdf, other]

    cs.LG

    Module-wise Training of Neural Networks via the Minimizing Movement Scheme

    Authors: Skander Karkar, Ibrahim Ayed, Emmanuel de Bézenac, Patrick Gallinari

    Abstract: Greedy layer-wise or module-wise training of neural networks is compelling in constrained and on-device settings where memory is limited, as it circumvents a number of problems of end-to-end back-propagation. However, it suffers from a stagnation problem, whereby early layers overfit and deeper layers stop increasing the test accuracy after a certain depth. We propose to solve this issue by introd…

    Submitted 5 October, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: NeurIPS 2023. arXiv admin note: text overlap with arXiv:2210.00949

  15. arXiv:2307.13538  [pdf, other]

    cs.LG physics.flu-dyn

    INFINITY: Neural Field Modeling for Reynolds-Averaged Navier-Stokes Equations

    Authors: Louis Serrano, Leon Migus, Yuan Yin, Jocelyn Ahmed Mazari, Patrick Gallinari

    Abstract: For numerical design, the development of efficient and accurate surrogate models is paramount. They allow us to approximate complex physical phenomena, thereby reducing the computational burden of direct numerical simulations. We propose INFINITY, a deep learning model that utilizes implicit neural representations (INRs) to address this challenge. Our framework encodes geometric information and ph…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: ICML 2023 Workshop on Synergy of Scientific and Machine Learning Modeling

    Journal ref: ICML 2023 Workshop on Synergy of Scientific and Machine Learning Modeling

  16. arXiv:2306.07266  [pdf, other]

    cs.LG cs.AI

    Operator Learning with Neural Fields: Tackling PDEs on General Geometries

    Authors: Louis Serrano, Lise Le Boudec, Armand Kassaï Koupaï, Thomas X Wang, Yuan Yin, Jean-Noël Vittaut, Patrick Gallinari

    Abstract: Machine learning approaches for solving partial differential equations require learning mappings between function spaces. While convolutional or graph neural networks are constrained to discretized functions, neural operators present a promising milestone toward mapping functions directly. Despite impressive results, they still face challenges with respect to the domain geometry and typically rely…

    Submitted 30 November, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

    Journal ref: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  17. arXiv:2306.05880  [pdf, other]

    cs.LG cs.AI

    Time Series Continuous Modeling for Imputation and Forecasting with Implicit Neural Representations

    Authors: Etienne Le Naour, Louis Serrano, Léon Migus, Yuan Yin, Ghislain Agoua, Nicolas Baskiotis, Patrick Gallinari, Vincent Guigue

    Abstract: We introduce a novel modeling approach for time series imputation and forecasting, tailored to address the challenges often encountered in real-world data, such as irregular samples, missing data, or unaligned measurements from multiple sensors. Our method relies on a continuous-time-dependent model of the series' evolution dynamics. It leverages adaptations of conditional, implicit neural represe…

    Submitted 22 April, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

  18. arXiv:2306.04252  [pdf, other]

    cs.LG

    Adversarial Sample Detection Through Neural Network Transport Dynamics

    Authors: Skander Karkar, Patrick Gallinari, Alain Rakotomamonjy

    Abstract: We propose a detector of adversarial samples that is based on the view of neural networks as discrete dynamic systems. The detector tells clean inputs from abnormal ones by comparing the discrete vector fields they follow through the layers. We also show that regularizing this vector field during training makes the network more regular on the data distribution's support, thus making the activation…

    Submitted 8 June, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: ECML PKDD 2023

  19. arXiv:2305.17155  [pdf, other]

    cs.LG cs.AI math.NA

    Stability of implicit neural networks for long-term forecasting in dynamical systems

    Authors: Leon Migus, Julien Salomon, Patrick Gallinari

    Abstract: Forecasting physical signals over a long time range is among the most challenging tasks in Partial Differential Equations (PDEs) research. To circumvent limitations of traditional solvers, many different Deep Learning methods have been proposed. They are all based on auto-regressive methods and exhibit stability issues. Drawing inspiration from the stability property of implicit numerical schemes, we…

    Submitted 8 June, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: ICLR 2023 Workshop on Physics for Machine Learning

  20. arXiv:2302.11269  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Learning from Multiple Sources for Data-to-Text and Text-to-Data

    Authors: Song Duong, Alberto Lumbreras, Mike Gartrell, Patrick Gallinari

    Abstract: Data-to-text (D2T) and text-to-data (T2D) are dual tasks that convert structured data, such as graphs or tables, into fluent text, and vice versa. These tasks are usually handled separately and use corpora extracted from a single source. Current systems leverage pre-trained language models fine-tuned on D2T or T2D tasks. This approach has two main limitations: first, a separate system has to be tun…

    Submitted 22 February, 2023; originally announced February 2023.

    Comments: AISTATS 2023

  21. arXiv:2302.03033  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Exemplars and Counterexemplars Explanations for Image Classifiers, Targeting Skin Lesion Labeling

    Authors: Carlo Metta, Riccardo Guidotti, Yuan Yin, Patrick Gallinari, Salvatore Rinzivillo

    Abstract: Explainable AI consists in developing mechanisms allowing for an interaction between decision systems and humans by making the decisions of the former understandable. This is particularly important in sensitive contexts like in the medical domain. We propose a use case study, for skin lesion diagnosis, illustrating how it is possible to provide the practitioner with explanations on the decisions…

    Submitted 18 January, 2023; originally announced February 2023.

    Comments: arXiv admin note: text overlap with arXiv:2111.11863

    Journal ref: 2021 IEEE Symposium on Computers and Communications (ISCC)

  22. arXiv:2212.07564  [pdf, other]

    cs.LG cs.CV physics.comp-ph physics.flu-dyn

    AirfRANS: High Fidelity Computational Fluid Dynamics Dataset for Approximating Reynolds-Averaged Navier-Stokes Solutions

    Authors: Florent Bonnet, Ahmed Jocelyn Mazari, Paola Cinnella, Patrick Gallinari

    Abstract: Surrogate models are necessary to optimize meaningful quantities in physical dynamics as their recursive numerical resolutions are often prohibitively expensive. It is mainly the case for fluid dynamics and the resolution of Navier-Stokes equations. However, despite the fast-growing field of data-driven models for physical systems, reference datasets representing real-world phenomena are lacking.…

    Submitted 1 June, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Journal ref: 36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks

  23. arXiv:2211.08253  [pdf, other]

    cs.LG cs.AI cs.CV

    HMOE: Hypernetwork-based Mixture of Experts for Domain Generalization

    Authors: Jingang Qu, Thibault Faney, Ze Wang, Patrick Gallinari, Soleiman Yousef, Jean-Charles de Hemptinne

    Abstract: Due to domain shifts, machine learning systems typically struggle to generalize well to new domains that differ from those of training data, which is what domain generalization (DG) aims to address. Although a variety of DG methods have been proposed, most of them fall short in interpretability and require domain labels, which are not available in many real-world scenarios. This paper presents a n…

    Submitted 14 November, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

  24. arXiv:2210.00949  [pdf, other]

    cs.LG

    Block-wise Training of Residual Networks via the Minimizing Movement Scheme

    Authors: Skander Karkar, Ibrahim Ayed, Emmanuel de Bézenac, Patrick Gallinari

    Abstract: End-to-end backpropagation has a few shortcomings: it requires loading the entire model during training, which can be impossible in constrained settings, and suffers from three locking problems (forward locking, update locking and backward locking), which prohibit training the layers in parallel. Solving layer-wise optimization problems can address these problems and has been used in on-device tra…

    Submitted 6 June, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 1st International Workshop on Practical Deep Learning in the Wild at AAAI 2022

  25. arXiv:2209.14855  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Continuous PDE Dynamics Forecasting with Implicit Neural Representations

    Authors: Yuan Yin, Matthieu Kirchmeyer, Jean-Yves Franceschi, Alain Rakotomamonjy, Patrick Gallinari

    Abstract: Effective data-driven PDE forecasting methods often rely on fixed spatial and/or temporal discretizations. This raises limitations in real-world applications like weather prediction where flexible extrapolation at arbitrary spatiotemporal locations is required. We address this problem by introducing a new data-driven approach, DINo, that models a PDE's flow with continuous-time dynamics of spati…

    Submitted 15 February, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

    Journal ref: The Eleventh International Conference on Learning Representations, International Conference on Representation Learning, May 2023, Kigali, Rwanda

  26. arXiv:2206.14709  [pdf, other]

    cs.LG cs.CV math.NA

    An extensible Benchmarking Graph-Mesh dataset for studying Steady-State Incompressible Navier-Stokes Equations

    Authors: Florent Bonnet, Jocelyn Ahmed Mazari, Thibaut Munzer, Pierre Yser, Patrick Gallinari

    Abstract: Recent progress in Geometric Deep Learning (GDL) has shown its potential to provide powerful data-driven models. This gives momentum to explore new methods for learning physical systems governed by Partial Differential Equations (PDEs) from Graph-Mesh data. However, despite the efforts and recent achievements, several research directions remain unexplored and progress is still far fr…

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: ICLR 2022 Workshop on Geometrical and Topological Representation Learning

    Journal ref: ICLR 2022 Workshop on Geometrical and Topological Representation Learning

  27. arXiv:2206.14687  [pdf, other]

    cs.LG cs.CV

    Multi-scale Physical Representations for Approximating PDE Solutions with Graph Neural Operators

    Authors: Léon Migus, Yuan Yin, Jocelyn Ahmed Mazari, Patrick Gallinari

    Abstract: Representing physical signals at different scales is among the most challenging problems in engineering. Several multi-scale modeling tools have been developed to describe physical systems governed by Partial Differential Equations (PDEs). These tools are at the crossroads of principled physical models and numerical schema. Recently, data-driven models have been introduced to speed up the ap…

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: ICLR 2022 Workshop on Geometrical and Topological Representation Learning

    Journal ref: ICLR 2022 Workshop on Geometrical and Topological Representation Learning

  28. arXiv:2205.09739  [pdf, other]

    cs.CV cs.AI cs.LG

    Diverse Weight Averaging for Out-of-Distribution Generalization

    Authors: Alexandre Ramé, Matthieu Kirchmeyer, Thibaud Rahier, Alain Rakotomamonjy, Patrick Gallinari, Matthieu Cord

    Abstract: Standard neural networks struggle to generalize under distribution shifts in computer vision. Fortunately, combining multiple networks can consistently improve out-of-distribution generalization. In particular, weight averaging (WA) strategies were shown to perform best on the competitive DomainBed benchmark; they directly average the weights of multiple networks despite their nonlinearities. In t…

    Submitted 27 January, 2023; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: 36 pages, 16 figures, 15 tables

  29. arXiv:2205.03090  [pdf, other]

    physics.chem-ph cs.AI cs.LG

    PTFlash : A deep learning framework for isothermal two-phase equilibrium calculations

    Authors: Jingang Qu, Thibault Faney, Jean-Charles de Hemptinne, Soleiman Yousef, Patrick Gallinari

    Abstract: Phase equilibrium calculations are an essential part of numerical simulations of multi-component multi-phase flow in porous media, accounting for the largest share of the computational time. In this work, we introduce a GPU-enabled, fast, and parallel framework, PTFlash, that vectorizes algorithms required for isothermal two-phase flash calculations using PyTorch, and can facilitate a wide range of…

    Submitted 19 May, 2022; v1 submitted 6 May, 2022; originally announced May 2022.

  30. arXiv:2202.01889  [pdf, other]

    cs.LG cs.AI stat.ML

    Generalizing to New Physical Systems via Context-Informed Dynamics Model

    Authors: Matthieu Kirchmeyer, Yuan Yin, Jérémie Donà, Nicolas Baskiotis, Alain Rakotomamonjy, Patrick Gallinari

    Abstract: Data-driven approaches to modeling physical systems fail to generalize to unseen systems that share the same general dynamics with the learning domain, but correspond to different physical contexts. We propose a new framework for this key problem, context-informed dynamics adaptation (CoDA), which takes into account the distributional shift across systems for fast and efficient adaptation to new d…

    Submitted 24 June, 2022; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: Accepted at ICML 2022

  31. arXiv:2111.11863  [pdf, other]

    cs.CV cs.AI cs.LG

    Explainable Deep Image Classifiers for Skin Lesion Diagnosis

    Authors: Carlo Metta, Andrea Beretta, Riccardo Guidotti, Yuan Yin, Patrick Gallinari, Salvatore Rinzivillo, Fosca Giannotti

    Abstract: A key issue in critical contexts such as medical diagnosis is the interpretability of the deep learning models adopted in decision-making systems. Research in eXplainable Artificial Intelligence (XAI) is trying to solve this issue. However, often XAI approaches are only tested on generalist classifiers and do not represent realistic problems such as those of medical diagnosis. In this paper, we ana…

    Submitted 22 November, 2021; originally announced November 2021.

  32. arXiv:2110.15057  [pdf, other]

    cs.LG cs.AI

    Mapping conditional distributions for domain adaptation under generalized target shift

    Authors: Matthieu Kirchmeyer, Alain Rakotomamonjy, Emmanuel de Bezenac, Patrick Gallinari

    Abstract: We consider the problem of unsupervised domain adaptation (UDA) between a source and a target domain under conditional and label shift, a.k.a. Generalized Target Shift (GeTarS). Unlike simpler UDA settings, few works have addressed this challenging problem. Recent approaches learn domain-invariant representations, yet they have practical limitations and rely on strong assumptions that may not hold i…

    Submitted 18 March, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

  33. arXiv:2109.12008  [pdf, other]

    cs.CL cs.LG

    Separating Retention from Extraction in the Evaluation of End-to-end Relation Extraction

    Authors: Bruno Taillé, Vincent Guigue, Geoffrey Scoutheeten, Patrick Gallinari

    Abstract: State-of-the-art NLP models can adopt shallow heuristics that limit their generalization capability (McCoy et al., 2019). Such heuristics include lexical overlap with the training set in Named-Entity Recognition (Taillé et al., 2020) and Event or Type heuristics in Relation Extraction (Rosenman et al., 2020). In the more realistic end-to-end RE setting, we can expect yet another heuristic: the mer…

    Submitted 24 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021

  34. arXiv:2109.09505  [pdf, other]

    cs.LG

    Unsupervised domain adaptation with non-stochastic missing data

    Authors: Matthieu Kirchmeyer, Patrick Gallinari, Alain Rakotomamonjy, Amin Mantrach

    Abstract: We consider unsupervised domain adaptation (UDA) for classification problems in the presence of missing data in the unlabelled target domain. More precisely, motivated by practical applications, we analyze situations where distribution shift exists between domains and where some components are systematically absent on the target domain without available supervision for imputing the missing target…

    Submitted 16 September, 2021; originally announced September 2021.

  35. arXiv:2107.10030  [pdf, other]

    cs.LG cs.NE stat.ML

    Differentiable Feature Selection, a Reparameterization Approach

    Authors: Jérémie Dona, Patrick Gallinari

    Abstract: We consider the task of feature selection for reconstruction, which consists in choosing a small subset of features from which whole data instances can be reconstructed. This is of particular importance in several contexts involving, for example, costly physical measurements, sensor placement or information compression. To break the intrinsic combinatorial nature of this problem, we formulate the tas…

    Submitted 21 July, 2021; originally announced July 2021.

    Journal ref: European Conference (ECML-PKDD), In press

  36. arXiv:2106.05566  [pdf, other]

    cs.LG cs.NE stat.ML

    A Neural Tangent Kernel Perspective of GANs

    Authors: Jean-Yves Franceschi, Emmanuel de Bézenac, Ibrahim Ayed, Mickaël Chen, Sylvain Lamprier, Patrick Gallinari

    Abstract: We propose a novel theoretical framework of analysis for Generative Adversarial Networks (GANs). We reveal a fundamental flaw of previous analyses which, by incorrectly modeling GANs' training scheme, are subject to ill-defined discriminator gradients. We overcome this issue which impedes a principled study of GAN training, solving it within our framework by taking into account the discriminator's…

    Submitted 7 November, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

    Journal ref: 39th International Conference on Machine Learning, International Machine Learning Society, Jul 2022, Baltimore, MD, United States. pp.6660-6704

  37. arXiv:2106.04546  [pdf, other]

    cs.LG cs.AI stat.ML

    LEADS: Learning Dynamical Systems that Generalize Across Environments

    Authors: Yuan Yin, Ibrahim Ayed, Emmanuel de Bézenac, Nicolas Baskiotis, Patrick Gallinari

    Abstract: When modeling dynamical systems from real-world data samples, the distribution of data often changes according to the environment in which they are captured, and the dynamics of the system itself vary from one environment to another. Generalizing across environments thus challenges the conventional frameworks. The classical settings suggest either considering data as i.i.d. and learning a single m…

    Submitted 14 February, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: Published at NeurIPS 2021

  38. arXiv:2104.07555  [pdf, other]

    cs.CL

    Data-QuestEval: A Referenceless Metric for Data-to-Text Semantic Evaluation

    Authors: Clément Rebuffel, Thomas Scialom, Laure Soulier, Benjamin Piwowarski, Sylvain Lamprier, Jacopo Staiano, Geoffrey Scoutheeten, Patrick Gallinari

    Abstract: QuestEval is a reference-less metric used in text-to-text tasks, that compares the generated summaries directly to the source text, by automatically asking and answering questions. Its adaptation to Data-to-Text tasks is not straightforward, as it requires multimodal Question Generation and Answering systems on the considered tasks, which are seldom available. To this purpose, we propose a method…

    Submitted 7 September, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: Accepted at EMNLP 2021

  39. arXiv:2103.12693  [pdf, other]

    cs.CL

    QuestEval: Summarization Asks for Fact-based Evaluation

    Authors: Thomas Scialom, Paul-Alexis Dray, Patrick Gallinari, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano, Alex Wang

    Abstract: Summarization evaluation remains an open research problem: current metrics such as ROUGE are known to be limited and to correlate poorly with human judgments. To alleviate this issue, recent work has proposed evaluation metrics which rely on question answering models to assess whether a summary contains all the relevant information in its source document. Though promising, the proposed approaches…

    Submitted 9 April, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: project page: https://github.com/recitalAI/QuestEval

  40. arXiv:2102.02810  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Controlling Hallucinations at Word Level in Data-to-Text Generation

    Authors: Clément Rebuffel, Marco Roberti, Laure Soulier, Geoffrey Scoutheeten, Rossella Cancelliere, Patrick Gallinari

    Abstract: Data-to-Text Generation (DTG) is a subfield of Natural Language Generation aiming at transcribing structured data in natural language descriptions. The field has been recently boosted by the use of neural-based generators which exhibit on one side great syntactic skills without the need of hand-crafted pipelines; on the other side, the quality of the generated text reflects the quality of the trai…

    Submitted 9 July, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: 20 pages, 6 figures, 5 tables (excluding Appendix). Source code: https://github.com/KaijuML/dtt-multi-branch

    MSC Class: 68T50 (Primary); 68T07 (Secondary); 68T05

    ACM Class: I.2.6; I.2.7

  41. arXiv:2011.12423  [pdf, ps, other]

    cs.LG cs.CR cs.CV

    Stochastic sparse adversarial attacks

    Authors: Manon Césaire, Lucas Schott, Hatem Hajri, Sylvain Lamprier, Patrick Gallinari

    Abstract: This paper introduces stochastic sparse adversarial attacks (SSAA), standing as simple, fast and purely noise-based targeted and untargeted attacks of neural network classifiers (NNC). SSAA offer new examples of sparse (or $L_0$) attacks for which only few methods have been proposed previously. These attacks are devised by exploiting a small-time expansion idea widely used for Markov processes. Ex…

    Submitted 19 February, 2022; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: Final version published at the ICTAI 2021 conference with a best student paper award. Codes are available through the link: https://github.com/hhajri/stochastic-sparse-adv-attacks

  42. arXiv:2011.06850  [pdf, other]

    cs.CV cs.AI

    Transductive Zero-Shot Learning using Cross-Modal CycleGAN

    Authors: Patrick Bordes, Eloi Zablocki, Benjamin Piwowarski, Patrick Gallinari

    Abstract: In Computer Vision, Zero-Shot Learning (ZSL) aims at classifying unseen classes -- classes for which no matching training image exists. Most of ZSL works learn a cross-modal mapping between images and class labels for seen classes. However, the data distribution of seen and unseen classes might differ, causing a domain shift problem. Following this observation, transductive ZSL (T-ZSL) assumes tha…

    Submitted 13 November, 2020; originally announced November 2020.

  43. arXiv:2010.10866  [pdf, other]

    cs.CL

    PARENTing via Model-Agnostic Reinforcement Learning to Correct Pathological Behaviors in Data-to-Text Generation

    Authors: Clément Rebuffel, Laure Soulier, Geoffrey Scoutheeten, Patrick Gallinari

    Abstract: In language generation models conditioned by structured data, the classical training via maximum likelihood almost always leads models to pick up on dataset divergence (i.e., hallucinations or omissions), and to incorporate them erroneously in their own generations at inference. In this work, we build on top of previous Reinforcement Learning based approaches and show that a model-agnostic framewor…

    Submitted 22 October, 2020; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: Accepted at the 13th International Conference on Natural Language Generation (INLG 2020)

  44. arXiv:2010.04456  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting

    Authors: Yuan Yin, Vincent Le Guen, Jérémie Dona, Emmanuel de Bézenac, Ibrahim Ayed, Nicolas Thome, Patrick Gallinari

    Abstract: Forecasting complex dynamical phenomena in settings where only partial knowledge of their dynamics is available is a prevalent problem across various scientific fields. While purely data-driven approaches are arguably insufficient in this context, standard physical modeling based approaches tend to be over-simplistic, inducing non-negligible errors. In this work, we introduce the APHYNITY framewor…

    Submitted 10 May, 2022; v1 submitted 9 October, 2020; originally announced October 2020.

    Comments: Accepted at ICLR 2021 (Oral)

    Journal ref: J. Stat. Mech. (2021) 124012

  45. arXiv:2009.10684  [pdf, other]

    cs.CL cs.AI cs.LG

    Let's Stop Incorrect Comparisons in End-to-end Relation Extraction!

    Authors: Bruno Taillé, Vincent Guigue, Geoffrey Scoutheeten, Patrick Gallinari

    Abstract: Despite efforts to distinguish three different evaluation setups (Bekoulis et al., 2018), numerous end-to-end Relation Extraction (RE) articles present unreliable performance comparison to previous work. In this paper, we first identify several patterns of invalid comparisons in published papers and describe them to avoid their propagation. We then propose a small empirical study to quantify the i…

    Submitted 9 August, 2021; v1 submitted 22 September, 2020; originally announced September 2020.

    Comments: Accepted at EMNLP 2020

  46. arXiv:2009.08372  [pdf, other]

    stat.ML cs.LG

    A Principle of Least Action for the Training of Neural Networks

    Authors: Skander Karkar, Ibrahim Ayed, Emmanuel de Bézenac, Patrick Gallinari

    Abstract: Neural networks have been achieving high generalization performance on many tasks despite being highly over-parameterized. Since classical statistical learning theory struggles to explain this behavior, much effort has recently been focused on uncovering the mechanisms behind it, in the hope of developing a more adequate theoretical framework and having a better control over the trained models. In…

    Submitted 15 June, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

    Comments: ECML PKDD 2020

  47. arXiv:2008.01352  [pdf, other]

    cs.LG cs.NE stat.ML

    PDE-Driven Spatiotemporal Disentanglement

    Authors: Jérémie Donà, Jean-Yves Franceschi, Sylvain Lamprier, Patrick Gallinari

    Abstract: A recent line of work in the machine learning community addresses the problem of predicting high-dimensional spatiotemporal phenomena by leveraging specific tools from the differential equations theory. Following this direction, we propose in this article a novel and general paradigm for this task based on a resolution method for partial differential equations: the separation of variables. This in…

    Submitted 23 March, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

    Journal ref: The Ninth International Conference on Learning Representations, International Conference on Representation Learning, May 2021, Vienna, Austria

  48. arXiv:2002.10832  [pdf, other]

    cs.CL cs.CV cs.LG

    What BERT Sees: Cross-Modal Transfer for Visual Question Generation

    Authors: Thomas Scialom, Patrick Bordes, Paul-Alexis Dray, Jacopo Staiano, Patrick Gallinari

    Abstract: Pre-trained language models have recently contributed to significant advances in NLP tasks. Recently, multi-modal versions of BERT have been developed, using heavy pre-training relying on vast corpora of aligned textual and image data, primarily applied to classification tasks such as VQA. In this paper, we are interested in evaluating the visual capabilities of BERT out-of-the-box, by avoiding pr…

    Submitted 16 December, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: INLG 2020

  49. arXiv:2002.09219  [pdf, other]

    cs.CV cs.LG stat.ML

    Stochastic Latent Residual Video Prediction

    Authors: Jean-Yves Franceschi, Edouard Delasalles, Mickaël Chen, Sylvain Lamprier, Patrick Gallinari

    Abstract: Designing video prediction models that account for the inherent uncertainty of the future is challenging. Most works in the literature are based on stochastic image-autoregressive recurrent networks, which raises several performance and applicability issues. An alternative is to use fully latent temporal models which untie frame synthesis and temporal dynamics. However, no such model for stochasti…

    Submitted 7 August, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

    Journal ref: Thirty-seventh International Conference on Machine Learning, International Machine Learning Society, Jul 2020, Vienna, Austria. pp.89--102

  50. arXiv:2002.02734  [pdf, other]

    cs.CL

    Incorporating Visual Semantics into Sentence Representations within a Grounded Space

    Authors: Patrick Bordes, Eloi Zablocki, Laure Soulier, Benjamin Piwowarski, Patrick Gallinari

    Abstract: Language grounding is an active field aiming at enriching textual representations with visual information. Generally, textual and visual elements are embedded in the same representation space, which implicitly assumes a one-to-one correspondence between modalities. This hypothesis does not hold when representing words, and becomes problematic when used to learn sentence representations --- the foc…

    Submitted 7 February, 2020; originally announced February 2020.