-
Fused-Planes: Improving Planar Representations for Learning Large Sets of 3D Scenes
Authors:
Karim Kassab,
Antoine Schnepf,
Jean-Yves Franceschi,
Laurent Caraffa,
Flavian Vasile,
Jeremie Mary,
Andrew Comport,
Valérie Gouet-Brunet
Abstract:
To learn large sets of scenes, Tri-Planes are commonly employed for their planar structure that enables interoperability with image models, and thus diverse 3D applications. However, this advantage comes at the cost of resource efficiency, as Tri-Planes are not the most computationally efficient option. In this paper, we introduce Fused-Planes, a new planar architecture that improves the resource efficiency of Tri-Planes in the framework of learning large sets of scenes, which we call "multi-scene inverse graphics". To learn a large set of scenes, our method divides it into two subsets and operates as follows: (i) we train the first subset of scenes jointly with a compression model, (ii) we use that compression model to learn the remaining scenes. This compression model consists of a 3D-aware latent space in which Fused-Planes are learned, enabling a reduced rendering resolution, and shared structures across scenes that reduce scene representation complexity. Fused-Planes present competitive resource costs in multi-scene inverse graphics, while preserving the rendering quality of Tri-Planes and maintaining their widely favored planar structure. Our codebase is publicly available as open-source. Our project page can be found at https://fused-planes.github.io .
Submitted 31 January, 2025; v1 submitted 31 October, 2024;
originally announced October 2024.
-
Bringing NeRFs to the Latent Space: Inverse Graphics Autoencoder
Authors:
Antoine Schnepf,
Karim Kassab,
Jean-Yves Franceschi,
Laurent Caraffa,
Flavian Vasile,
Jeremie Mary,
Andrew Comport,
Valerie Gouet-Brunet
Abstract:
While pre-trained image autoencoders are increasingly utilized in computer vision, the application of inverse graphics in 2D latent spaces has been under-explored. Yet, besides reducing the training and rendering complexity, applying inverse graphics in the latent space enables valuable interoperability with other latent-based 2D methods. The major challenge is that inverse graphics cannot be directly applied to such image latent spaces because they lack an underlying 3D geometry. In this paper, we propose an Inverse Graphics Autoencoder (IG-AE) that specifically addresses this issue. To this end, we regularize an image autoencoder with 3D geometry by aligning its latent space with jointly trained latent 3D scenes. We utilize the trained IG-AE to bring NeRFs to the latent space with a latent NeRF training pipeline, which we implement in an open-source extension of the Nerfstudio framework, thereby unlocking latent scene learning for its supported methods. We experimentally confirm that Latent NeRFs trained with IG-AE present an improved quality compared to a standard autoencoder, all while exhibiting training and rendering accelerations with respect to NeRFs trained in the image space. Our project page can be found at https://ig-ae.github.io .
Submitted 24 February, 2025; v1 submitted 30 October, 2024;
originally announced October 2024.
-
Improving Consistency Models with Generator-Augmented Flows
Authors:
Thibaut Issenhuth,
Sangchul Lee,
Ludovic Dos Santos,
Jean-Yves Franceschi,
Chansoo Kim,
Alain Rakotomamonjy
Abstract:
Consistency models imitate the multi-step sampling of score-based diffusion in a single forward pass of a neural network. They can be learned in two ways: consistency distillation and consistency training. The former relies on the true velocity field of the corresponding differential equation, approximated by a pre-trained neural network. In contrast, the latter uses a single-sample Monte Carlo estimate of this velocity field. The related estimation error induces a discrepancy between consistency distillation and training that, we show, still holds in the continuous-time limit. To alleviate this issue, we propose a novel flow that transports noisy data towards their corresponding outputs derived from a consistency model. We prove that this flow reduces the previously identified discrepancy and the noise-data transport cost. Consequently, our method not only accelerates consistency training convergence but also enhances its overall performance. The code is available at: https://github.com/thibautissenhuth/consistency_GC.
Submitted 5 February, 2025; v1 submitted 13 June, 2024;
originally announced June 2024.
-
Exploring 3D-aware Latent Spaces for Efficiently Learning Numerous Scenes
Authors:
Antoine Schnepf,
Karim Kassab,
Jean-Yves Franceschi,
Laurent Caraffa,
Flavian Vasile,
Jeremie Mary,
Andrew Comport,
Valérie Gouet-Brunet
Abstract:
We present a method enabling the scaling of NeRFs to learn a large number of semantically similar scenes. We combine two techniques to improve the required training time and memory cost per scene. First, we learn a 3D-aware latent space in which we train Tri-Plane scene representations, hence reducing the resolution at which scenes are learned. Moreover, we present a way to share common information across scenes, hence reducing the model complexity needed to learn a particular scene. Our method reduces effective per-scene memory costs by 44% and per-scene time costs by 86% when training 1000 scenes. Our project page can be found at https://3da-ae.github.io .
Submitted 17 May, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
Differentially Private Gradient Flow based on the Sliced Wasserstein Distance
Authors:
Ilana Sebag,
Muni Sreenivas Pydi,
Jean-Yves Franceschi,
Alain Rakotomamonjy,
Mike Gartrell,
Jamal Atif,
Alexandre Allauzen
Abstract:
Safeguarding privacy in sensitive training data is paramount, particularly in the context of generative modeling. This can be achieved through either differentially private stochastic gradient descent or a differentially private metric for training models or generators. In this paper, we introduce a novel differentially private generative modeling approach based on a gradient flow in the space of probability measures. To this end, we define the gradient flow of the Gaussian-smoothed Sliced Wasserstein Distance, including the associated stochastic differential equation (SDE). By discretizing and defining a numerical scheme for solving this SDE, we demonstrate the link between smoothing and differential privacy based on a Gaussian mechanism, due to a specific form of the SDE's drift term. We then analyze the differential privacy guarantee of our gradient flow, which accounts for both the smoothing and the Wiener process introduced by the SDE itself. Experiments show that our proposed model can generate higher-fidelity data at a low privacy budget compared to a generator-based model, offering a promising alternative.
Submitted 20 January, 2025; v1 submitted 13 December, 2023;
originally announced December 2023.
-
RefinedFields: Radiance Fields Refinement for Planar Scene Representations
Authors:
Karim Kassab,
Antoine Schnepf,
Jean-Yves Franceschi,
Laurent Caraffa,
Jeremie Mary,
Valérie Gouet-Brunet
Abstract:
Planar scene representations have recently witnessed increased interest for modeling scenes from images, as their lightweight planar structure enables compatibility with image-based models. Notably, K-Planes have gained particular attention as they extend planar scene representations to support in-the-wild scenes, in addition to object-level scenes. However, their visual quality has recently lagged behind that of state-of-the-art techniques. To reduce this gap, we propose RefinedFields, a method that leverages pre-trained networks to refine K-Planes scene representations via optimization guidance using an alternating training procedure. We carry out extensive experiments and verify the merit of our method on synthetic data and real tourism photo collections. RefinedFields enhances rendered scenes with richer details and improves upon its base representation on the task of novel view synthesis. Our project page can be found at https://refinedfields.github.io .
Submitted 26 May, 2025; v1 submitted 1 December, 2023;
originally announced December 2023.
-
Unifying GANs and Score-Based Diffusion as Generative Particle Models
Authors:
Jean-Yves Franceschi,
Mike Gartrell,
Ludovic Dos Santos,
Thibaut Issenhuth,
Emmanuel de Bézenac,
Mickaël Chen,
Alain Rakotomamonjy
Abstract:
Particle-based deep generative models, such as gradient flows and score-based diffusion models, have recently gained traction thanks to their striking performance. Their principle of displacing particle distributions using differential equations is conventionally seen as opposed to the previously widespread generative adversarial networks (GANs), which involve training a pushforward generator network. In this paper, we challenge this interpretation and propose a novel framework that unifies particle and adversarial generative models by framing generator training as a generalization of particle models. This suggests that a generator is an optional addition to any such generative model. Consequently, integrating a generator into a score-based diffusion model and training a GAN without a generator naturally emerge from our framework. We empirically test the viability of these original models as proofs of concept of potential applications of our framework.
Submitted 21 December, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Modeling opinion polarization on social media: application to Covid-19 vaccination hesitancy in Italy
Authors:
Jonathan Franceschi,
Lorenzo Pareschi,
Elena Bellodi,
Marco Gavanelli,
Marco Bresadola
Abstract:
The SARS-CoV-2 pandemic reminded us how vaccination can be a divisive topic on which the public conversation is permeated by misleading claims, and opinions tend to polarize, especially on online social networks. In this work, motivated by recent natural language processing techniques to systematically extract and quantify opinions from text messages, we present a differential framework for bivariate opinion formation dynamics that is coupled with a compartmental model for fake news dissemination. Through a mean-field analysis, we demonstrate that the resulting Fokker-Planck system can reproduce bimodal distributions of opinions as observed in polarization dynamics. The model is then applied to sentiment analysis data from social media platforms in Italy, in order to analyze the evolution of opinions about Covid-19 vaccination. We show through numerical simulations that the model is capable of correctly describing the formation of the bimodal opinion structure observed in the vaccine-hesitant dataset, which bears witness to the known polarization effects that happen within closed online communities.
Submitted 2 February, 2023;
originally announced February 2023.
-
Continuous PDE Dynamics Forecasting with Implicit Neural Representations
Authors:
Yuan Yin,
Matthieu Kirchmeyer,
Jean-Yves Franceschi,
Alain Rakotomamonjy,
Patrick Gallinari
Abstract:
Effective data-driven PDE forecasting methods often rely on fixed spatial and/or temporal discretizations. This imposes limitations in real-world applications like weather prediction where flexible extrapolation at arbitrary spatiotemporal locations is required. We address this problem by introducing a new data-driven approach, DINo, that models a PDE's flow with continuous-time dynamics of spatially continuous functions. This is achieved by embedding spatial observations independently of their discretization via Implicit Neural Representations in a small latent space temporally driven by a learned ODE. This separate and flexible treatment of time and space makes DINo the first data-driven model to combine the following advantages: it extrapolates at arbitrary spatial and temporal locations; it can learn from sparse irregular grids or manifolds; at test time, it generalizes to new grids or resolutions. DINo outperforms alternative neural PDE forecasters in a variety of challenging generalization scenarios on representative PDE systems.
Submitted 15 February, 2023; v1 submitted 29 September, 2022;
originally announced September 2022.
-
Spreading of fake news, competence, and learning: kinetic modeling and numerical approximation
Authors:
Jonathan Franceschi,
Lorenzo Pareschi
Abstract:
The rise of social networks as the primary means of communication in almost every country in the world has simultaneously triggered an increase in the amount of fake news circulating online. This fact became particularly evident during the 2016 U.S. political elections and even more so with the advent of the COVID-19 pandemic. Several research studies have shown how the effects of fake news dissemination can be mitigated by promoting greater competence through lifelong learning and discussion communities, and more generally through rigorous training in the scientific method and broad interdisciplinary education. The urgent need for models that can describe the growing infodemic of fake news has been highlighted by the current pandemic. The resulting slowdown in vaccination campaigns due to misinformation, and more generally the inability of individuals to discern the reliability of information, is posing enormous risks to the governments of many countries. In this research, using the tools of kinetic theory, we describe the interaction between fake news spreading and competence of individuals through multi-population models in which fake news spreads analogously to an infectious disease with different impact depending on the level of competence of individuals. The level of competence, in particular, is subject to an evolutionary dynamic due to both social interactions between agents and external learning dynamics. The results show how the model is able to correctly describe the dynamics of diffusion of fake news and the important role of competence in their containment.
Submitted 28 September, 2021;
originally announced September 2021.
-
A Neural Tangent Kernel Perspective of GANs
Authors:
Jean-Yves Franceschi,
Emmanuel de Bézenac,
Ibrahim Ayed,
Mickaël Chen,
Sylvain Lamprier,
Patrick Gallinari
Abstract:
We propose a novel theoretical framework of analysis for Generative Adversarial Networks (GANs). We reveal a fundamental flaw of previous analyses which, by incorrectly modeling GANs' training scheme, are subject to ill-defined discriminator gradients. We overcome this issue which impedes a principled study of GAN training, solving it within our framework by taking into account the discriminator's architecture. To this end, we leverage the theory of infinite-width neural networks for the discriminator via its Neural Tangent Kernel. We characterize the trained discriminator for a wide range of losses and establish general differentiability properties of the network. From this, we derive new insights about the convergence of the generated distribution, advancing our understanding of GANs' training dynamics. We empirically corroborate these results via an analysis toolkit based on our framework, unveiling intuitions that are consistent with GAN practice.
Submitted 7 November, 2022; v1 submitted 10 June, 2021;
originally announced June 2021.
-
PDE-Driven Spatiotemporal Disentanglement
Authors:
Jérémie Donà,
Jean-Yves Franceschi,
Sylvain Lamprier,
Patrick Gallinari
Abstract:
A recent line of work in the machine learning community addresses the problem of predicting high-dimensional spatiotemporal phenomena by leveraging specific tools from the theory of differential equations. Following this direction, we propose in this article a novel and general paradigm for this task based on a resolution method for partial differential equations: the separation of variables. This inspiration allows us to introduce a dynamical interpretation of spatiotemporal disentanglement. It induces a principled model based on learning disentangled spatial and temporal representations of a phenomenon to accurately predict future observations. We experimentally demonstrate the performance and broad applicability of our method against prior state-of-the-art models on physical and synthetic video datasets.
Submitted 23 March, 2021; v1 submitted 4 August, 2020;
originally announced August 2020.
-
Stochastic Latent Residual Video Prediction
Authors:
Jean-Yves Franceschi,
Edouard Delasalles,
Mickaël Chen,
Sylvain Lamprier,
Patrick Gallinari
Abstract:
Designing video prediction models that account for the inherent uncertainty of the future is challenging. Most works in the literature are based on stochastic image-autoregressive recurrent networks, which raises several performance and applicability issues. An alternative is to use fully latent temporal models which untie frame synthesis and temporal dynamics. However, no such model for stochastic video prediction has been proposed in the literature yet, due to design and training difficulties. In this paper, we overcome these difficulties by introducing a novel stochastic temporal model whose dynamics are governed in a latent space by a residual update rule. This first-order scheme is motivated by discretization schemes of differential equations. It naturally models video dynamics as it allows our simpler, more interpretable, latent model to outperform prior state-of-the-art methods on challenging datasets.
Submitted 7 August, 2020; v1 submitted 21 February, 2020;
originally announced February 2020.
-
Unsupervised Scalable Representation Learning for Multivariate Time Series
Authors:
Jean-Yves Franceschi,
Aymeric Dieuleveut,
Martin Jaggi
Abstract:
Time series constitute a challenging data type for machine learning algorithms, due to their highly variable lengths and sparse labeling in practice. In this paper, we tackle this challenge by proposing an unsupervised method to learn universal embeddings of time series. Unlike previous works, it is scalable with respect to their length and we demonstrate the quality, transferability and practicability of the learned representations with thorough experiments and comparisons. To this end, we combine an encoder based on causal dilated convolutions with a novel triplet loss employing time-based negative sampling, obtaining general-purpose representations for variable length and multivariate time series.
Submitted 3 January, 2020; v1 submitted 30 January, 2019;
originally announced January 2019.
-
Robustness of classifiers to uniform $\ell_p$ and Gaussian noise
Authors:
Jean-Yves Franceschi,
Alhussein Fawzi,
Omar Fawzi
Abstract:
We study the robustness of classifiers to various kinds of random noise models. In particular, we consider noise drawn uniformly from the $\ell_p$ ball for $p \in [1, \infty]$ and Gaussian noise with an arbitrary covariance matrix. We characterize this robustness to random noise in terms of the distance to the decision boundary of the classifier. This analysis applies to linear classifiers as well as classifiers with locally approximately flat decision boundaries, a condition which is satisfied by state-of-the-art deep neural networks. The predicted robustness is verified experimentally.
Submitted 22 February, 2018;
originally announced February 2018.