-
LTX-Video: Realtime Video Latent Diffusion
Authors:
Yoav HaCohen,
Nisan Chiprut,
Benny Brazowski,
Daniel Shalem,
Dudu Moshe,
Eitan Richardson,
Eran Levin,
Guy Shiran,
Nir Zabari,
Ori Gordon,
Poriya Panet,
Sapir Weissbuch,
Victor Kulikov,
Yaki Bitterman,
Zeev Melumian,
Ofir Bibi
Abstract:
We introduce LTX-Video, a transformer-based latent diffusion model that adopts a holistic approach to video generation by seamlessly integrating the responsibilities of the Video-VAE and the denoising transformer. Unlike existing methods, which treat these components as independent, LTX-Video aims to optimize their interaction for improved efficiency and quality. At its core is a carefully designed Video-VAE that achieves a high compression ratio of 1:192, with spatiotemporal downscaling of 32 x 32 x 8 pixels per token, enabled by relocating the patchifying operation from the transformer's input to the VAE's input. Operating in this highly compressed latent space enables the transformer to efficiently perform full spatiotemporal self-attention, which is essential for generating high-resolution videos with temporal consistency. However, the high compression inherently limits the representation of fine details. To address this, our VAE decoder is tasked with both latent-to-pixel conversion and the final denoising step, producing the clean result directly in pixel space. This approach preserves the ability to generate fine details without incurring the runtime cost of a separate upsampling module. Our model supports diverse use cases, including text-to-video and image-to-video generation, with both capabilities trained simultaneously. It achieves faster-than-real-time generation, producing 5 seconds of 24 fps video at 768x512 resolution in just 2 seconds on an Nvidia H100 GPU, outperforming all existing models of similar scale. The source code and pre-trained models are publicly available, setting a new benchmark for accessible and scalable video generation.
Submitted 30 December, 2024;
originally announced January 2025.
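The compression and speed figures quoted in the LTX-Video abstract can be checked with a little arithmetic. The sketch below is a back-of-the-envelope calculation, not the authors' code. It assumes 3-channel RGB input, floor division at the clip boundaries, and no special handling of the first frame; none of these is stated in the abstract. The per-token latent width it derives is implied by the quoted numbers rather than quoted itself.

```python
# Sanity check of the numbers in the LTX-Video abstract.
# Assumptions (not stated in the abstract): RGB input with 3 channels,
# floor division at the clip boundaries, no special first-frame handling.

PATCH_W, PATCH_H, PATCH_T = 32, 32, 8   # pixels per token, from the abstract
RGB_CHANNELS = 3                        # assumption
COMPRESSION = 192                       # the 1:192 ratio from the abstract

# Raw values covered by one token, and the latent width the ratio implies.
values_per_token = PATCH_W * PATCH_H * PATCH_T * RGB_CHANNELS
implied_latent_channels = values_per_token // COMPRESSION


def token_count(width: int, height: int, frames: int) -> int:
    """Number of latent tokens for a clip, under the assumptions above."""
    return (width // PATCH_W) * (height // PATCH_H) * (frames // PATCH_T)


# The abstract's example: 5 seconds of 24 fps video at 768x512.
frames = 5 * 24
tokens = token_count(768, 512, frames)

# 5 seconds of video produced in 2 seconds of compute.
realtime_factor = 5 / 2

print(values_per_token, implied_latent_channels, tokens, realtime_factor)
```

Under these assumptions each token covers 24,576 raw values, and the transformer attends over a few thousand tokens for the whole clip, which is what makes full spatiotemporal self-attention affordable.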
-
Transport and Optimal Control of Vaccination Dynamics for COVID-19
Authors:
Mohamed Abdelaziz Zaitri,
Mohand Ouamer Bibi,
Delfim F. M. Torres
Abstract:
We develop a mathematical model for transferring the vaccine BNT162b2 based on the heat diffusion equation. Then, we apply optimal control theory to the proposed generalized SEIR model. We introduce vaccination for the susceptible population to control the spread of the COVID-19 epidemic. For this, we use the Pontryagin minimum principle to find the necessary optimality conditions for the optimal control. The optimal control problem and the heat diffusion equation are solved numerically. Finally, several simulations are done to study and predict the spread of the COVID-19 epidemic in Italy. In particular, we compare the model in the presence and absence of vaccination.
Submitted 21 April, 2022; v1 submitted 1 February, 2022;
originally announced February 2022.
-
Temporally stable video segmentation without video annotations
Authors:
Aharon Azulay,
Tavi Halperin,
Orestis Vantzos,
Nadav Borenstein,
Ofir Bibi
Abstract:
Temporally consistent dense video annotations are scarce and hard to collect. In contrast, image segmentation datasets (and pre-trained models) are ubiquitous, and easier to label for any novel task. In this paper, we introduce a method to adapt still image segmentation models to video in an unsupervised manner, by using an optical flow-based consistency measure. To ensure that the inferred segmented videos appear more stable in practice, we verify that the consistency measure is well correlated with human judgement via a user study. Training a new multi-input multi-output decoder using this measure as a loss, together with a technique for refining current image segmentation datasets and a temporal weighted-guided filter, we observe stability improvements in the generated segmented videos with minimal loss of accuracy.
Submitted 17 March, 2022; v1 submitted 17 October, 2021;
originally announced October 2021.
-
Optimal Control to Limit the Spread of COVID-19 in Italy
Authors:
Mohamed Abdelaziz Zaitri,
Mohand Ouamer Bibi,
Delfim F. M. Torres
Abstract:
We apply optimal control theory to a generalized SEIR-type model. The proposed system has three controls, representing social distancing, preventive means, and treatment measures to combat the spread of the COVID-19 pandemic. We analyze this optimal control problem with respect to real transmission data from Italy. Our results show the appropriateness of the model, in particular with respect to the number of quarantined/hospitalized (confirmed and infected) and recovered individuals. Considering the Pontryagin controls, we show how in a perfect world one could have drastically diminished the number of susceptible, exposed, infected, quarantined/hospitalized, and deceased individuals, by increasing the population of insusceptible/protected.
Submitted 25 July, 2021;
originally announced July 2021.
-
Endless Loops: Detecting and Animating Periodic Patterns in Still Images
Authors:
Tavi Halperin,
Hanit Hakim,
Orestis Vantzos,
Gershon Hochman,
Netai Benaim,
Lior Sassy,
Michael Kupchik,
Ofir Bibi,
Ohad Fried
Abstract:
We present an algorithm for producing a seamless animated loop from a single image. The algorithm detects periodic structures, such as the windows of a building or the steps of a staircase, and generates a non-trivial displacement vector field that maps each segment of the structure onto a neighboring segment along a user- or auto-selected main direction of motion. This displacement field is used, together with suitable temporal and spatial smoothing, to warp the image and produce the frames of a continuous animation loop. Our cinemagraphs are created in under a second on a mobile device. Over 140,000 users downloaded our app and exported over 350,000 cinemagraphs. Moreover, we conducted two user studies that show that users prefer our method for creating surreal and structured cinemagraphs compared to more manual approaches and compared to previous methods.
Submitted 19 May, 2021;
originally announced May 2021.
-
Clear Skies Ahead: Towards Real-Time Automatic Sky Replacement in Video
Authors:
Tavi Halperin,
Harel Cain,
Ofir Bibi,
Michael Werman
Abstract:
Digital videos such as those captured by a smartphone often exhibit exposure inconsistencies, a poorly exposed sky, or simply suffer from an uninteresting or plain-looking sky. Professionals may edit these videos using advanced and time-consuming tools unavailable to most users, to replace the sky with a more expressive or imaginative sky. In this work, we propose an algorithm for automatic replacement of the sky region in a video with a different sky, providing nonprofessional users with a simple yet efficient tool to seamlessly replace the sky. The method is fast, achieving close to real-time performance on mobile devices, and the user's involvement can remain as limited as simply selecting the replacement sky.
Submitted 6 March, 2019;
originally announced March 2019.