-
Training Video Foundation Models with NVIDIA NeMo
Authors:
Zeeshan Patel,
Ethan He,
Parth Mannan,
Xiaowei Ren,
Ryan Wolf,
Niket Agarwal,
Jacob Huffman,
Zhuoyao Wang,
Carl Wang,
Jack Chang,
Yan Bai,
Tommy Huang,
Linnan Wang,
Sahil Jain,
Shanmugam Ramasamy,
Joseph Jennings,
Ekaterina Sirazitdinova,
Oleg Sudakov,
Mingyuan Ma,
Bobby Chen,
Forrest Lin,
Hao Wang,
Vasanth Rao Naik Sabavat,
Sriharsha Niverty,
Rong Ou,
et al. (4 additional authors not shown)
Abstract:
Video Foundation Models (VFMs) have recently been used to simulate the real world to train physical AI systems and develop creative visual experiences. However, there are significant challenges in training large-scale VFMs that can generate high-quality videos. We present a scalable, open-source VFM training pipeline with NVIDIA NeMo, providing accelerated video dataset curation, multimodal data loading, and parallelized video diffusion model training and inference. We also provide a comprehensive performance analysis highlighting best practices for efficient VFM training and inference.
Submitted 17 March, 2025;
originally announced March 2025.
-
Parallelization of Network Dynamics Computations in Heterogeneous Distributed Environment
Authors:
Oleksandr Sudakov,
Volodymyr Maistrenko
Abstract:
This paper addresses the problem of parallelizing computations to study non-linear dynamics in large networks of non-locally coupled oscillators using heterogeneous computing resources. The proposed approach can be applied to a variety of non-linear dynamics models with runtime specification of parameters and network topologies. Parallelizing the solution of equations for different network elements is performed transparently and, in contrast to available tools, does not require parallel programming from end-users. The runtime scheduler takes into account the performance of computing and communication resources to reduce downtime and to achieve a quasi-optimal parallelizing speed-up. The proposed approach was implemented, and its efficiency is demonstrated by numerous applications: simulating large dynamical networks with 10^3-10^8 elements described by Hodgkin-Huxley, FitzHugh-Nagumo, and Kuramoto models, investigating pathological synchronization during Parkinson's disease, analyzing multi-stability, studying chimera and solitary states in 3D networks, etc. All the above computations may be performed using symmetrical multiprocessors, graphics processing units, and a network of workstations within the same run, and it was demonstrated that near-linear speed-up can be achieved for large networks. The proposed approach is promising for extension to new hardware like edge-computing devices.
Submitted 2 July, 2025; v1 submitted 24 October, 2024;
originally announced October 2024.
-
Intelligence and Motion Models of Continuum Robots: an Overview
Authors:
Oxana Shamilyan,
Ievgen Kabin,
Zoya Dyka,
Oleksandr Sudakov,
Andrii Cherninskyi,
Marcin Brzozowski,
Peter Langendoerfer
Abstract:
Many technical solutions are bio-inspired. Octopus-inspired robotic arms belong to continuum robots, which are used in minimally invasive surgery or for technical system restoration in areas that are difficult to access. Continuum robot missions are bound to their motions, whereby the motion of the robots is controlled by humans via wireless communication. In case of a lost connection, robot autonomy is required. Distributed control and distributed decision-making mechanisms based on artificial intelligence approaches can be a promising solution to achieve autonomy of technical systems and to increase their resilience. However, these methods are not well investigated yet. Octopuses are the living example of natural distributed intelligence, but their learning and decision-making mechanisms are also not fully investigated and understood yet. Our major interest is investigating mechanisms of Distributed Artificial Intelligence as a basis for improving resilience of complex systems. We decided to use a physical continuum robot prototype that is able to perform some basic movements for our research. The idea is to research how a technical system can be empowered to combine movements into sequences of motions by itself. For the experimental investigations, a suitable physical prototype has to be selected, and its motion control has to be implemented and automated. In this paper, we give an overview combining different fields of research, such as Distributed Artificial Intelligence and continuum robots, based on 98 publications. We provide a detailed description of the basic motion control models of continuum robots based on the literature reviewed, discuss different aspects of autonomy, and give an overview of physical prototypes of continuum robots.
Submitted 9 April, 2024;
originally announced April 2024.
-
Resilience Aspects in Distributed Wireless Electroencephalographic Sampling
Authors:
R. Natarov,
O. Sudakov,
Z. Dyka,
I. Kabin,
O. Maksymyuk,
O. Iegorova,
O. Krishtal,
P. Langendörfer
Abstract:
Resilience aspects of remote electroencephalography sampling are considered. The possibility of using motion sensor data and measurements of industrial power network interference to detect failed sampling channels is demonstrated. No significant correlation between signals of failed channels and motion sensor data is shown. The level of the 50 Hz spectral component from failed channels differs significantly from the level of the 50 Hz component of a normally operating channel. Conclusions about the application of these results for increasing the resilience of electroencephalography sampling are drawn.
Submitted 4 January, 2022;
originally announced January 2022.
-
Exploiting EEG Signals for Eye Motion Tracking
Authors:
R. Kovtun,
S. Radchenko,
A. Netreba,
O. Sudakov,
R. Natarov,
Z. Dyka,
I. Kabin,
P. Langendörfer
Abstract:
Human eye tracking devices can help to investigate principles of processing visual information by humans. The attention focus movement during the gaze can be used for behavioural analysis of humans. In this work we describe our experimental system that we designed for synchronous recording of electroencephalographic signals, events of external tests and gaze direction. As external tests we used virtual cognitive tests. We investigated the possibility of exploiting electroencephalographic signals for eye motion tracking. Our experimental system is a first step toward designing an automatic eye tracking system and can additionally be used as laboratory equipment for teaching students.
Submitted 4 January, 2022;
originally announced January 2022.
-
Artificial Neural Network Surrogate Modeling of Oil Reservoir: a Case Study
Authors:
Oleg Sudakov,
Dmitri Koroteev,
Boris Belozerov,
Evgeny Burnaev
Abstract:
We develop a data-driven model, introducing recent advances in machine learning to reservoir simulation. We use a conventional reservoir modeling tool to generate a training set and a special ensemble of artificial neural networks (ANNs) to build a predictive model. The ANN-based model can reproduce the time dependence of fluids and pressure distribution within the computational cells of the reservoir model. We compare the performance of the ANN-based model with conventional reservoir modeling and illustrate that the ANN-based model (1) is able to capture all the output parameters of the conventional model with very high accuracy and (2) demonstrates much higher computational performance. We finally elaborate on further options for research and development within the area of reservoir modeling.
Submitted 19 May, 2019;
originally announced May 2019.
-
Reconstruction of 3D Porous Media From 2D Slices
Authors:
Denis Volkhonskiy,
Ekaterina Muravleva,
Oleg Sudakov,
Denis Orlov,
Boris Belozerov,
Evgeny Burnaev,
Dmitry Koroteev
Abstract:
In many branches of earth sciences, the problem of rock study on the micro-level arises. However, obtaining a significant number of representative samples is not always feasible. Thus the problem of generating samples with similar properties becomes relevant. In this paper, we propose a novel deep learning architecture for three-dimensional porous media reconstruction from two-dimensional slices. We fit a distribution on all possible three-dimensional structures of a specific type based on the given dataset of samples. Then, given partial information (central slices), we recover the three-dimensional structure around such slices as the most probable one according to that constructed distribution. Technically, we implement this in the form of a deep neural network with encoder, generator and discriminator modules. Numerical experiments show that this method provides a good reconstruction in terms of Minkowski functionals.
Submitted 6 August, 2021; v1 submitted 29 January, 2019;
originally announced January 2019.
-
Driving Digital Rock towards Machine Learning: predicting permeability with Gradient Boosting and Deep Neural Networks
Authors:
Oleg Sudakov,
Evgeny Burnaev,
Dmitry Koroteev
Abstract:
We present a research study aimed at testing the applicability of machine learning techniques for predicting the permeability of digitized rock samples. We prepare a training set containing 3D images of sandstone samples imaged with X-ray microtomography and corresponding permeability values simulated with a pore network approach. We also use Minkowski functionals and Deep Learning-based descriptors of 3D images and 2D slices as input features for predictive model training and prediction. We compare the predictive power of various feature sets and methods. The latter include Gradient Boosting and various architectures of Deep Neural Networks (DNN). The results demonstrate the applicability of machine learning for image-based permeability prediction and open a new area of Digital Rock research.
Submitted 14 March, 2018; v1 submitted 2 March, 2018;
originally announced March 2018.