Showing 1–5 of 5 results for author: Savov, N

  1. arXiv:2505.22246  [pdf, ps, other]

    cs.CV

    StateSpaceDiffuser: Bringing Long Context to Diffusion World Models

    Authors: Nedko Savov, Naser Kazemi, Deheng Zhang, Danda Pani Paudel, Xi Wang, Luc Van Gool

    Abstract: World models have recently become promising tools for predicting realistic visuals based on actions in complex environments. However, their reliance on a short sequence of observations causes them to quickly lose track of context. As a result, visual consistency breaks down after just a few steps, and generated scenes no longer reflect information seen earlier. This limitation of the state-of-the-…

    Submitted 28 May, 2025; originally announced May 2025.

  2. arXiv:2504.02515  [pdf, other]

    cs.CV

    Exploration-Driven Generative Interactive Environments

    Authors: Nedko Savov, Naser Kazemi, Mohammad Mahdi, Danda Pani Paudel, Xi Wang, Luc Van Gool

    Abstract: Modern world models require costly and time-consuming collection of large video datasets with action demonstrations by people or by environment-specific agents. To simplify training, we focus on using many virtual environments for inexpensive, automatically collected interaction data. Genie, a recent multi-environment world model, demonstrates simulation abilities of many environments with shared…

    Submitted 3 April, 2025; originally announced April 2025.

    Comments: Accepted at CVPR 2025

  3. arXiv:2409.06445  [pdf, other]

    cs.CV cs.AI

    Learning Generative Interactive Environments By Trained Agent Exploration

    Authors: Naser Kazemi, Nedko Savov, Danda Paudel, Luc Van Gool

    Abstract: World models are increasingly pivotal in interpreting and simulating the rules and actions of complex environments. Genie, a recent model, excels at learning from visually diverse environments but relies on costly human-collected data. We observe that their alternative method of using random agents is too limited to explore the environment. We propose to improve the model by employing reinforcemen…

    Submitted 18 October, 2024; v1 submitted 10 September, 2024; originally announced September 2024.

  4. arXiv:2312.08558  [pdf, other]

    cs.CV

    Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction

    Authors: M. Eren Akbiyik, Nedko Savov, Danda Pani Paudel, Nikola Popovic, Christian Vater, Otmar Hilliges, Luc Van Gool, Xi Wang

    Abstract: Understanding drivers' decision-making is crucial for road safety. Although predicting the ego-vehicle's path is valuable for driver-assistance systems, existing methods mainly focus on external factors like other vehicles' motions, often neglecting the driver's attention and intent. To address this gap, we infer the ego-trajectory by integrating the driver's gaze and the surrounding scene. We int…

    Submitted 15 April, 2025; v1 submitted 13 December, 2023; originally announced December 2023.

    Comments: Accepted to 13th International Conference on Learning Representations (ICLR 2025), 29 pages

  5. arXiv:2010.11844  [pdf, other]

    cs.CV

    Spatio-temporal Features for Generalized Detection of Deepfake Videos

    Authors: Ipek Ganiyusufoglu, L. Minh Ngô, Nedko Savov, Sezer Karaoglu, Theo Gevers

    Abstract: For deepfake detection, video-level detectors have not been explored as extensively as image-level detectors, which do not exploit temporal data. In this paper, we empirically show that existing approaches on image and sequence classifiers generalize poorly to new manipulation techniques. To this end, we propose spatio-temporal features, modeled by 3D CNNs, to extend the generalization capabilitie…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: Submitted to Computer Vision and Image Understanding (CVIU)