Showing 1–8 of 8 results for author: Kalatzis, D

  1. arXiv:2506.21446  [pdf, other]

    cs.CV

    Controllable 3D Placement of Objects with Scene-Aware Diffusion Models

    Authors: Mohamed Omran, Dimitris Kalatzis, Jens Petersen, Amirhossein Habibian, Auke Wiggers

    Abstract: Image editing approaches have become more powerful and flexible with the advent of powerful text-conditioned generative models. However, placing objects in an environment with a precise location and orientation still remains a challenge, as this typically requires carefully crafted inpainting masks or prompts. In this work, we show that a carefully designed visual map, combined with coarse object…

    Submitted 26 June, 2025; originally announced June 2025.

  2. arXiv:2504.16740  [pdf, other]

    cs.CV

    Gaussian Splatting is an Effective Data Generator for 3D Object Detection

    Authors: Farhad G. Zanjani, Davide Abati, Auke Wiggers, Dimitris Kalatzis, Jens Petersen, Hong Cai, Amirhossein Habibian

    Abstract: We investigate data augmentation for 3D object detection in autonomous driving. We utilize recent advancements in 3D reconstruction based on Gaussian Splatting for 3D object placement in driving scenes. Unlike existing diffusion-based methods that synthesize images conditioned on BEV layouts, our approach places 3D objects directly in the reconstructed 3D space with explicitly imposed geometric tr…

    Submitted 23 April, 2025; originally announced April 2025.

  3. arXiv:2310.01258  [pdf, other]

    eess.IV cs.CV cs.LG

    MobileNVC: Real-time 1080p Neural Video Compression on a Mobile Device

    Authors: Ties van Rozendaal, Tushar Singhal, Hoang Le, Guillaume Sautiere, Amir Said, Krishna Buska, Anjuman Raha, Dimitris Kalatzis, Hitarth Mehta, Frank Mayer, Liang Zhang, Markus Nagel, Auke Wiggers

    Abstract: Neural video codecs have recently become competitive with standard codecs such as HEVC in the low-delay setting. However, most neural codecs are large floating-point networks that use pixel-dense warping operations for temporal modeling, making them too computationally expensive for deployment on mobile devices. Recent work has demonstrated that running a neural decoder in real time on mobile is f…

    Submitted 15 November, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: Matches version published at WACV 2024

  4. arXiv:2106.05367  [pdf, other]

    cs.LG stat.ML

    Pulling back information geometry

    Authors: Georgios Arvanitidis, Miguel González-Duque, Alison Pouplin, Dimitris Kalatzis, Søren Hauberg

    Abstract: Latent space geometry has shown itself to provide a rich and rigorous framework for interacting with the latent variables of deep generative models. The existing theory, however, relies on the decoder being a Gaussian distribution as its simple reparametrization allows us to interpret the generating process as a random projection of a deterministic manifold. Consequently, this approach breaks down…

    Submitted 23 April, 2022; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: Presented at AISTATS 2022

  5. arXiv:2106.03500  [pdf, other]

    cs.LG stat.ML

    Density estimation on smooth manifolds with normalizing flows

    Authors: Dimitris Kalatzis, Johan Ziruo Ye, Alison Pouplin, Jesper Wohlert, Søren Hauberg

    Abstract: We present a framework for learning probability distributions on topologically non-trivial manifolds, utilizing normalizing flows. Current methods focus on manifolds that are homeomorphic to Euclidean space, enforce strong structural priors on the learned models or use operations that do not easily scale to high dimensions. In contrast, our method learns distributions on a data manifold by "gluing…

    Submitted 9 July, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  6. arXiv:2002.05227  [pdf, other]

    cs.LG stat.ML

    Variational Autoencoders with Riemannian Brownian Motion Priors

    Authors: Dimitris Kalatzis, David Eklund, Georgios Arvanitidis, Søren Hauberg

    Abstract: Variational Autoencoders (VAEs) represent the given data in a low-dimensional latent space, which is generally assumed to be Euclidean. This assumption naturally leads to the common choice of a standard Gaussian prior over continuous latent variables. Recent work has, however, shown that this prior has a detrimental effect on model capacity, leading to subpar performance. We propose that the Eucli…

    Submitted 7 August, 2020; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: Published in ICML 2020

    Journal ref: Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, PMLR 119, 2020

  7. arXiv:1810.10401  [pdf, other]

    cs.CL

    Image-based Natural Language Understanding Using 2D Convolutional Neural Networks

    Authors: Erinc Merdivan, Anastasios Vafeiadis, Dimitrios Kalatzis, Sten Hanke, Johannes Kropf, Konstantinos Votis, Dimitrios Giakoumis, Dimitrios Tzovaras, Liming Chen, Raouf Hamzaoui, Matthieu Geist

    Abstract: We propose a new approach to natural language understanding in which we consider the input text as an image and apply 2D Convolutional Neural Networks to learn the local and global semantics of the sentences from the variations of the visual patterns of words. Our approach demonstrates that it is possible to get semantically meaningful features from images with text without using optical character…

    Submitted 6 November, 2018; v1 submitted 24 October, 2018; originally announced October 2018.

    Comments: Keywords: Natural Language Processing (NLP), Sentiment Analysis, Dialogue Modeling

  8. arXiv:1612.00347  [pdf, other]

    cs.CL cs.AI cs.HC

    Bootstrapping incremental dialogue systems: using linguistic knowledge to learn from minimal data

    Authors: Dimitrios Kalatzis, Arash Eshghi, Oliver Lemon

    Abstract: We present a method for inducing new dialogue systems from very small amounts of unannotated dialogue data, showing how word-level exploration using Reinforcement Learning (RL), combined with an incremental and semantic grammar - Dynamic Syntax (DS) - allows systems to discover, generate, and understand many new dialogue variants. The method avoids the use of expensive and time-consuming dialogue…

    Submitted 1 December, 2016; originally announced December 2016.