Showing 1–5 of 5 results for author: Karypidis, E

  1. arXiv:2504.16064

    cs.CV

    Boosting Generative Image Modeling via Joint Image-Feature Synthesis

    Authors: Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis

    Abstract: Latent diffusion models (LDMs) dominate high-quality image generation, yet integrating representation learning with generative modeling remains a challenge. We introduce a novel generative image modeling framework that seamlessly bridges this gap by leveraging a diffusion model to jointly model low-level image latents (from a variational autoencoder) and high-level semantic features (from a pretra…

    Submitted 22 April, 2025; originally announced April 2025.

  2. arXiv:2503.15697

    cs.CV

    Technical Report for the 5th CLVision Challenge at CVPR: Addressing the Class-Incremental with Repetition using Unlabeled Data -- 4th Place Solution

    Authors: Panagiota Moraiti, Efstathios Karypidis

    Abstract: This paper outlines our approach to the 5th CLVision challenge at CVPR, which addresses the Class-Incremental with Repetition (CIR) scenario. In contrast to traditional class incremental learning, this novel setting introduces unique challenges and research opportunities, particularly through the integration of unlabeled data into the training process. In the CIR scenario, encountered classes may…

    Submitted 19 March, 2025; originally announced March 2025.

  3. arXiv:2501.08303

    cs.CV

    Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers

    Authors: Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis

    Abstract: Semantic future prediction is important for autonomous systems navigating dynamic environments. This paper introduces FUTURIST, a method for multimodal future semantic prediction that uses a unified and efficient visual sequence transformer architecture. Our approach incorporates a multimodal masked visual modeling objective and a novel masking mechanism designed for multimodal training. This allo…

    Submitted 14 January, 2025; originally announced January 2025.

  4. arXiv:2412.11673

    cs.CV

    DINO-Foresight: Looking into the Future with DINO

    Authors: Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis

    Abstract: Predicting future dynamics is crucial for applications like autonomous driving and robotics, where understanding the environment is key. Existing pixel-level methods are computationally expensive and often focus on irrelevant details. To address these challenges, we introduce DINO-Foresight, a novel framework that operates in the semantic feature space of pretrained Vision Foundation Models (VFMs)…

    Submitted 16 December, 2024; originally announced December 2024.

  5. Comparison Analysis of Traditional Machine Learning and Deep Learning Techniques for Data and Image Classification

    Authors: Efstathios Karypidis, Stylianos G. Mouslech, Kassiani Skoulariki, Alexandros Gazis

    Abstract: The purpose of the study is to analyse and compare the most common machine learning and deep learning techniques used for computer vision 2D object classification tasks. Firstly, we will present the theoretical background of the Bag of Visual words model and Deep Convolutional Neural Networks (DCNN). Secondly, we will implement a Bag of Visual Words model, the VGG16 CNN Architecture. Thirdly, we w…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: 9 pages, 9 figures, 4 tables. This is an Accepted Manuscript of an article published by Wseas Transactions on Mathematics on 2022, available online: https://doi.org/10.37394/23206.2022.21.19

    ACM Class: K.6.3; C.5.2; C.5.3; C.5.5; C.5.m; C.5.0

    Journal ref: WSEAS, Transactions on Mathematics, vol. 21, pp. 122-130, March, 2022