
Showing 1–50 of 137 results for author: Zafeiriou, S

  1. arXiv:2505.16724  [pdf, ps, other]

    cs.LG cs.AI cs.HC

    Advancing Brainwave Modeling with a Codebook-Based Foundation Model

    Authors: Konstantinos Barmpas, Na Lee, Yannis Panagakis, Dimitrios A. Adamos, Nikolaos Laskaris, Stefanos Zafeiriou

    Abstract: Recent advances in large-scale pre-trained Electroencephalogram (EEG) models have shown great promise, driving progress in Brain-Computer Interfaces (BCIs) and healthcare applications. However, despite their success, many existing pre-trained models have struggled to fully capture the rich information content of neural oscillations, a limitation that fundamentally constrains their performance and…

    Submitted 22 May, 2025; originally announced May 2025.

  2. arXiv:2504.14219  [pdf, ps, other]

    cs.GR cs.CV

    PRISM: A Unified Framework for Photorealistic Reconstruction and Intrinsic Scene Modeling

    Authors: Alara Dirik, Tuanfeng Wang, Duygu Ceylan, Stefanos Zafeiriou, Anna Frühstück

    Abstract: We present PRISM, a unified framework that enables multiple image generation and editing tasks in a single foundational model. Starting from a pre-trained text-to-image diffusion model, PRISM proposes an effective fine-tuning strategy to produce RGB images along with intrinsic maps (referred to as X layers) simultaneously. Unlike previous approaches, which infer intrinsic properties individually o…

    Submitted 14 May, 2025; v1 submitted 19 April, 2025; originally announced April 2025.

  3. arXiv:2504.10716  [pdf, other]

    cs.CV

    SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models

    Authors: Stathis Galanakis, Alexandros Lattas, Stylianos Moschoglou, Bernhard Kainz, Stefanos Zafeiriou

    Abstract: Despite recent progress in diffusion models, generating realistic head portraits from novel viewpoints remains a significant challenge. Most current approaches are constrained to limited angular ranges, predominantly focusing on frontal or near-frontal views. Moreover, although the recent emerging large-scale diffusion models have been proven robust in handling 3D scenes, they underperform on faci…

    Submitted 14 April, 2025; originally announced April 2025.

  4. arXiv:2501.05379  [pdf, other]

    cs.CV

    Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance

    Authors: Dimitrios Gerogiannis, Foivos Paraperas Papantoniou, Rolandos Alexandros Potamias, Alexandros Lattas, Stefanos Zafeiriou

    Abstract: Inspired by the effectiveness of 3D Gaussian Splatting (3DGS) in reconstructing detailed 3D scenes within multi-view setups and the emergence of large 2D human foundation models, we introduce Arc2Avatar, the first SDS-based method utilizing a human face foundation model as guidance with just a single image as input. To achieve that, we extend such a model for diverse-view human head generation by…

    Submitted 13 January, 2025; v1 submitted 9 January, 2025; originally announced January 2025.

    Comments: Project Page https://arc2avatar.github.io

  5. arXiv:2412.12861  [pdf, ps, other]

    cs.CV

    Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera

    Authors: Zhengdi Yu, Stefanos Zafeiriou, Tolga Birdal

    Abstract: We propose Dyn-HaMR, to the best of our knowledge, the first approach to reconstruct 4D global hand motion from monocular videos recorded by dynamic cameras in the wild. Reconstructing accurate 3D hand meshes from monocular videos is a crucial task for understanding human behaviour, with significant applications in augmented and virtual reality (AR/VR). However, existing methods for monocular hand…

    Submitted 31 May, 2025; v1 submitted 17 December, 2024; originally announced December 2024.

    Comments: Project page is available at https://dyn-hamr.github.io/

  6. arXiv:2411.17799  [pdf, other]

    cs.CV cs.CL

    Signs as Tokens: A Retrieval-Enhanced Multilingual Sign Language Generator

    Authors: Ronglai Zuo, Rolandos Alexandros Potamias, Evangelos Ververas, Jiankang Deng, Stefanos Zafeiriou

    Abstract: Sign language is a visual language that encompasses all linguistic features of natural languages and serves as the primary communication method for the deaf and hard-of-hearing communities. Although many studies have successfully adapted pretrained language models (LMs) for sign language translation (sign-to-text), the reverse task, sign language generation (text-to-sign), remains largely unexplored…

    Submitted 8 March, 2025; v1 submitted 26 November, 2024; originally announced November 2024.

  7. arXiv:2409.12259  [pdf, other]

    cs.CV

    WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild

    Authors: Rolandos Alexandros Potamias, Jinglei Zhang, Jiankang Deng, Stefanos Zafeiriou

    Abstract: In recent years, 3D hand pose estimation methods have garnered significant attention due to their extensive applications in human-computer interaction, virtual reality, and robotics. In contrast, there has been a notable gap in hand detection pipelines, posing significant challenges in constructing effective real-world multi-hand reconstruction systems. In this work, we present a data-driven pipel…

    Submitted 26 March, 2025; v1 submitted 18 September, 2024; originally announced September 2024.

    Comments: CVPR 2025, Project Page https://rolpotamias.github.io/WiLoR

  8. arXiv:2408.16762  [pdf, other]

    cs.CV cs.GR cs.LG

    UV-free Texture Generation with Denoising and Geodesic Heat Diffusions

    Authors: Simone Foti, Stefanos Zafeiriou, Tolga Birdal

    Abstract: Seams, distortions, wasted UV space, vertex-duplication, and varying resolution over the surface are the most prominent issues of the standard UV-based texturing of meshes. These issues are particularly acute when automatic UV-unwrapping techniques are used. For this reason, instead of generating textures in automatically generated UV-planes like most state-of-the-art methods, we propose to repres…

    Submitted 10 October, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

  9. arXiv:2407.03835  [pdf, other]

    cs.CV

    7th ABAW Competition: Multi-Task Learning and Compound Expression Recognition

    Authors: Dimitrios Kollias, Stefanos Zafeiriou, Irene Kotsia, Abhinav Dhall, Shreya Ghosh, Chunchang Shao, Guanyu Hu

    Abstract: This paper describes the 7th Affective Behavior Analysis in-the-wild (ABAW) Competition, which is part of the respective Workshop held in conjunction with ECCV 2024. The 7th ABAW Competition addresses novel challenges in understanding human expressions and behaviors, crucial for the development of human-centered technologies. The Competition comprises two sub-challenges: i) Multi-Task Learning…

    Submitted 8 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  10. arXiv:2405.16570  [pdf, other]

    cs.CV cs.AI

    ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling

    Authors: Francesca Babiloni, Alexandros Lattas, Jiankang Deng, Stefanos Zafeiriou

    Abstract: We propose ID-to-3D, a method to generate identity- and text-guided 3D human heads with disentangled expressions, starting from even a single casually captured in-the-wild image of a subject. The foundation of our approach is anchored in compositionality, alongside the use of task-specific 2D diffusion models as priors for optimization. First, we extend a foundational model with a lightweight expr…

    Submitted 28 May, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

    Comments: Explore our 3D results at: https://idto3d.github.io ; fixed broken url to project page

  11. arXiv:2405.10864  [pdf, other]

    cs.CV cs.LG

    Improving face generation quality and prompt following with synthetic captions

    Authors: Michail Tarasiou, Stylianos Moschoglou, Jiankang Deng, Stefanos Zafeiriou

    Abstract: Recent advancements in text-to-image generation using diffusion models have significantly improved the quality of generated images and expanded the ability to depict a wide range of objects. However, ensuring that these models adhere closely to the text prompts remains a considerable challenge. This issue is particularly pronounced when trying to generate photorealistic images of humans. Without s…

    Submitted 17 May, 2024; originally announced May 2024.

  12. arXiv:2404.19149  [pdf, other]

    cs.CV

    SAGS: Structure-Aware 3D Gaussian Splatting

    Authors: Evangelos Ververas, Rolandos Alexandros Potamias, Jifei Song, Jiankang Deng, Stefanos Zafeiriou

    Abstract: Following the advent of NeRFs, 3D Gaussian Splatting (3D-GS) has paved the way to real-time neural rendering overcoming the computational burden of volumetric methods. Following the pioneering work of 3D-GS, several methods have attempted to achieve compressible and high-fidelity performance alternatives. However, by employing a geometry-agnostic optimization scheme, these methods neglect the inhe…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 15 pages, 8 figures, 3 tables

  13. arXiv:2404.02686  [pdf, other]

    cs.CV

    Design2Cloth: 3D Cloth Generation from 2D Masks

    Authors: Jiali Zheng, Rolandos Alexandros Potamias, Stefanos Zafeiriou

    Abstract: In recent years, there has been a significant shift in the field of digital avatar research, towards modeling, animating and reconstructing clothed human representations, as a key step towards creating realistic avatars. However, current 3D cloth generation methods are garment specific or trained completely on synthetic data, hence lacking fine details and realism. In this work, we make a step tow…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024, Project page: https://jiali-zheng.github.io/Design2Cloth/

  14. arXiv:2403.19773  [pdf, other]

    cs.CV

    ShapeFusion: A 3D diffusion model for localized shape editing

    Authors: Rolandos Alexandros Potamias, Michail Tarasiou, Stylianos Ploumpis, Stefanos Zafeiriou

    Abstract: In the realm of 3D computer vision, parametric models have emerged as a ground-breaking methodology for the creation of realistic and expressive 3D avatars. Traditionally, they rely on Principal Component Analysis (PCA), given its ability to decompose data to an orthonormal space that maximally captures shape variations. However, due to the orthogonality constraints and the global nature of PCA's…

    Submitted 4 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Project Page: https://rolpotamias.github.io/Shapefusion/

  15. arXiv:2403.17213  [pdf, other]

    cs.CV

    AnimateMe: 4D Facial Expressions via Diffusion Models

    Authors: Dimitrios Gerogiannis, Foivos Paraperas Papantoniou, Rolandos Alexandros Potamias, Alexandros Lattas, Stylianos Moschoglou, Stylianos Ploumpis, Stefanos Zafeiriou

    Abstract: The field of photorealistic 3D avatar reconstruction and generation has garnered significant attention in recent years; however, animating such avatars remains challenging. Recent advances in diffusion models have notably enhanced the capabilities of generative models in 2D animation. In this work, we directly utilize these models within the 3D domain to achieve controllable and high-fidelity 4D f…

    Submitted 25 March, 2024; originally announced March 2024.

  16. arXiv:2403.11641  [pdf, other]

    cs.CV

    Arc2Face: A Foundation Model for ID-Consistent Human Faces

    Authors: Foivos Paraperas Papantoniou, Alexandros Lattas, Stylianos Moschoglou, Jiankang Deng, Bernhard Kainz, Stefanos Zafeiriou

    Abstract: This paper presents Arc2Face, an identity-conditioned face foundation model, which, given the ArcFace embedding of a person, can generate diverse photo-realistic images with a degree of face similarity unparalleled by existing models. Despite previous attempts to decode face recognition features into detailed images, we find that common high-resolution datasets (e.g. FFHQ) lack sufficient ident…

    Submitted 22 August, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: ECCV 2024 (Oral), 29 pages, 20 figures. Project page: https://arc2face.github.io/

  17. arXiv:2402.19344  [pdf, other]

    cs.CV

    The 6th Affective Behavior Analysis in-the-wild (ABAW) Competition

    Authors: Dimitrios Kollias, Panagiotis Tzirakis, Alan Cowen, Stefanos Zafeiriou, Irene Kotsia, Alice Baird, Chris Gagne, Chunchang Shao, Guanyu Hu

    Abstract: This paper describes the 6th Affective Behavior Analysis in-the-wild (ABAW) Competition, which is part of the respective Workshop held in conjunction with IEEE CVPR 2024. The 6th ABAW Competition addresses contemporary challenges in understanding human emotions and behaviors, crucial for the development of human-centered technologies. In more detail, the Competition focuses on affect related bench…

    Submitted 12 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  18. Spatio-temporal Prompting Network for Robust Video Feature Extraction

    Authors: Guanxiong Sun, Chi Wang, Zhaoyu Zhang, Jiankang Deng, Stefanos Zafeiriou, Yang Hua

    Abstract: Frame quality deterioration is one of the main challenges in the field of video understanding. To compensate for the information loss caused by deteriorated frames, recent approaches exploit transformer-based integration modules to obtain spatio-temporal information. However, these integration modules are heavy and complex. Furthermore, each integration module is specifically tailored for its targ…

    Submitted 4 February, 2024; originally announced February 2024.

    Journal ref: 2023 International Conference on Computer Vision (ICCV) 13541-13551

  19. arXiv:2401.02937  [pdf, other]

    cs.CV

    Locally Adaptive Neural 3D Morphable Models

    Authors: Michail Tarasiou, Rolandos Alexandros Potamias, Eimear O'Sullivan, Stylianos Ploumpis, Stefanos Zafeiriou

    Abstract: We present the Locally Adaptive Morphable Model (LAMM), a highly flexible Auto-Encoder (AE) framework for learning to generate and manipulate 3D meshes. We train our architecture following a simple self-supervised training scheme in which input displacements over a set of sparse control vertices are used to overwrite the encoded geometry in order to transform one training sample into another. Duri…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 10 pages, 9 figures, 2 tables

  20. arXiv:2401.01219  [pdf, ps, other]

    cs.CV

    Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond

    Authors: Dimitrios Kollias, Viktoriia Sharmanska, Stefanos Zafeiriou

    Abstract: Multi-Task Learning (MTL) is a framework in which multiple related tasks are learned jointly and benefit from a shared representation space or parameter transfer. To provide sufficient learning support, modern MTL uses annotated data with full or sufficiently large overlap across tasks, i.e., each input sample is annotated for all or most of the tasks. However, collecting such annotations is proh…

    Submitted 3 January, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: accepted at AAAI 2024. arXiv admin note: text overlap with arXiv:2105.03790

  21. arXiv:2312.04465  [pdf, other]

    cs.CV

    FitDiff: Robust monocular 3D facial shape and reflectance estimation using Diffusion Models

    Authors: Stathis Galanakis, Alexandros Lattas, Stylianos Moschoglou, Stefanos Zafeiriou

    Abstract: The remarkable progress in 3D face reconstruction has resulted in high-detail and photorealistic facial representations. Recently, Diffusion Models have revolutionized the capabilities of generative methods by surpassing the performance of GANs. In this work, we present FitDiff, a diffusion-based 3D facial avatar generative model. Leveraging diffusion principles, our model accurately generates rel…

    Submitted 1 March, 2025; v1 submitted 7 December, 2023; originally announced December 2023.

  22. arXiv:2312.02702  [pdf, other]

    cs.CV

    Neural Sign Actors: A diffusion model for 3D sign language production from text

    Authors: Vasileios Baltatzis, Rolandos Alexandros Potamias, Evangelos Ververas, Guanxiong Sun, Jiankang Deng, Stefanos Zafeiriou

    Abstract: Sign Languages (SL) serve as the primary mode of communication for the Deaf and Hard of Hearing communities. Deep learning methods for SL recognition and translation have achieved promising results. However, Sign Language Production (SLP) poses a challenge as the generated motions must be realistic and have precise semantic meaning. Most SLP methods rely on 2D data, which hinders their realism. In…

    Submitted 5 April, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: Accepted at CVPR 2024, Project page: https://baltatzisv.github.io/neural-sign-actors/

  23. arXiv:2312.00627  [pdf, other]

    cs.CV

    Rethinking the Domain Gap in Near-infrared Face Recognition

    Authors: Michail Tarasiou, Jiankang Deng, Stefanos Zafeiriou

    Abstract: Heterogeneous face recognition (HFR) involves the intricate task of matching face images across the visual domains of visible (VIS) and near-infrared (NIR). While much of the existing literature on HFR identifies the domain gap as a primary challenge and directs efforts towards bridging it at either the input or feature level, our work deviates from this trend. We observe that large neural network…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: 5 pages, 3 figures, 6 tables

  24. arXiv:2311.17968  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    Latent Alignment with Deep Set EEG Decoders

    Authors: Stylianos Bakas, Siegfried Ludwig, Dimitrios A. Adamos, Nikolaos Laskaris, Yannis Panagakis, Stefanos Zafeiriou

    Abstract: The variability in EEG signals between different individuals poses a significant challenge when implementing brain-computer interfaces (BCI). Commonly proposed solutions to this problem include deep learning models, due to their increased capacity and generalization, as well as explicit domain adaptation techniques. Here, we introduce the Latent Alignment method that won the Benchmarks for EEG Tra…

    Submitted 29 November, 2023; originally announced November 2023.

    ACM Class: I.2.6

  25. arXiv:2310.03952  [pdf, other]

    cs.CV

    ILSH: The Imperial Light-Stage Head Dataset for Human Head View Synthesis

    Authors: Jiali Zheng, Youngkyoon Jang, Athanasios Papaioannou, Christos Kampouris, Rolandos Alexandros Potamias, Foivos Paraperas Papantoniou, Efstathios Galanakis, Ales Leonardis, Stefanos Zafeiriou

    Abstract: This paper introduces the Imperial Light-Stage Head (ILSH) dataset, a novel light-stage-captured human head dataset designed to support view synthesis academic challenges for human heads. The ILSH dataset is intended to facilitate diverse approaches, such as scene-specific or generic neural rendering, multiple-view geometry, 3D vision, and computer graphics, to further advance the development of p…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: ICCV 2023 Workshop, 9 pages, 6 figures

  26. arXiv:2305.09641  [pdf, other]

    cs.CV cs.GR cs.LG

    FitMe: Deep Photorealistic 3D Morphable Model Avatars

    Authors: Alexandros Lattas, Stylianos Moschoglou, Stylianos Ploumpis, Baris Gecer, Jiankang Deng, Stefanos Zafeiriou

    Abstract: In this paper, we introduce FitMe, a facial reflectance model and a differentiable rendering optimization pipeline, that can be used to acquire high-fidelity renderable human avatars from single or multiple images. The model consists of a multi-modal style-based generator, that captures facial appearance in terms of diffuse and specular reflectance, and a PCA-based shape model. We employ a fast di…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: Accepted at CVPR 2023, project page at https://lattas.github.io/fitme , 17 pages including supplementary material

    ACM Class: I.2.10; I.3.7; I.4.1

  27. arXiv:2305.06077  [pdf, other]

    cs.CV

    Relightify: Relightable 3D Faces from a Single Image via Diffusion Models

    Authors: Foivos Paraperas Papantoniou, Alexandros Lattas, Stylianos Moschoglou, Stefanos Zafeiriou

    Abstract: Following the remarkable success of diffusion models on image generation, recent works have also demonstrated their impressive ability to address a number of inverse problems in an unsupervised way, by properly constraining the sampling process based on a conditioning input. Motivated by this, in this paper, we present the first approach to use diffusion models as a prior for highly accurate 3D fa…

    Submitted 21 August, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: ICCV 2023, 15 pages, 14 figures. Project page: https://foivospar.github.io/Relightify/

  28. arXiv:2303.01498  [pdf, ps, other]

    cs.CV cs.LG

    ABAW: Valence-Arousal Estimation, Expression Recognition, Action Unit Detection & Emotional Reaction Intensity Estimation Challenges

    Authors: Dimitrios Kollias, Panagiotis Tzirakis, Alice Baird, Alan Cowen, Stefanos Zafeiriou

    Abstract: The fifth Affective Behavior Analysis in-the-wild (ABAW) Competition is part of the respective ABAW Workshop which will be held in conjunction with IEEE Computer Vision and Pattern Recognition Conference (CVPR), 2023. The 5th ABAW Competition is a continuation of the Competitions held at ECCV 2022, IEEE CVPR 2022, ICCV 2021, IEEE FG 2020 and CVPR 2017 Conferences, and is dedicated at automatically…

    Submitted 20 March, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: arXiv admin note: text overlap with arXiv:2202.10659

  29. arXiv:2301.04944  [pdf, other]

    cs.CV cs.LG

    ViTs for SITS: Vision Transformers for Satellite Image Time Series

    Authors: Michail Tarasiou, Erik Chavez, Stefanos Zafeiriou

    Abstract: In this paper we introduce the Temporo-Spatial Vision Transformer (TSViT), a fully-attentional model for general Satellite Image Time Series (SITS) processing based on the Vision Transformer (ViT). TSViT splits a SITS record into non-overlapping patches in space and time which are tokenized and subsequently processed by a factorized temporo-spatial encoder. We argue that, in contrast to natural im…

    Submitted 14 April, 2023; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: 11 pages, 5 figures, 2 tables

  30. arXiv:2212.02997  [pdf, other]

    cs.CV

    3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from Synthetic Views

    Authors: Evangelos Ververas, Polydefkis Gkagkos, Jiankang Deng, Michail Christos Doukas, Jia Guo, Stefanos Zafeiriou

    Abstract: Developing gaze estimation models that generalize well to unseen domains and in-the-wild conditions remains a challenge with no known best solution. This is mostly due to the difficulty of acquiring ground truth data that cover the distribution of faces, head poses, and environments that exist in the real world. Most recent methods attempt to close the gap between specific source and target domain…

    Submitted 12 December, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

    Comments: 17 pages, 13 figures

  31. arXiv:2211.13994  [pdf, other]

    cs.CV

    Dynamic Neural Portraits

    Authors: Michail Christos Doukas, Stylianos Ploumpis, Stefanos Zafeiriou

    Abstract: We present Dynamic Neural Portraits, a novel approach to the problem of full-head reenactment. Our method generates photo-realistic video portraits by explicitly controlling head pose, facial expressions and eye gaze. Our proposed architecture is different from existing methods that rely on GAN-based image-to-image translation networks for transforming renderings of 3D faces into photo-realistic i…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023

  32. arXiv:2211.06408  [pdf, other]

    cs.CV

    Physically-Based Face Rendering for NIR-VIS Face Recognition

    Authors: Yunqi Miao, Alexandros Lattas, Jiankang Deng, Jungong Han, Stefanos Zafeiriou

    Abstract: Near infrared (NIR) to Visible (VIS) face matching is challenging due to the significant domain gaps as well as a lack of sufficient data for cross-modality model training. To overcome this problem, we propose a novel method for paired NIR-VIS facial image generation. Specifically, we reconstruct 3D face shape and reflectance from a large 2D facial dataset and introduce a novel method of transform…

    Submitted 11 November, 2022; originally announced November 2022.

  33. arXiv:2211.02831  [pdf, other]

    cs.CV

    A Survey of Deep Face Restoration: Denoise, Super-Resolution, Deblur, Artifact Removal

    Authors: Tao Wang, Kaihao Zhang, Xuanxi Chen, Wenhan Luo, Jiankang Deng, Tong Lu, Xiaochun Cao, Wei Liu, Hongdong Li, Stefanos Zafeiriou

    Abstract: Face Restoration (FR) aims to restore High-Quality (HQ) faces from Low-Quality (LQ) input images, which is a domain-specific image restoration problem in the low-level computer vision area. The early face restoration methods mainly use statistical priors and degradation models, which struggle to meet the requirements of real-world applications in practice. In recent years, face restoration has…

    Submitted 5 November, 2022; originally announced November 2022.

    Comments: 21 pages, 19 figures

  34. arXiv:2209.07366  [pdf, other]

    cs.CV

    3DMM-RF: Convolutional Radiance Fields for 3D Face Modeling

    Authors: Stathis Galanakis, Baris Gecer, Alexandros Lattas, Stefanos Zafeiriou

    Abstract: Facial 3D Morphable Models are a main computer vision subject with countless applications and have been highly optimized in the last two decades. The tremendous improvements of deep generative networks have created various possibilities for improving such models and have attracted wide interest. Moreover, the recent advances in neural radiance fields are revolutionising novel-view synthesis of kn…

    Submitted 15 September, 2022; originally announced September 2022.

  35. Inverse Image Frequency for Long-tailed Image Recognition

    Authors: Konstantinos Panagiotis Alexandridis, Shan Luo, Anh Nguyen, Jiankang Deng, Stefanos Zafeiriou

    Abstract: The long-tailed distribution is a common phenomenon in the real world. Extracted large scale image datasets inevitably demonstrate the long-tailed property and models trained with imbalanced data can obtain high performance for the over-represented categories, but struggle for the under-represented categories, leading to biased predictions and performance degradation. To address this challenge, we…

    Submitted 7 October, 2023; v1 submitted 11 September, 2022; originally announced September 2022.

    Journal ref: IEEE Transactions on Image Processing 2023

  36. Redesigning Multi-Scale Neural Network for Crowd Counting

    Authors: Zhipeng Du, Miaojing Shi, Jiankang Deng, Stefanos Zafeiriou

    Abstract: Perspective distortions and crowd variations make crowd counting a challenging task in computer vision. To tackle it, many previous works have used multi-scale architecture in deep neural networks (DNNs). Multi-scale branches can be either directly merged (e.g. by concatenation) or merged through the guidance of proxies (e.g. attentions) in the DNNs. Despite their prevalence, these combination met…

    Submitted 3 July, 2023; v1 submitted 4 August, 2022; originally announced August 2022.

    Comments: IEEE Transactions on Image Processing

  37. arXiv:2208.02210  [pdf, other]

    cs.CV

    Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control

    Authors: Michail Christos Doukas, Evangelos Ververas, Viktoriia Sharmanska, Stefanos Zafeiriou

    Abstract: We present Free-HeadGAN, a person-generic neural talking head synthesis system. We show that modeling faces with sparse 3D facial landmarks is sufficient for achieving state-of-the-art generative performance, without relying on strong statistical priors of the face, such as 3D Morphable Models. Apart from 3D pose and facial expressions, our method is capable of fully transferring the eye gaze, fr…

    Submitted 3 August, 2022; originally announced August 2022.

  38. Fast Multilevel Algorithms for Compressive Principle Component Pursuit

    Authors: Vahan Hovhannisyan, Yannis Panagakis, Panos Parpas, Stefanos Zafeiriou

    Abstract: Recovering a low-rank matrix from highly corrupted measurements arises in compressed sensing of structured high-dimensional signals (e.g., videos and hyperspectral images among others). Robust principal component analysis (RPCA), solved via principal component pursuit (PCP), recovers a low-rank matrix from sparse corruptions that are of unknown value and support by decomposing the observation matr…

    Submitted 27 June, 2022; originally announced June 2022.

    Journal ref: SIAM Journal on Imaging Sciences 12.1 (2019): 624-649

  39. arXiv:2205.15217  [pdf, other]

    cs.CV

    GraphWalks: Efficient Shape Agnostic Geodesic Shortest Path Estimation

    Authors: Rolandos Alexandros Potamias, Alexandros Neofytou, Kyriaki-Margarita Bintsi, Stefanos Zafeiriou

    Abstract: Geodesic paths and distances are among the most popular intrinsic properties of 3D surfaces. Traditionally, geodesic paths on discrete polygon surfaces were computed using shortest path algorithms, such as Dijkstra. However, such algorithms have two major limitations. They are non-differentiable, which limits their direct usage in learnable pipelines, and they are considerably time demanding. To add…

    Submitted 30 May, 2022; originally announced May 2022.

    Comments: CVPRw 2022

  40. arXiv:2203.14448  [pdf, other]

    cs.CV

    Decoupled Multi-task Learning with Cyclical Self-Regulation for Face Parsing

    Authors: Qingping Zheng, Jiankang Deng, Zheng Zhu, Ying Li, Stefanos Zafeiriou

    Abstract: This paper probes intrinsic factors behind typical failure cases (e.g. spatial inconsistency and boundary confusion) produced by the existing state-of-the-art method in face parsing. To tackle these problems, we propose a novel Decoupled Multi-task Learning with Cyclical Self-Regulation (DML-CSR) for face parsing. Specifically, DML-CSR designs a multi-task model which comprises face parsing, binar…

    Submitted 27 March, 2022; originally announced March 2022.

  41. arXiv:2203.09692  [pdf, other]

    cs.CV

    Facial Geometric Detail Recovery via Implicit Representation

    Authors: Xingyu Ren, Alexandros Lattas, Baris Gecer, Jiankang Deng, Chao Ma, Xiaokang Yang, Stefanos Zafeiriou

    Abstract: Learning a dense 3D model with fine-scale details from a single facial image is highly challenging and ill-posed. To address this problem, many approaches fit smooth geometries through facial prior while learning details as additional displacement maps or personalized basis. However, these techniques typically require vast datasets of paired multi-view data or 3D scans, whereas such datasets are s…

    Submitted 17 March, 2022; originally announced March 2022.

  42. arXiv:2203.06041  [pdf, other]

    cs.CV cs.LG

    Embedding Earth: Self-supervised contrastive pre-training for dense land cover classification

    Authors: Michail Tarasiou, Stefanos Zafeiriou

    Abstract: In training machine learning models for land cover semantic segmentation there is a stark contrast between the availability of satellite imagery to be used as inputs and ground truth data to enable supervised learning. While thousands of new satellite images become freely available on a daily basis, getting ground truth data is still very challenging, time consuming and costly. In this paper we pr…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: Self-supervised pre-training for semantic segmentation. Replacement to random initialization

  43. arXiv:2202.12950  [pdf, other]

    eess.SP cs.AI cs.LG

    2021 BEETL Competition: Advancing Transfer Learning for Subject Independence & Heterogenous EEG Data Sets

    Authors: Xiaoxi Wei, A. Aldo Faisal, Moritz Grosse-Wentrup, Alexandre Gramfort, Sylvain Chevallier, Vinay Jayaram, Camille Jeunet, Stylianos Bakas, Siegfried Ludwig, Konstantinos Barmpas, Mehdi Bahri, Yannis Panagakis, Nikolaos Laskaris, Dimitrios A. Adamos, Stefanos Zafeiriou, William C. Duong, Stephen M. Gordon, Vernon J. Lawhern, Maciej Śliwowski, Vincent Rouanne, Piotr Tempczyk

    Abstract: Transfer learning and meta-learning offer some of the most promising avenues to unlock the scalability of healthcare and consumer technologies driven by biosignal data. This is because current methods cannot generalise well across human subjects' data and handle learning from different heterogeneously collected data sets, thus limiting the scale of training data. On the other side, developments in…

    Submitted 14 February, 2022; originally announced February 2022.

    Comments: PrePrint of the NeurIPS2021 BEETL Competition Submitted to Proceedings of Machine Learning Research (PMLR)

  44. arXiv:2202.03267  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    Team Cogitat at NeurIPS 2021: Benchmarks for EEG Transfer Learning Competition

    Authors: Stylianos Bakas, Siegfried Ludwig, Konstantinos Barmpas, Mehdi Bahri, Yannis Panagakis, Nikolaos Laskaris, Dimitrios A. Adamos, Stefanos Zafeiriou

    Abstract: Building subject-independent deep learning models for EEG decoding faces the challenge of strong covariate-shift across different datasets, subjects and recording sessions. Our approach to address this difficulty is to explicitly align feature distributions at various layers of the deep learning model, using both simple statistical techniques as well as trainable methods with more representational…

    Submitted 1 February, 2022; originally announced February 2022.

    ACM Class: I.2.6

  45. AvatarMe++: Facial Shape and BRDF Inference with Photorealistic Rendering-Aware GANs

    Authors: Alexandros Lattas, Stylianos Moschoglou, Stylianos Ploumpis, Baris Gecer, Abhijeet Ghosh, Stefanos Zafeiriou

    Abstract: Over the last years, many face analysis tasks have accomplished astounding performance, with applications including face generation and 3D face reconstruction from a single "in-the-wild" image. Nevertheless, to the best of our knowledge, there is no method which can produce render-ready high-resolution 3D faces from "in-the-wild" images and this can be attributed to the: (a) scarcity of available…

    Submitted 11 December, 2021; originally announced December 2021.

    Comments: Project and Dataset page: (https://github.com/lattas/AvatarMe). 20 pages, including supplemental materials. Accepted for publishing at IEEE Transactions on Pattern Analysis and Machine Intelligence on 13 November 2021. Copyright 2021 IEEE. Personal use of this material is permitted

    ACM Class: I.4.1; I.3.7; I.2.10

  46. arXiv:2110.10009  [pdf, other]

    cs.LG cs.HC

    EEGminer: Discovering Interpretable Features of Brain Activity with Learnable Filters

    Authors: Siegfried Ludwig, Stylianos Bakas, Dimitrios A. Adamos, Nikolaos Laskaris, Yannis Panagakis, Stefanos Zafeiriou

    Abstract: Patterns of brain activity are associated with different brain processes and can be used to identify different brain states and make behavioral predictions. However, the relevant features are not readily apparent and accessible. To mine informative latent representations from multichannel recordings of ongoing EEG activity, we propose a novel differentiable decoding pipeline consisting of learnabl…

    Submitted 2 February, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: 14 pages, 8 figures

    ACM Class: I.2.6

  47. arXiv:2110.05031  [pdf, other]

    cs.CV

    EDFace-Celeb-1M: Benchmarking Face Hallucination with a Million-scale Dataset

    Authors: Kaihao Zhang, Dongxu Li, Wenhan Luo, Jingyu Liu, Jiankang Deng, Wei Liu, Stefanos Zafeiriou

    Abstract: Recent deep face hallucination methods show stunning performance in super-resolving severely degraded facial images, even surpassing human ability. However, these algorithms are mainly evaluated on non-public synthetic datasets. It is thus unclear how these algorithms perform on public face hallucination datasets. Meanwhile, most of the existing datasets do not well consider the distribution of ra…

    Submitted 8 June, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

  48. arXiv:2109.14982  [pdf, other]

    cs.CV

    Revisiting Point Cloud Simplification: A Learnable Feature Preserving Approach

    Authors: Rolandos Alexandros Potamias, Giorgos Bouritsas, Stefanos Zafeiriou

    Abstract: The recent advances in 3D sensing technology have made possible the capture of point clouds in significantly high resolution. However, increased detail usually comes at the expense of high storage, as well as computational costs in terms of processing and visualization operations. Mesh and Point Cloud simplification methods aim to reduce the complexity of 3D models while retaining visual quality a…

    Submitted 30 September, 2021; originally announced September 2021.

  49. arXiv:2108.08191  [pdf, other]

    cs.CV

    Masked Face Recognition Challenge: The InsightFace Track Report

    Authors: Jiankang Deng, Jia Guo, Xiang An, Zheng Zhu, Stefanos Zafeiriou

    Abstract: During the COVID-19 coronavirus epidemic, almost everyone wears a facial mask, which poses a huge challenge to deep face recognition. In this workshop, we organize Masked Face Recognition (MFR) challenge and focus on bench-marking deep face recognition methods under the existence of facial masks. In the MFR challenge, there are two main tracks: the InsightFace track and the WebFace260M track. For…

    Submitted 18 August, 2021; originally announced August 2021.

    Comments: The WebFace260M Track of ICCV-21 MFR Challenge is still open in https://github.com/deepinsight/insightface/tree/master/challenges/iccv21-mfr

  50. Tensor Methods in Computer Vision and Deep Learning

    Authors: Yannis Panagakis, Jean Kossaifi, Grigorios G. Chrysos, James Oldfield, Mihalis A. Nicolaou, Anima Anandkumar, Stefanos Zafeiriou

    Abstract: Tensors, or multidimensional arrays, are data structures that can naturally represent visual data of multiple dimensions. Inherently able to efficiently capture structured, latent semantic spaces and high-order interactions, tensors have a long history of applications in a wide span of computer vision problems. With the advent of the deep learning paradigm shift in computer vision, tensors have be…

    Submitted 7 July, 2021; originally announced July 2021.

    Comments: Proceedings of the IEEE (2021)