Showing 1–12 of 12 results for author: Koutras, P

  1. arXiv:2406.10180

    cs.CV

    MeshPose: Unifying DensePose and 3D Body Mesh reconstruction

    Authors: Eric-Tuan Lê, Antonis Kakolyris, Petros Koutras, Himmy Tam, Efstratios Skordos, George Papandreou, Rıza Alp Güler, Iasonas Kokkinos

    Abstract: DensePose provides a pixel-accurate association of images with 3D mesh coordinates, but does not provide a 3D mesh, while Human Mesh Reconstruction (HMR) systems have high 2D reprojection error, as measured by DensePose localization metrics. In this work we introduce MeshPose to jointly tackle DensePose and HMR. For this we first introduce new losses that allow us to use weak DensePose supervision…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

    MSC Class: 68; ACM Class: I.2.10

    Journal ref: CVPR 2024

  2. arXiv:2305.11729

    cs.CV

    ViDaS Video Depth-aware Saliency Network

    Authors: Ioanna Diamanti, Antigoni Tsiami, Petros Koutras, Petros Maragos

    Abstract: We introduce ViDaS, a two-stream, fully convolutional Video, Depth-Aware Saliency network to address the problem of attention modeling "in-the-wild", via saliency prediction in videos. Contrary to existing visual saliency approaches using only RGB frames as input, our network employs also depth as an additional modality. The network consists of two visual streams, one for the RGB frames, and one…

    Submitted 19 May, 2023; originally announced May 2023.

  3. arXiv:2008.12818

    cs.RO cs.CV cs.LG

    ChildBot: Multi-Robot Perception and Interaction with Children

    Authors: Niki Efthymiou, Panagiotis P. Filntisis, Petros Koutras, Antigoni Tsiami, Jack Hadfield, Gerasimos Potamianos, Petros Maragos

    Abstract: In this paper we present an integrated robotic system capable of participating in and performing a wide range of educational and entertainment tasks, in collaboration with one or more children. The system, called ChildBot, features multimodal perception modules and multiple robotic agents that monitor the interaction environment, and can robustly coordinate complex Child-Robot Interaction use-case…

    Submitted 28 August, 2020; originally announced August 2020.

    Comments: 19 pages, 10 figures

    ACM Class: I.4; I.5

  4. arXiv:2004.10335

    cs.CV

    How to track your dragon: A Multi-Attentional Framework for real-time RGB-D 6-DOF Object Pose Tracking

    Authors: Isidoros Marougkas, Petros Koutras, Nikos Kardaris, Georgios Retsinas, Georgia Chalvatzaki, Petros Maragos

    Abstract: We present a novel multi-attentional convolutional architecture to tackle the problem of real-time RGB-D 6D object pose tracking of single, known objects. Such a problem poses multiple challenges originating both from the objects' nature and their interaction with their environment, which previous approaches have failed to fully address. The proposed framework encapsulates methods for background c…

    Submitted 15 September, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

    Comments: 14 pages, accepted at the 6th Workshop on Recovering 6D Object Pose of the ECCV 2020

  5. arXiv:2001.03063

    cs.CV

    STAViS: Spatio-Temporal AudioVisual Saliency Network

    Authors: Antigoni Tsiami, Petros Koutras, Petros Maragos

    Abstract: We introduce STAViS, a spatio-temporal audiovisual saliency network that combines spatio-temporal visual and auditory information in order to efficiently address the problem of saliency estimation in videos. Our approach employs a single network that combines visual saliency and auditory features and learns to appropriately localize sound sources and to fuse the two saliencies in order to obtain a…

    Submitted 14 June, 2020; v1 submitted 9 January, 2020; originally announced January 2020.

    Comments: CVPR 2020. Project page: https://github.com/atsiami/STAViS

  6. arXiv:1902.05829

    cs.CV

    Deeply Supervised Multimodal Attentional Translation Embeddings for Visual Relationship Detection

    Authors: Nikolaos Gkanatsios, Vassilis Pitsikalis, Petros Koutras, Athanasia Zlatintsi, Petros Maragos

    Abstract: Detecting visual relationships, i.e. <Subject, Predicate, Object> triplets, is a challenging Scene Understanding task approached in the past via linguistic priors or spatial information in a single feature branch. We introduce a new deeply supervised two-branch architecture, the Multimodal Attentional Translation Embeddings, where the visual features of each branch are driven by a multimodal atten…

    Submitted 15 February, 2019; originally announced February 2019.

  7. Fusing Body Posture with Facial Expressions for Joint Recognition of Affect in Child-Robot Interaction

    Authors: Panagiotis P. Filntisis, Niki Efthymiou, Petros Koutras, Gerasimos Potamianos, Petros Maragos

    Abstract: In this paper we address the problem of multi-cue affect recognition in challenging scenarios such as child-robot interaction. Towards this goal we propose a method for automatic recognition of affect that leverages body expressions alongside facial ones, as opposed to traditional methods that typically focus only on the latter. Our deep-learning based method uses hierarchical multi-label annotati…

    Submitted 5 September, 2019; v1 submitted 7 January, 2019; originally announced January 2019.

    Comments: To be presented in IROS 2019

    Journal ref: IEEE Robotics and Automation Letters, 4(4), 4011-4018, 2019

  8. arXiv:1812.00722

    cs.CV

    SUSiNet: See, Understand and Summarize it

    Authors: Petros Koutras, Petros Maragos

    Abstract: In this work we propose a multi-task spatio-temporal network, called SUSiNet, that can jointly tackle the spatio-temporal problems of saliency estimation, action recognition and video summarization. Our approach employs a single network that is jointly end-to-end trained for all tasks with multiple and diverse datasets related to the exploring tasks. The proposed network uses a unified architectur…

    Submitted 13 April, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

    Comments: CVPR Workshops 2019 (Mutual benefits of cognitive and computer vision)

  9. arXiv:1812.00253

    cs.RO cs.CV cs.LG

    A Deep Learning Approach for Multi-View Engagement Estimation of Children in a Child-Robot Joint Attention task

    Authors: Jack Hadfield, Georgia Chalvatzaki, Petros Koutras, Mehdi Khamassi, Costas S. Tzafestas, Petros Maragos

    Abstract: In this work we tackle the problem of child engagement estimation while children freely interact with a robot in their room. We propose a deep-based multi-view solution that takes advantage of recent developments in human pose detection. We extract the child's pose from different RGB-D cameras placed elegantly in the room, fuse the results and feed them to a deep neural network trained for classif…

    Submitted 1 December, 2018; originally announced December 2018.

    Comments: 7 pages, 6 figures

  10. arXiv:1812.00252

    cs.RO cs.CV cs.LG

    LSTM-based Network for Human Gait Stability Prediction in an Intelligent Robotic Rollator

    Authors: Georgia Chalvatzaki, Petros Koutras, Jack Hadfield, Xanthi S. Papageorgiou, Costas S. Tzafestas, Petros Maragos

    Abstract: In this work, we present a novel framework for on-line human gait stability prediction of the elderly users of an intelligent robotic rollator using Long Short Term Memory (LSTM) networks, fusing multimodal RGB-D and Laser Range Finder (LRF) data from non-wearable sensors. A Deep Learning (DL) based approach is used for the upper body pose estimation. The detected pose is used for estimating the b…

    Submitted 5 March, 2019; v1 submitted 1 December, 2018; originally announced December 2018.

    Comments: 8 pages, 4 figures accepted to ICRA 2019

  11. Multimodal Visual Concept Learning with Weakly Supervised Techniques

    Authors: Giorgos Bouritsas, Petros Koutras, Athanasia Zlatintsi, Petros Maragos

    Abstract: Despite the availability of a huge amount of video data accompanied by descriptive texts, it is not always easy to exploit the information contained in natural language in order to automatically recognize video concepts. Towards this goal, in this paper we use textual cues as means of supervision, introducing two weakly supervised techniques that extend the Multiple Instance Learning (MIL) framewo…

    Submitted 4 April, 2018; v1 submitted 3 December, 2017; originally announced December 2017.

    Comments: CVPR 2018

    Journal ref: Proc. IEEE/CVF Conf. Comp. Vis. Patt. Rec. (CVPR) pp. 4914 - 4923 (2018)

  12. arXiv:1711.01775

    cs.MM cs.HC cs.RO

    Multimodal Signal Processing and Learning Aspects of Human-Robot Interaction for an Assistive Bathing Robot

    Authors: A. Zlatintsi, I. Rodomagoulakis, P. Koutras, A. C. Dometios, V. Pitsikalis, C. S. Tzafestas, P. Maragos

    Abstract: We explore new aspects of assistive living on smart human-robot interaction (HRI) that involve automatic recognition and online validation of speech and gestures in a natural interface, providing social features for HRI. We introduce a whole framework and resources of a real-life scenario for elderly subjects supported by an assistive bathing robot, addressing health and hygiene care issues. We co…

    Submitted 6 November, 2017; originally announced November 2017.