
Showing 1–48 of 48 results for author: Giancola, S

  1. arXiv:2505.19175  [pdf, other]

    cs.CV

    Triangle Splatting for Real-Time Radiance Field Rendering

    Authors: Jan Held, Renaud Vandeghen, Adrien Deliege, Abdullah Hamdi, Silvio Giancola, Anthony Cioppa, Andrea Vedaldi, Bernard Ghanem, Andrea Tagliasacchi, Marc Van Droogenbroeck

    Abstract: The field of computer graphics was revolutionized by models such as Neural Radiance Fields and 3D Gaussian Splatting, displacing triangles as the dominant representation for photogrammetry. In this paper, we argue for a triangle comeback. We develop a differentiable renderer that directly optimizes triangles via end-to-end gradients. We achieve this by rendering each triangle as differentiable spl…

    Submitted 25 May, 2025; originally announced May 2025.

    Comments: 18 pages, 13 figures, 10 tables

  2. arXiv:2504.12021  [pdf, other]

    cs.CV

    Action Anticipation from SoccerNet Football Video Broadcasts

    Authors: Mohamad Dalal, Artur Xarles, Anthony Cioppa, Silvio Giancola, Marc Van Droogenbroeck, Bernard Ghanem, Albert Clapés, Sergio Escalera, Thomas B. Moeslund

    Abstract: Artificial intelligence has revolutionized the way we analyze sports videos, whether to understand the actions of games in long untrimmed videos or to anticipate the player's motion in future frames. Despite these efforts, little attention has been given to anticipating game actions before they occur. In this work, we introduce the task of action anticipation for football broadcast videos, which c…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: 15 pages, 14 figures. To be published in the CVSports CVPR workshop

    ACM Class: I.2.10; I.4.8

  3. arXiv:2502.20361  [pdf, other]

    cs.CV

    OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection

    Authors: Shuming Liu, Chen Zhao, Fatimah Zohra, Mattia Soldan, Alejandro Pardo, Mengmeng Xu, Lama Alssum, Merey Ramazanova, Juan León Alcázar, Anthony Cioppa, Silvio Giancola, Carlos Hinojosa, Bernard Ghanem

    Abstract: Temporal action detection (TAD) is a fundamental video understanding task that aims to identify human actions and localize their temporal boundaries in videos. Although this field has achieved remarkable progress in recent years, further progress and real-world applications are impeded by the absence of a standardized framework. Currently, different methods are compared under different implementat…

    Submitted 27 February, 2025; originally announced February 2025.

  4. arXiv:2411.14974  [pdf, other]

    cs.CV

    3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes

    Authors: Jan Held, Renaud Vandeghen, Abdullah Hamdi, Adrien Deliege, Anthony Cioppa, Silvio Giancola, Andrea Vedaldi, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: Recent advances in radiance field reconstruction, such as 3D Gaussian Splatting (3DGS), have achieved high-quality novel view synthesis and fast rendering by representing scenes with compositions of Gaussian primitives. However, 3D Gaussians present several limitations for scene reconstruction. Accurately capturing hard edges is challenging without significantly increasing the number of Gaussians,…

    Submitted 25 May, 2025; v1 submitted 22 November, 2024; originally announced November 2024.

    Comments: Accepted at CVPR 2025 as Highlight. 13 pages, 13 figures, 10 tables

  5. arXiv:2410.01304  [pdf, other]

    cs.CV

    Deep learning for action spotting in association football videos

    Authors: Silvio Giancola, Anthony Cioppa, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: The task of action spotting consists in both identifying actions and precisely localizing them in time with a single timestamp in long, untrimmed video streams. Automatically extracting those actions is crucial for many sports applications, including sports analytics to produce extended statistics on game actions, coaching to provide support to video analysts, or fan engagement to automatically ov…

    Submitted 2 October, 2024; originally announced October 2024.

    Comments: 31 pages, 2 figures, 5 tables

  6. arXiv:2409.10587  [pdf, other]

    cs.CV

    SoccerNet 2024 Challenges Results

    Authors: Anthony Cioppa, Silvio Giancola, Vladimir Somers, Victor Joos, Floriane Magera, Jan Held, Seyed Abolfazl Ghasemzadeh, Xin Zhou, Karolina Seweryn, Mateusz Kowalczyk, Zuzanna Mróz, Szymon Łukasik, Michał Hałoń, Hassan Mkhallati, Adrien Deliège, Carlos Hinojosa, Karen Sanchez, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Adam Gorski, et al. (59 additional authors not shown)

    Abstract: The SoccerNet 2024 challenges represent the fourth annual video understanding challenges organized by the SoccerNet team. These challenges aim to advance research across multiple themes in football, including broadcast video understanding, field understanding, and player understanding. This year, the challenges encompass four vision-based tasks. (1) Ball Action Spotting, focusing on precisely loca…

    Submitted 16 September, 2024; originally announced September 2024.

    Comments: 7 pages, 1 figure

  7. arXiv:2408.10739  [pdf, other]

    cs.CV

    TrackNeRF: Bundle Adjusting NeRF from Sparse and Noisy Views via Feature Tracks

    Authors: Jinjie Mai, Wenxuan Zhu, Sara Rojas, Jesus Zarzar, Abdullah Hamdi, Guocheng Qian, Bing Li, Silvio Giancola, Bernard Ghanem

    Abstract: Neural radiance fields (NeRFs) generally require many images with accurate poses for accurate novel view synthesis, which does not reflect realistic setups where views can be sparse and poses can be noisy. Previous solutions for learning NeRFs with sparse views and noisy poses only consider local geometry consistency with pairs of views. Closely following bundle adjustment in Structure-fr…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: ECCV 2024 (supplemental pages included)

  8. arXiv:2407.12483  [pdf, other]

    cs.CV

    Towards AI-Powered Video Assistant Referee System (VARS) for Association Football

    Authors: Jan Held, Anthony Cioppa, Silvio Giancola, Abdullah Hamdi, Christel Devue, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: Over the past decade, the technology used by referees in football has improved substantially, enhancing the fairness and accuracy of decisions. This progress has culminated in the implementation of the Video Assistant Referee (VAR), an innovation that enables backstage referees to review incidents on the pitch from multiple points of view. However, the VAR is currently limited to professional leag…

    Submitted 18 July, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

    Comments: The paper is subject to the peer review process of Sports Engineering

  9. arXiv:2407.08023  [pdf, other]

    cs.CV

    Hybrid Structure-from-Motion and Camera Relocalization for Enhanced Egocentric Localization

    Authors: Jinjie Mai, Abdullah Hamdi, Silvio Giancola, Chen Zhao, Bernard Ghanem

    Abstract: We built our pipeline EgoLoc-v1, mainly inspired by EgoLoc. We propose a model ensemble strategy to improve the camera pose estimation part of the VQ3D task, which has been proven to be essential in previous work. The core idea is not only to do SfM for egocentric videos but also to do 2D-3D matching between existing 3D scans and 2D video frames. In this way, we have a hybrid SfM and camera reloca…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 1st place winner of the 2024 Ego4D-Ego-Exo4D Challenge in VQ3D

  10. arXiv:2407.02370  [pdf, other]

    cs.CV

    Investigating Event-Based Cameras for Video Frame Interpolation in Sports

    Authors: Antoine Deckyvere, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: Slow-motion replays provide a thrilling perspective on pivotal moments within sports games, offering a fresh and captivating visual experience. However, capturing slow-motion footage typically demands high-tech, expensive cameras and infrastructures. Deep learning Video Frame Interpolation (VFI) techniques have emerged as a promising avenue, capable of generating high-speed footage from regular ca…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  11. arXiv:2407.01265  [pdf, other]

    cs.CV

    OSL-ActionSpotting: A Unified Library for Action Spotting in Sports Videos

    Authors: Yassine Benzakour, Bruno Cabado, Silvio Giancola, Anthony Cioppa, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: Action spotting is crucial in sports analytics as it enables the precise identification and categorization of pivotal moments in sports matches, providing insights that are essential for performance analysis and tactical decision-making. The fragmentation of existing methodologies, however, impedes the progression of sports analytics, necessitating a unified codebase to support the development and…

    Submitted 1 July, 2024; originally announced July 2024.

  12. arXiv:2405.16574  [pdf, other]

    math.OC

    Local Curvature Descent: Squeezing More Curvature out of Standard and Polyak Gradient Descent

    Authors: Peter Richtárik, Simone Maria Giancola, Dymitr Lubczyk, Robin Yadav

    Abstract: We contribute to the growing body of knowledge on more powerful and adaptive stepsizes for convex optimization, empowered by local curvature information. We do not go the route of fully-fledged second-order methods which require the expensive computation of the Hessian. Instead, our key observation is that, for some problems (e.g., when minimizing the sum of squares of absolutely convex functions)…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 53 pages, 9 figures, 3 algorithms

  13. arXiv:2405.07354  [pdf, other]

    cs.SD cs.IR cs.LG cs.MM eess.AS

    SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset

    Authors: Sushant Gautam, Mehdi Houshmand Sarkhoosh, Jan Held, Cise Midoglu, Anthony Cioppa, Silvio Giancola, Vajira Thambawita, Michael A. Riegler, Pål Halvorsen, Mubarak Shah

    Abstract: The application of Automatic Speech Recognition (ASR) technology in soccer offers numerous opportunities for sports analytics. Specifically, extracting audio commentaries with ASR provides valuable insights into the events of the game, and opens the door to several downstream applications such as automatic highlight generation. This paper presents SoccerNet-Echoes, an augmentation of the SoccerNet…

    Submitted 12 May, 2024; originally announced May 2024.

    ACM Class: I.2.7; I.7

  14. arXiv:2404.11335  [pdf, other]

    cs.CV cs.AI cs.LG

    SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap

    Authors: Vladimir Somers, Victor Joos, Anthony Cioppa, Silvio Giancola, Seyed Abolfazl Ghasemzadeh, Floriane Magera, Baptiste Standaert, Amir Mohammad Mansourian, Xin Zhou, Shohreh Kasaei, Bernard Ghanem, Alexandre Alahi, Marc Van Droogenbroeck, Christophe De Vleeschouwer

    Abstract: Tracking and identifying athletes on the pitch holds a central role in collecting essential insights from the game, such as estimating the total distance covered by players or understanding team tactics. This tracking and identification process is crucial for reconstructing the game state, defined by the athletes' positions and identities on a 2D top-view of the pitch, (i.e. a minimap). However, r…

    Submitted 17 April, 2024; originally announced April 2024.

    Journal ref: 2024 IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Work. (CVPRW)

  15. X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model

    Authors: Jan Held, Hani Itani, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: The rapid advancement of artificial intelligence has led to significant improvements in automated decision-making. However, the increased performance of models often comes at the cost of explainability and transparency of their decision-making processes. In this paper, we investigate the capabilities of large language models to explain decisions, using football refereeing as a testing ground, give…

    Submitted 7 April, 2024; originally announced April 2024.

  16. Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders

    Authors: Alexandre Eymaël, Renaud Vandeghen, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: Self-supervised pre-training of image encoders is omnipresent in the literature, particularly following the introduction of Masked autoencoders (MAE). Current efforts attempt to learn object-centric representations from motion in videos. In particular, SiamMAE recently introduced a Siamese network, training a shared-weight encoder from two frames of a video with a high asymmetric masking ratio (95…

    Submitted 18 July, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: 19 pages, 7 figures, 5 tables, 3 pages of supplementary material. Paper accepted at ECCV 2024

    ACM Class: I.2.6; I.2.10

  17. arXiv:2312.10639  [pdf, other]

    cs.CV cs.AI physics.optics

    Artificial intelligence optical hardware empowers high-resolution hyperspectral video understanding at 1.2 Tb/s

    Authors: Maksim Makarenko, Qizhou Wang, Arturo Burguete-Lopez, Silvio Giancola, Bernard Ghanem, Luca Passone, Andrea Fratalocchi

    Abstract: Foundation models, exemplified by GPT technology, are discovering new horizons in artificial intelligence by executing tasks beyond their designers' expectations. While the present generation provides fundamental advances in understanding language and images, the next frontier is video comprehension. Progress in this area must overcome the 1 Tb/s data rate demanded to grasp real-time multidimensio…

    Submitted 17 December, 2023; originally announced December 2023.

  18. SoccerNet 2023 Challenges Results

    Authors: Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, et al. (77 additional authors not shown)

    Abstract: The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, fo…

    Submitted 12 September, 2023; originally announced September 2023.

  19. arXiv:2309.05490  [pdf, other]

    cs.CV cs.AI cs.LG

    Learning Semantic Segmentation with Query Points Supervision on Aerial Images

    Authors: Santiago Rivier, Carlos Hinojosa, Silvio Giancola, Bernard Ghanem

    Abstract: Semantic segmentation is crucial in remote sensing, where high-resolution satellite images are segmented into meaningful regions. Recent advancements in deep learning have significantly improved satellite image segmentation. However, most of these methods are typically trained in fully supervised settings that require high-quality pixel-level annotations, which are expensive and time-consuming to…

    Submitted 5 August, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: Paper Accepted at ICIP 2024 (Oral Presentation)

  20. VARS: Video Assistant Referee System for Automated Soccer Decision Making from Multiple Views

    Authors: Jan Held, Anthony Cioppa, Silvio Giancola, Abdullah Hamdi, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: The Video Assistant Referee (VAR) has revolutionized association football, enabling referees to review incidents on the pitch, make informed decisions, and ensure fairness. However, due to the lack of referees in many countries and the high cost of the VAR infrastructure, only professional leagues can benefit from it. In this paper, we propose a Video Assistant Referee System (VARS) that can autom…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: Accepted at CVSports'23

  21. SoccerNet-Caption: Dense Video Captioning for Soccer Broadcasts Commentaries

    Authors: Hassan Mkhallati, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: Soccer is more than just a game - it is a passion that transcends borders and unites people worldwide. From the roar of the crowds to the excitement of the commentators, every moment of a soccer match is a thrill. Yet, with so many games happening simultaneously, fans cannot watch them all live. Notifications for main actions can help, but lack the engagement of live commentary, leaving fans feeli…

    Submitted 10 April, 2023; originally announced April 2023.

  22. Towards Active Learning for Action Spotting in Association Football Videos

    Authors: Silvio Giancola, Anthony Cioppa, Julia Georgieva, Johsan Billingham, Andreas Serner, Kerry Peek, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: Association football is a complex and dynamic sport, with numerous actions occurring simultaneously in each game. Analyzing football videos is challenging and requires identifying subtle and diverse spatio-temporal patterns. Despite recent advances in computer vision, current algorithms still face significant challenges when learning from limited annotated data, lowering their performance in detec…

    Submitted 9 April, 2023; originally announced April 2023.

    Comments: Accepted at CVSports'23

  23. arXiv:2212.13462  [pdf, other]

    cs.CV cs.AI cs.GR

    MVTN: Learning Multi-View Transformations for 3D Understanding

    Authors: Abdullah Hamdi, Faisal AlZahrani, Silvio Giancola, Bernard Ghanem

    Abstract: Multi-view projection techniques have shown themselves to be highly effective in achieving top-performing results in the recognition of 3D shapes. These methods involve learning how to combine information from multiple view-points. However, the camera view-points from which these views are obtained are often fixed for all shapes. To overcome the static nature of current multi-view techniques, we p…

    Submitted 6 June, 2024; v1 submitted 27 December, 2022; originally announced December 2022.

    Comments: under review journal extension for the ICCV 2021 paper arXiv:2011.13244

  24. arXiv:2212.06969  [pdf, other]

    cs.CV

    EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries

    Authors: Jinjie Mai, Abdullah Hamdi, Silvio Giancola, Chen Zhao, Bernard Ghanem

    Abstract: With the recent advances in video and 3D understanding, novel 4D spatio-temporal methods fusing both concepts have emerged. Towards this direction, the Ego4D Episodic Memory Benchmark proposed a task for Visual Queries with 3D Localization (VQ3D). Given an egocentric video clip and an image crop depicting a query object, the goal is to localize the 3D position of the center of that query object wi…

    Submitted 28 August, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: ICCV 2023

  25. arXiv:2211.11215  [pdf, other]

    cs.CV

    SegNeRF: 3D Part Segmentation with Neural Radiance Fields

    Authors: Jesus Zarzar, Sara Rojas, Silvio Giancola, Bernard Ghanem

    Abstract: Recent advances in Neural Radiance Fields (NeRF) boast impressive performances for generative tasks such as novel view synthesis and 3D reconstruction. Methods based on neural radiance fields are able to represent the 3D world implicitly by relying exclusively on posed images. Yet, they have seldom been explored in the realm of discriminative tasks such as 3D part segmentation. In this work, we at…

    Submitted 22 November, 2022; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Fixed abstract typo

  26. arXiv:2211.10284  [pdf, other]

    cs.CV

    Estimating more camera poses for ego-centric videos is essential for VQ3D

    Authors: Jinjie Mai, Chen Zhao, Abdullah Hamdi, Silvio Giancola, Bernard Ghanem

    Abstract: Visual queries 3D localization (VQ3D) is a task in the Ego4D Episodic Memory Benchmark. Given an egocentric video, the goal is to answer queries of the form "Where did I last see object X?", where the query object X is specified as a static image, and the answer should be a 3D displacement vector pointing to object X. However, current techniques use naive ways to estimate the camera poses of video…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: Second International Ego4D Workshop at ECCV 2022

  27. SoccerNet 2022 Challenges Results

    Authors: Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, et al. (69 additional authors not shown)

    Abstract: The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team. In 2022, the challenges were composed of 6 vision-based tasks: (1) action spotting, focusing on retrieving action timestamps in long untrimmed videos, (2) replay grounding, focusing on retrieving the live moment of an action shown in a replay, (3) pitch localization, focusing on det…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: Accepted at ACM MMSports 2022

  28. SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos

    Authors: Anthony Cioppa, Silvio Giancola, Adrien Deliege, Le Kang, Xin Zhou, Zhiyu Cheng, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: Tracking objects in soccer videos is extremely important to gather both player and team statistics, whether it is to estimate the total distance run, the ball possession or the team formation. Video processing can help automating the extraction of those information, without the need of any invasive sensor, hence applicable to any team on any stadium. Yet, the availability of datasets to train lear…

    Submitted 20 April, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

    Comments: Paper accepted for the CVsports workshop at CVPR2022. This document contains 8 pages + references

  29. arXiv:2204.05687  [pdf, other]

    cs.CV

    3DeformRS: Certifying Spatial Deformations on Point Clouds

    Authors: Gabriel Pérez S., Juan C. Pérez, Motasem Alfarra, Silvio Giancola, Bernard Ghanem

    Abstract: 3D computer vision models are commonly used in security-critical applications such as autonomous driving and surgical robotics. Emerging concerns over the robustness of these models against real-world deformations must be addressed practically and reliably. In this work, we propose 3DeformRS, a method to certify the robustness of point cloud Deep Neural Networks (DNNs) against real-world deformati…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: Accepted at CVPR 2022

  30. arXiv:2204.02084  [pdf, other]

    cs.CV eess.IV

    Real-time Hyperspectral Imaging in Hardware via Trained Metasurface Encoders

    Authors: Maksim Makarenko, Arturo Burguete-Lopez, Qizhou Wang, Fedor Getman, Silvio Giancola, Bernard Ghanem, Andrea Fratalocchi

    Abstract: Hyperspectral imaging has attracted significant attention to identify spectral signatures for image classification and automated pattern recognition in computer vision. State-of-the-art implementations of snapshot hyperspectral imaging rely on bulky, non-integrated, and expensive optical elements, including lenses, spectrometers, and filters. These macroscopic components do not allow fast data pro…

    Submitted 5 April, 2022; originally announced April 2022.

  31. arXiv:2112.00431  [pdf, other]

    cs.CV cs.AI

    MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions

    Authors: Mattia Soldan, Alejandro Pardo, Juan León Alcázar, Fabian Caba Heilbron, Chen Zhao, Silvio Giancola, Bernard Ghanem

    Abstract: The recent and increasing interest in video-language research has driven the development of large-scale datasets that enable data-intensive machine learning techniques. In comparison, limited effort has been made at assessing the fitness of these datasets for the video-language grounding task. Recent works have begun to discover significant limitations in these datasets, suggesting that state-of-t…

    Submitted 28 March, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: 12 Pages, 6 Figures, 7 Tables

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition CVPR 2022

  32. arXiv:2111.15363  [pdf, other]

    cs.CV cs.LG

    Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding

    Authors: Abdullah Hamdi, Silvio Giancola, Bernard Ghanem

    Abstract: Multi-view projection methods have demonstrated promising performance on 3D understanding tasks like 3D classification and segmentation. However, it remains unclear how to combine such multi-view methods with the widely available 3D point clouds. Previous methods use unlearned heuristics to combine features at the point level. To this end, we introduce the concept of the multi-view point cloud (Vo…

    Submitted 25 January, 2023; v1 submitted 30 November, 2021; originally announced November 2021.

    Comments: Accepted at ICLR 2023. The code is available at https://github.com/ajhamdi/vointcloud

    MSC Class: 68T45

  33. arXiv:2105.04447  [pdf, other]

    cs.CV cs.AI

    SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation

    Authors: Bing Li, Cheng Zheng, Silvio Giancola, Bernard Ghanem

    Abstract: We propose a novel scene flow estimation approach to capture and infer 3D motions from point clouds. Estimating 3D motions for point clouds is challenging, since a point cloud is unordered and its density is significantly non-uniform. Such unstructured data poses difficulties in matching corresponding points between point clouds, leading to inaccurate flow estimation. We propose a novel architectu…

    Submitted 9 March, 2022; v1 submitted 10 May, 2021; originally announced May 2021.

    Comments: Accepted to the 36th AAAI Conference on Artificial Intelligence (AAAI 2022)

  34. arXiv:2104.09333  [pdf, other]

    cs.CV

    Camera Calibration and Player Localization in SoccerNet-v2 and Investigation of their Representations for Action Spotting

    Authors: Anthony Cioppa, Adrien Deliège, Floriane Magera, Silvio Giancola, Olivier Barnich, Bernard Ghanem, Marc Van Droogenbroeck

    Abstract: Soccer broadcast video understanding has been drawing a lot of attention in recent years within data scientists and industrial companies. This is mainly due to the lucrative potential unlocked by effective deep learning techniques developed in the field of computer vision. In this work, we focus on the topic of camera calibration and on its current limitations for the scientific community. More pr…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: Paper accepted at the CVsports workshop at CVPR2021

  35. arXiv:2104.06779  [pdf, other]

    cs.CV

    Temporally-Aware Feature Pooling for Action Spotting in Soccer Broadcasts

    Authors: Silvio Giancola, Bernard Ghanem

    Abstract: Toward the goal of automatic production for sports broadcasts, a paramount task consists in understanding the high-level semantic information of the game in play. For instance, recognizing and localizing the main actions of the game would allow producers to adapt and automatize the broadcast production, focusing on the important details of the game and maximizing the spectator engagement. In this…

    Submitted 14 April, 2021; originally announced April 2021.

    Comments: 8 pages, Camera-Ready for CVSports 2021 (CVPRW)

  36. arXiv:2012.14929  [pdf, other]

    cs.CV

    SALA: Soft Assignment Local Aggregation for Parameter Efficient 3D Semantic Segmentation

    Authors: Hani Itani, Silvio Giancola, Ali Thabet, Bernard Ghanem

    Abstract: In this work, we focus on designing a point local aggregation function that yields parameter efficient networks for 3D point cloud semantic segmentation. We explore the idea of using learnable neighbor-to-grid soft assignment in grid-based aggregation functions. Previous methods in literature operate on a predefined geometric grid such as local volume partitions or irregular kernel points. A more…

    Submitted 5 April, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

  37. arXiv:2011.13367  [pdf, other]

    cs.CV

    SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos

    Authors: Adrien Deliège, Anthony Cioppa, Silvio Giancola, Meisam J. Seikavandi, Jacob V. Dueholm, Kamal Nasrollahi, Bernard Ghanem, Thomas B. Moeslund, Marc Van Droogenbroeck

    Abstract: Understanding broadcast videos is a challenging task in computer vision, as it requires generic reasoning capabilities to appreciate the content offered by the video editing. In this work, we propose SoccerNet-v2, a novel large-scale corpus of manual annotations for the SoccerNet video dataset, along with open challenges to encourage more research in soccer understanding and broadcast production.…

    Submitted 19 April, 2021; v1 submitted 26 November, 2020; originally announced November 2020.

    Comments: Paper accepted for the CVsports workshop at CVPR2021. This document contains 8 pages + references + supplementary material

  38. arXiv:2011.13244  [pdf, other]

    cs.CV cs.LG

    MVTN: Multi-View Transformation Network for 3D Shape Recognition

    Authors: Abdullah Hamdi, Silvio Giancola, Bernard Ghanem

    Abstract: Multi-view projection methods have demonstrated their ability to reach state-of-the-art performance on 3D shape recognition. Those methods learn different ways to aggregate information from multiple views. However, the camera view-points for those views tend to be heuristically set and fixed for all shapes. To circumvent the lack of dynamism of current multi-view methods, we propose to learn those…

    Submitted 17 August, 2021; v1 submitted 26 November, 2020; originally announced November 2020.

    Comments: Published at ICCV 2021

    MSC Class: 68T45

  39. arXiv:2011.11479  [pdf, other]

    cs.CV

    TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks

    Authors: Humam Alwassel, Silvio Giancola, Bernard Ghanem

    Abstract: Due to the large memory footprint of untrimmed videos, current state-of-the-art video localization methods operate atop precomputed video clip features. These features are extracted from video encoders typically trained for trimmed action classification tasks, making such features not necessarily suitable for temporal localization. In this work, we propose a novel supervised pretraining paradigm f…

    Submitted 17 August, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

    Comments: Accepted to ICCV 2021 workshops proceedings

  40. arXiv:2008.10309  [pdf, other]

    cs.CV cs.AI

    LC-NAS: Latency Constrained Neural Architecture Search for Point Cloud Networks

    Authors: Guohao Li, Mengmeng Xu, Silvio Giancola, Ali Thabet, Bernard Ghanem

    Abstract: Point cloud architecture design has become a crucial problem for 3D deep learning. Several efforts exist to manually design architectures with high accuracy in point cloud tasks such as classification, segmentation, and detection. Recent progress in automatic Neural Architecture Search (NAS) minimizes the human effort in network design and optimizes high performing architectures. However, these ef…

    Submitted 24 August, 2020; originally announced August 2020.

    Comments: Originally submitted to ECCV'2020 but rejected. This work was filed with the United States Patent and Trademark Office (USPTO) on May 19, 2020 and assigned Serial No. 63/027,241

  41. arXiv:1912.01326  [pdf, other]

    cs.CV cs.LG eess.IV

    A Context-Aware Loss Function for Action Spotting in Soccer Videos

    Authors: Anthony Cioppa, Adrien Deliège, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck, Rikke Gade, Thomas B. Moeslund

    Abstract: In video understanding, action spotting consists in temporally localizing human-induced events annotated with single timestamps. In this paper, we propose a novel loss function that specifically considers the temporal context naturally present around each action, rather than focusing on the single annotated frame to spot. We benchmark our loss on a large dataset of soccer videos, SoccerNet, and ac…

    Submitted 30 March, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: Accepted for CVPR2020 main conference. This document contains 8 pages + references + supplementary material

  42. arXiv:1911.12236  [pdf, other]

    cs.CV

    PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement

    Authors: Jesus Zarzar, Silvio Giancola, Bernard Ghanem

    Abstract: In autonomous driving pipelines, perception modules provide a visual understanding of the surrounding road scene. Among the perception tasks, vehicle detection is of paramount importance for a safe driving as it identifies the position of other agents sharing the road. In our work, we propose PointRGCN: a graph-based 3D object detection pipeline based on graph convolutional networks (GCNs) which o…

    Submitted 27 November, 2019; originally announced November 2019.

  43. arXiv:1903.10168  [pdf, other]

    cs.CV

    Efficient Bird Eye View Proposals for 3D Siamese Tracking

    Authors: Jesus Zarzar, Silvio Giancola, Bernard Ghanem

    Abstract: Tracking vehicles in LIDAR point clouds is a challenging task due to the sparsity of the data and the dense search space. The lack of structure in point clouds impedes the use of convolution filters usually employed in 2D object tracking. In addition, structuring point clouds is cumbersome and implies losing fine-grained information. As a result, generating proposals in 3D space is expensive and i…

    Submitted 6 May, 2020; v1 submitted 25 March, 2019; originally announced March 2019.

  44. arXiv:1903.01784  [pdf, other]

    cs.CV

    Leveraging Shape Completion for 3D Siamese Tracking

    Authors: Silvio Giancola, Jesus Zarzar, Bernard Ghanem

    Abstract: Point clouds are challenging to process due to their sparsity, therefore autonomous vehicles rely more on appearance attributes than pure geometric features. However, 3D LIDAR perception can provide crucial information for urban navigation in challenging light or weather conditions. In this paper, we investigate the versatility of Shape Completion for 3D Object Tracking in LIDAR point clouds. We d…

    Submitted 28 March, 2019; v1 submitted 5 March, 2019; originally announced March 2019.

    Comments: Accepted in CVPR19

  45. SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos

    Authors: Silvio Giancola, Mohieddine Amine, Tarek Dghaily, Bernard Ghanem

    Abstract: In this paper, we introduce SoccerNet, a benchmark for action spotting in soccer videos. The dataset is composed of 500 complete soccer games from six main European leagues, covering three seasons from 2014 to 2017 and a total duration of 764 hours. A total of 6,637 temporal annotations are automatically parsed from online match reports at a one minute resolution for three main classes of events (…

    Submitted 22 April, 2018; v1 submitted 12 April, 2018; originally announced April 2018.

    Comments: CVPR Workshop on Computer Vision in Sports 2018

  46. arXiv:1803.10794  [pdf, other]

    cs.CV cs.RO

    TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild

    Authors: Matthias Müller, Adel Bibi, Silvio Giancola, Salman Al-Subaihi, Bernard Ghanem

    Abstract: Despite the numerous developments in object tracking, further development of current tracking algorithms is limited by small and mostly saturated datasets. As a matter of fact, data-hungry trackers based on deep-learning currently rely on object detection datasets due to the scarcity of dedicated large-scale tracking datasets. In this work, we present TrackingNet, the first large-scale dataset and…

    Submitted 28 March, 2018; originally announced March 2018.

    Comments: preprint

  47. Integration of Absolute Orientation Measurements in the KinectFusion Reconstruction pipeline

    Authors: Silvio Giancola, Jens Schneider, Peter Wonka, Bernard S. Ghanem

    Abstract: In this paper, we show how absolute orientation measurements provided by low-cost but high-fidelity IMU sensors can be integrated into the KinectFusion pipeline. We show that integration improves both runtime, robustness and quality of the 3D reconstruction. In particular, we use this orientation data to seed and regularize the ICP registration technique. We also present a technique to filter the…

    Submitted 22 April, 2018; v1 submitted 12 February, 2018; originally announced February 2018.

    Comments: CVPR Workshop on Visual Odometry and Computer Vision Applications Based on Location Clues 2018

  48. arXiv:1708.02033  [pdf, other]

    cs.CV

    A Solution for Crime Scene Reconstruction using Time-of-Flight Cameras

    Authors: Silvio Giancola, Daniele Piron, Pasquale Poppa, Remo Sala

    Abstract: In this work, we propose a method for three-dimensional (3D) reconstruction of wide crime scene, based on a Simultaneous Localization and Mapping (SLAM) approach. We used a Kinect V2 Time-of-Flight (TOF) RGB-D camera to provide colored dense point clouds at a 30 Hz frequency. This device is moved freely (6 degrees of freedom) during the scene exploration. The implemented SLAM solution aligns succe…

    Submitted 7 August, 2017; originally announced August 2017.