Showing 1–9 of 9 results for author: Nejadasl, F K

  1. arXiv:2505.17201

    cs.CV

    A Framework for Multi-View Multiple Object Tracking using Single-View Multi-Object Trackers on Fish Data

    Authors: Chaim Chai Elchik, Fatemeh Karimi Nejadasl, Seyed Sahand Mohammadi Ziabari, Ali Mohammed Mansoor Alsahag

    Abstract: Multi-object tracking (MOT) in computer vision has made significant advancements, yet tracking small fish in underwater environments presents unique challenges due to complex 3D motions and data noise. Traditional single-view MOT models often fall short in these settings. This thesis addresses these challenges by adapting state-of-the-art single-view MOT models, FairMOT and YOLOv8, for underwater…

    Submitted 22 May, 2025; originally announced May 2025.

  2. arXiv:2407.18289

    cs.CV

    MARINE: A Computer Vision Model for Detecting Rare Predator-Prey Interactions in Animal Videos

    Authors: Zsófia Katona, Seyed Sahand Mohammadi Ziabari, Fatemeh Karimi Nejadasl

    Abstract: Encounters between predator and prey play an essential role in ecosystems, but their rarity makes them difficult to detect in video recordings. Although advances in action recognition (AR) and temporal action detection (AD), especially transformer-based models and vision foundation models, have achieved high performance on human action datasets, animal videos remain relatively under-researched. Th…

    Submitted 5 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

    Comments: This is an MSc thesis by Zsofia Katona, supervised by the two other authors

  3. arXiv:2407.18288

    cs.CV

    Leveraging Foundation Models via Knowledge Distillation in Multi-Object Tracking: Distilling DINOv2 Features to FairMOT

    Authors: Niels G. Faber, Seyed Sahand Mohammadi Ziabari, Fatemeh Karimi Nejadasl

    Abstract: Multiple Object Tracking (MOT) is a computer vision task that has been employed in a variety of sectors. Some common limitations in MOT are varying object appearances, occlusions, or crowded scenes. To address these challenges, machine learning methods have been extensively deployed, leveraging large datasets, sophisticated models, and substantial computational resources. Due to practical limitati…

    Submitted 5 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

    Comments: This is an MSc thesis by Niels Faber, supervised by the two other authors

  4. arXiv:2406.09126

    cs.CV

    3D-AVS: LiDAR-based 3D Auto-Vocabulary Segmentation

    Authors: Weijie Wei, Osman Ülger, Fatemeh Karimi Nejadasl, Theo Gevers, Martin R. Oswald

    Abstract: Open-Vocabulary Segmentation (OVS) methods offer promising capabilities in detecting unseen object categories, but the category must be known and needs to be provided by a human, either via a text prompt or pre-labeled datasets, thus limiting their scalability. We propose 3D-AVS, a method for Auto-Vocabulary Segmentation of 3D point clouds for which the vocabulary is unknown and auto-generated for…

    Submitted 30 March, 2025; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: v3 is the camera-ready version for CVPR 2025, while v2 serves as both a preview and the camera-ready version for the CVPR 2024 OpenSun3D Workshop

  5. arXiv:2312.10217

    cs.CV

    T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning

    Authors: Weijie Wei, Fatemeh Karimi Nejadasl, Theo Gevers, Martin R. Oswald

    Abstract: The scarcity of annotated data in LiDAR point cloud understanding hinders effective representation learning. Consequently, scholars have been actively investigating efficacious self-supervised pre-training paradigms. Nevertheless, temporal information, which is inherent in the LiDAR point cloud sequence, is consistently disregarded. To better utilize this property, we propose an effective pre-trai…

    Submitted 22 July, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted to ECCV 2024

  6. arXiv:2309.17162

    cs.CV

    APNet: Urban-level Scene Segmentation of Aerial Images and Point Clouds

    Authors: Weijie Wei, Martin R. Oswald, Fatemeh Karimi Nejadasl, Theo Gevers

    Abstract: In this paper, we focus on a semantic segmentation method for point clouds of urban scenes. Our fundamental concept revolves around the collaborative utilization of diverse scene representations to benefit from different context information and network architectures. To this end, the proposed network architecture, called APNet, is split into two branches: a point cloud branch and an aerial image bra…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: Accepted by ICCV Workshop 2023 and selected as an oral

  7. arXiv:2308.04770

    cs.CV

    Objects do not disappear: Video object detection by single-frame object location anticipation

    Authors: Xin Liu, Fatemeh Karimi Nejadasl, Jan C. van Gemert, Olaf Booij, Silvia L. Pintea

    Abstract: Objects in videos are typically characterized by continuous smooth motion. We exploit continuous smooth motion in three ways. 1) Improved accuracy by using object motion as an additional source of supervision, which we obtain by anticipating object locations from a static keyframe. 2) Improved efficiency by only doing the expensive feature computations on a small subset of all frames. Because neig…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  8. arXiv:2103.15395

    cs.CV

    No frame left behind: Full Video Action Recognition

    Authors: Xin Liu, Silvia L. Pintea, Fatemeh Karimi Nejadasl, Olaf Booij, Jan C. van Gemert

    Abstract: Not all video frames are equally informative for recognizing an action. It is computationally infeasible to train deep networks on all video frames when actions develop over hundreds of frames. A common heuristic is uniformly sampling a small number of video frames and using these to recognize the action. Instead, here we propose full video action recognition and consider all video frames. To make…

    Submitted 29 March, 2021; originally announced March 2021.

    Comments: Accepted to CVPR 2021

  9. arXiv:2007.12668

    cs.CV cs.LG

    KPRNet: Improving projection-based LiDAR semantic segmentation

    Authors: Deyvid Kochanov, Fatemeh Karimi Nejadasl, Olaf Booij

    Abstract: Semantic segmentation is an important component in the perception systems of autonomous vehicles. In this work, we adopt recent advances in both image and point cloud segmentation to achieve better accuracy in the task of segmenting LiDAR scans. KPRNet improves the convolutional neural network architecture of 2D projection methods and utilizes KPConv to replace the commonly used post-processing…

    Submitted 21 August, 2020; v1 submitted 24 July, 2020; originally announced July 2020.

    Comments: ECCV 2020. Code and pre-trained models at https://github.com/DeyvidKochanov-TomTom/kprnet