Showing 1–50 of 104 results for author: van Gemert, J

  1. arXiv:2506.08612  [pdf, ps, other]

    cs.CV

    Data-Efficient Challenges in Visual Inductive Priors: A Retrospective

    Authors: Robert-Jan Bruintjes, Attila Lengyel, Osman Semih Kayhan, Davide Zambrano, Nergis Tömen, Hadi Jamali-Rad, Jan van Gemert

    Abstract: Deep Learning requires large amounts of data to train models that work well. In data-deficient settings, performance can be degraded. We investigate which Deep Learning methods benefit training models in a data-deficient setting, by organizing the "VIPriors: Visual Inductive Priors for Data-Efficient Deep Learning" workshop series, featuring four editions of data-impaired challenges. These challen…

    Submitted 10 June, 2025; originally announced June 2025.

  2. arXiv:2505.21187  [pdf, ps, other]

    cs.CV

    Making Every Event Count: Balancing Data Efficiency and Accuracy in Event Camera Subsampling

    Authors: Hesam Araghi, Jan van Gemert, Nergis Tomen

    Abstract: Event cameras offer high temporal resolution and power efficiency, making them well-suited for edge AI applications. However, their high event rates present challenges for data transmission and processing. Subsampling methods provide a practical solution, but their effect on downstream visual tasks remains underexplored. In this work, we systematically evaluate six hardware-friendly subsampling me…

    Submitted 27 May, 2025; originally announced May 2025.

  3. arXiv:2505.13137  [pdf, ps, other]

    cs.CV

    Learning to Adapt to Position Bias in Vision Transformer Classifiers

    Authors: Robert-Jan Bruintjes, Jan van Gemert

    Abstract: How discriminative position information is for image classification depends on the data. On the one hand, the camera position is arbitrary and objects can appear anywhere in the image, arguing for translation invariance. At the same time, position information is key for exploiting capture/center bias, and scene layout, e.g.: the sky is up. We show that position bias, the level to which a dataset i…

    Submitted 19 May, 2025; originally announced May 2025.

  4. arXiv:2503.18123  [pdf, other]

    cs.CV

    End-to-End Implicit Neural Representations for Classification

    Authors: Alexander Gielisse, Jan van Gemert

    Abstract: Implicit neural representations (INRs) such as NeRF and SIREN encode a signal in neural network parameters and show excellent results for signal reconstruction. Using INRs for downstream tasks, such as classification, is however not straightforward. Inherent symmetries in the parameters pose challenges and current works primarily focus on designing architectures that are equivariant to these symme…

    Submitted 23 March, 2025; originally announced March 2025.

    Comments: Accepted to CVPR 2025. 8 pages, supplementary material included

  5. arXiv:2503.15156  [pdf, other]

    cs.CV

    ARC: Anchored Representation Clouds for High-Resolution INR Classification

    Authors: Joost Luijmes, Alexander Gielisse, Roman Knyazhitskiy, Jan van Gemert

    Abstract: Implicit neural representations (INRs) encode signals in neural network weights as a memory-efficient representation, decoupling sampling resolution from the associated resource costs. Current INR image classification methods are demonstrated on low-resolution data and are sensitive to image-space transformations. We attribute these issues to the global, fully-connected MLP neural network architec…

    Submitted 19 March, 2025; originally announced March 2025.

    Comments: Accepted at the ICLR 2025 Workshop on Neural Network Weights as a New Data Modality

  6. arXiv:2412.06439  [pdf, other]

    cs.CV

    Local Attention Transformers for High-Detail Optical Flow Upsampling

    Authors: Alexander Gielisse, Nergis Tömen, Jan van Gemert

    Abstract: Most recent works on optical flow use convex upsampling as the last step to obtain high-resolution flow. In this work, we show and discuss several issues and limitations of this currently widely adopted convex upsampling approach. We propose a series of changes, in an attempt to resolve current issues. First, we propose to decouple the weights for the final convex upsampler, making it easier to fi…

    Submitted 9 December, 2024; originally announced December 2024.

    Comments: Note: this work is an extension of my Master's thesis, available as "Optical Flow Upsamplers Ignore Details: Neighborhood Attention Transformers for Convex Upsampling"

  7. arXiv:2410.11062  [pdf, other]

    cs.SD cs.AI cs.CV eess.AS

    CleanUMamba: A Compact Mamba Network for Speech Denoising using Channel Pruning

    Authors: Sjoerd Groot, Qinyu Chen, Jan C. van Gemert, Chang Gao

    Abstract: This paper presents CleanUMamba, a time-domain neural network architecture designed for real-time causal audio denoising directly applied to raw waveforms. CleanUMamba leverages a U-Net encoder-decoder structure, incorporating the Mamba state-space model in the bottleneck layer. By replacing conventional self-attention and LSTM mechanisms with Mamba, our architecture offers superior denoising perf…

    Submitted 10 February, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: This paper has been accepted to be presented at the 2025 International Symposium on Circuits and Systems (ISCAS)

    Journal ref: 2025 IEEE International Symposium on Circuits and Systems (ISCAS)

  8. arXiv:2410.01376  [pdf, other]

    cs.CV physics.comp-ph

    Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems

    Authors: Alejandro Castañeda Garcia, Jan van Gemert, Daan Brinks, Nergis Tömen

    Abstract: Extracting physical dynamical system parameters from recorded observations is key in natural science. Current methods for automatic parameter estimation from video train supervised deep networks on large datasets. Such datasets require labels, which are difficult to acquire. While some unsupervised techniques--which depend on frame prediction--exist, they suffer from long training times, initializ…

    Submitted 24 March, 2025; v1 submitted 2 October, 2024; originally announced October 2024.

  9. arXiv:2410.00580  [pdf, other]

    cs.CV

    Deep activity propagation via weight initialization in spiking neural networks

    Authors: Aurora Micheli, Olaf Booij, Jan van Gemert, Nergis Tömen

    Abstract: Spiking Neural Networks (SNNs) and neuromorphic computing offer bio-inspired advantages such as sparsity and ultra-low power consumption, providing a promising alternative to conventional networks. However, training deep SNNs from scratch remains a challenge, as SNNs process and transmit information by quantizing the real-valued membrane potentials into binary spikes. This can lead to information…

    Submitted 20 May, 2025; v1 submitted 1 October, 2024; originally announced October 2024.

  10. arXiv:2409.10641  [pdf, other]

    cs.CV

    HAVANA: Hierarchical stochastic neighbor embedding for Accelerated Video ANnotAtions

    Authors: Alexandru Bobe, Jan C. van Gemert

    Abstract: Video annotation is a critical and time-consuming task in computer vision research and applications. This paper presents a novel annotation pipeline that uses pre-extracted features and dimensionality reduction to accelerate the temporal video annotation process. Our approach uses Hierarchical Stochastic Neighbor Embedding (HSNE) to create a multi-scale representation of video features, allowing a…

    Submitted 16 September, 2024; originally announced September 2024.

  11. arXiv:2409.08953  [pdf, other]

    cs.CV

    Pushing the boundaries of event subsampling in event-based video classification using CNNs

    Authors: Hesam Araghi, Jan van Gemert, Nergis Tomen

    Abstract: Event cameras offer low-power visual sensing capabilities ideal for edge-device applications. However, their high event rate, driven by high temporal details, can be restrictive in terms of bandwidth and computational resources. In edge AI applications, determining the minimum amount of events for specific tasks can allow reducing the event rate to improve bandwidth, memory, and processing efficie…

    Submitted 13 September, 2024; originally announced September 2024.

  12. arXiv:2409.08943  [pdf, other]

    cs.CV eess.IV

    Pushing Joint Image Denoising and Classification to the Edge

    Authors: Thomas C Markhorst, Jan C van Gemert, Osman S Kayhan

    Abstract: In this paper, we jointly combine image classification and image denoising, aiming to enhance human perception of noisy images captured by edge devices, like low-light security cameras. In such settings, it is important to retain the ability of humans to verify the automatic classification decision and thus jointly denoise the image to enhance human perception. Since edge devices have little compu…

    Submitted 13 September, 2024; originally announced September 2024.

    Comments: Accepted paper at the ECCV 2024 workshop on Advances in Image Manipulation (AIM)

  13. arXiv:2408.10844  [pdf, other]

    cs.CV

    Aligning Object Detector Bounding Boxes with Human Preference

    Authors: Ombretta Strafforello, Osman S. Kayhan, Oana Inel, Klamer Schutte, Jan van Gemert

    Abstract: Previous work shows that humans tend to prefer large bounding boxes over small bounding boxes with the same IoU. However, we show here that commonly used object detectors predict large and small boxes equally often. In this work, we investigate how to align automatically detected object boxes with human preference and study whether this improves human quality perception. We evaluate the performanc…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: Accepted paper at the ECCV 2024 workshop on Assistive Computer Vision and Robotics (ACVR)

  14. arXiv:2407.10121  [pdf, other]

    cs.CV

    MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes

    Authors: Casper van Engelenburg, Fatemeh Mostafavi, Emanuel Kuhn, Yuntae Jeon, Michael Franzen, Matthias Standfest, Jan van Gemert, Seyran Khademi

    Abstract: Diverse and realistic floor plan data are essential for the development of useful computer-aided methods in architectural design. Today's large-scale floor plan datasets predominantly feature simple floor plan layouts, typically representing single-apartment dwellings only. To compensate for the mismatch between current datasets and the real world, we develop Modified Swiss Dwellings (MSD…

    Submitted 24 July, 2024; v1 submitted 14 July, 2024; originally announced July 2024.

    Comments: ECCV 2024 (incl. Suppl. Mat.)

  15. arXiv:2406.18176  [pdf, other]

    cs.CV

    VIPriors 4: Visual Inductive Priors for Data-Efficient Deep Learning Challenges

    Authors: Robert-Jan Bruintjes, Attila Lengyel, Marcos Baptista Rios, Osman Semih Kayhan, Davide Zambrano, Nergis Tomen, Jan van Gemert

    Abstract: The fourth edition of the "VIPriors: Visual Inductive Priors for Data-Efficient Deep Learning" workshop features two data-impaired challenges. These challenges address the problem of training deep learning models for computer vision tasks with limited data. Participants are limited to training models from scratch using a low number of training samples and are not allowed to use any form of transfe…

    Submitted 1 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2305.19688

  16. arXiv:2404.10718  [pdf, other]

    cs.CV

    GazeHTA: End-to-end Gaze Target Detection with Head-Target Association

    Authors: Zhi-Yi Lin, Jouh Yeong Chew, Jan van Gemert, Xucong Zhang

    Abstract: Precisely detecting which object a person is paying attention to is critical for human-robot interaction since it provides important cues for the next action from the human user. We propose an end-to-end approach for gaze target detection: predicting a head-target connection between individuals and the target image regions they are looking at. Most of the existing methods use independent component…

    Submitted 5 February, 2025; v1 submitted 16 April, 2024; originally announced April 2024.

  17. arXiv:2402.01557  [pdf, other]

    cs.CV

    Deep Continuous Networks

    Authors: Nergis Tomen, Silvia L. Pintea, Jan C. van Gemert

    Abstract: CNNs and computational models of biological vision share some fundamental principles, which opened new avenues of research. However, fruitful cross-field research is hampered by conventional CNN architectures being based on spatially and depthwise discrete representations, which cannot accommodate certain aspects of biological complexity such as continuously varying receptive field sizes and dynam…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: Presented at ICML 2021

    Journal ref: In International Conference on Machine Learning 2021 Jul 1 (pp. 10324-10335). PMLR

  18. arXiv:2401.17821  [pdf, other]

    cs.CV cs.HC

    Do Object Detection Localization Errors Affect Human Performance and Trust?

    Authors: Sven de Witte, Ombretta Strafforello, Jan van Gemert

    Abstract: Bounding boxes are often used to communicate automatic object detection results to humans, aiding humans in a multitude of tasks. We investigate the relationship between bounding box localization errors and human task performance. We use observer performance studies on a visual multi-object counting task to measure both human trust and performance with different levels of bounding box accuracy. Th…

    Submitted 31 January, 2024; originally announced January 2024.

  19. arXiv:2311.01916  [pdf, other]

    eess.IV cs.CV

    Contrast-Agnostic Groupwise Registration by Robust PCA for Quantitative Cardiac MRI

    Authors: Xinqi Li, Yi Zhang, Yidong Zhao, Jan van Gemert, Qian Tao

    Abstract: Quantitative cardiac magnetic resonance imaging (MRI) is an increasingly important diagnostic tool for cardiovascular diseases. Yet, co-registration of all baseline images within the quantitative MRI sequence is essential for the accuracy and precision of quantitative maps. However, co-registering all baseline images from a quantitative cardiac MRI sequence remains a nontrivial task because of the…

    Submitted 3 November, 2023; originally announced November 2023.

  20. arXiv:2310.19368  [pdf, other]

    cs.CV

    Color Equivariant Convolutional Networks

    Authors: Attila Lengyel, Ombretta Strafforello, Robert-Jan Bruintjes, Alexander Gielisse, Jan van Gemert

    Abstract: Color is a crucial visual cue readily exploited by Convolutional Neural Networks (CNNs) for object recognition. However, CNNs struggle if there is data imbalance between color variations introduced by accidental recording conditions. Color invariance addresses this issue but does so at the cost of removing all color information, which sacrifices discriminative power. In this paper, we propose Colo…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023. Code available on https://github.com/Attila94/ceconv

  21. End-to-End Chess Recognition

    Authors: Athanasios Masouris, Jan van Gemert

    Abstract: Chess recognition is the task of extracting the chess piece configuration from a chessboard image. Current approaches use a pipeline of separate, independent, modules such as chessboard detection, square localization, and piece classification. Instead, we follow the deep learning philosophy and explore an end-to-end approach to directly predict the configuration from the image, thus avoiding the e…

    Submitted 14 December, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: 12 pages, 7 figures

  22. arXiv:2309.06102  [pdf, other]

    cs.CV

    Can we predict the Most Replayed data of video streaming platforms?

    Authors: Alessandro Duico, Ombretta Strafforello, Jan van Gemert

    Abstract: Predicting which specific parts of a video users will replay is important for several applications, including targeted advertisement placement on video platforms and assisting video creators. In this work, we explore whether it is possible to predict the Most Replayed (MR) data from YouTube videos. To this end, we curate a large video benchmark, the YTMR500 dataset, which comprises 500 YouTube vid…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: Accepted Extended Abstract at ICCV 2023 Workshop on AI for Creative Video Editing and Understanding

  23. arXiv:2309.04357  [pdf, other]

    cs.CV cs.GR

    SSIG: A Visually-Guided Graph Edit Distance for Floor Plan Similarity

    Authors: Casper van Engelenburg, Seyran Khademi, Jan van Gemert

    Abstract: We propose a simple yet effective metric that measures structural similarity between visual instances of architectural floor plans, without the need for learning. Qualitatively, our experiments show that the retrieval results are similar to deeply learned methods. Effectively comparing instances of floor plan data is paramount to the success of machine understanding of floor plan data, including t…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: To be published in ICCVW 2023, 10 pages

  24. arXiv:2308.13082  [pdf, other]

    cs.CV

    Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models

    Authors: Jan Warchocki, Teodor Oprescu, Yunhan Wang, Alexandru Damacus, Paul Misterka, Robert-Jan Bruintjes, Attila Lengyel, Ombretta Strafforello, Jan van Gemert

    Abstract: In temporal action localization, given an input video, the goal is to predict which actions it contains, where they begin, and where they end. Training and testing current state-of-the-art deep learning models requires access to large amounts of data and computational power. However, gathering such data is challenging and computational resources might be limited. This work explores and measures ho…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: Accepted to the CVEU workshop at ICCV 2023

  25. arXiv:2308.11316  [pdf, other]

    cs.CV

    Using and Abusing Equivariance

    Authors: Tom Edixhoven, Attila Lengyel, Jan van Gemert

    Abstract: In this paper we show how Group Equivariant Convolutional Neural Networks use subsampling to learn to break equivariance to their symmetries. We focus on 2D rotations and reflections and investigate the impact of broken equivariance on network performance. We show that a change in the input dimension of a network as small as a single pixel can be enough for commonly used architectures to become ap…

    Submitted 22 August, 2023; originally announced August 2023.

  26. arXiv:2308.11249  [pdf, other]

    cs.CV

    Video BagNet: short temporal receptive fields increase robustness in long-term action recognition

    Authors: Ombretta Strafforello, Xin Liu, Klamer Schutte, Jan van Gemert

    Abstract: Previous work on long-term video action recognition relies on deep 3D-convolutional models that have a large temporal receptive field (RF). We argue that these models are not always the best choice for temporal modeling in videos. A large temporal receptive field allows the model to encode the exact sub-action order of a video, which causes a performance decrease when testing videos have a differe…

    Submitted 22 August, 2023; originally announced August 2023.

  27. arXiv:2308.11244  [pdf, other]

    cs.CV

    Are current long-term video understanding datasets long-term?

    Authors: Ombretta Strafforello, Klamer Schutte, Jan van Gemert

    Abstract: Many real-world applications, from sport analysis to surveillance, benefit from automatic long-term action recognition. In the current deep learning paradigm for automatic action recognition, it is imperative that models are trained and tested on datasets and tasks that evaluate if such models actually learn and reason over long-term information. In this work, we propose a method to evaluate how s…

    Submitted 22 August, 2023; originally announced August 2023.

  28. arXiv:2308.10603  [pdf, other]

    cs.CV

    A step towards understanding why classification helps regression

    Authors: Silvia L. Pintea, Yancong Lin, Jouke Dijkstra, Jan C. van Gemert

    Abstract: A number of computer vision deep regression approaches report improved results when adding a classification loss to the regression loss. Here, we explore why this is useful in practice and when it is beneficial. To do so, we start from precisely controlled dataset variations and data samplings and find that the effect of adding a classification loss is the most pronounced for regression with imbal…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted at ICCV-2023

  29. arXiv:2308.05533  [pdf, other]

    cs.CV

    Is there progress in activity progress prediction?

    Authors: Frans de Boer, Jan C. van Gemert, Jouke Dijkstra, Silvia L. Pintea

    Abstract: Activity progress prediction aims to estimate what percentage of an activity has been completed. Currently this is done with machine learning approaches, trained and evaluated on complicated and realistic video datasets. The videos in these datasets vary drastically in length and appearance. And some of the activities have unanticipated developments, making activity progression difficult to estima…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: Accepted at ICCVw-2023 (AI for Creative Video Editing and Understanding, ICCV workshop 2023)

  30. arXiv:2308.04770  [pdf, other]

    cs.CV

    Objects do not disappear: Video object detection by single-frame object location anticipation

    Authors: Xin Liu, Fatemeh Karimi Nejadasl, Jan C. van Gemert, Olaf Booij, Silvia L. Pintea

    Abstract: Objects in videos are typically characterized by continuous smooth motion. We exploit continuous smooth motion in three ways. 1) Improved accuracy by using object motion as an additional source of supervision, which we obtain by anticipating object locations from a static keyframe. 2) Improved efficiency by only doing the expensive feature computations on a small subset of all frames. Because neig…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  31. arXiv:2307.08483  [pdf, other]

    cs.CV

    Differentiable Transportation Pruning

    Authors: Yunqiang Li, Jan C. van Gemert, Torsten Hoefler, Bert Moons, Evangelos Eleftheriou, Bram-Ernst Verhoef

    Abstract: Deep learning algorithms are increasingly employed at the edge. However, edge devices are resource constrained and thus require efficient deployment of deep neural networks. Pruning methods are a key tool for edge deployment as they can improve storage, compute, memory bandwidth, and energy usage. In this paper we propose a novel accurate pruning technique that allows precise control over the outp…

    Submitted 31 July, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: ICCV 2023

  32. arXiv:2305.19688  [pdf, other]

    cs.CV

    VIPriors 3: Visual Inductive Priors for Data-Efficient Deep Learning Challenges

    Authors: Robert-Jan Bruintjes, Attila Lengyel, Marcos Baptista Rios, Osman Semih Kayhan, Davide Zambrano, Nergis Tomen, Jan van Gemert

    Abstract: The third edition of the "VIPriors: Visual Inductive Priors for Data-Efficient Deep Learning" workshop featured four data-impaired challenges, focusing on addressing the limitations of data availability in training deep learning models for computer vision tasks. The challenges comprised four distinct data-impaired tasks, where participants were required to train models from scratch using a redu…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: arXiv admin note: text overlap with arXiv:2201.08625

  33. arXiv:2304.02628  [pdf, other]

    cs.CV cs.AI

    What Affects Learned Equivariance in Deep Image Recognition Models?

    Authors: Robert-Jan Bruintjes, Tomasz Motyka, Jan van Gemert

    Abstract: Equivariance w.r.t. geometric transformations in neural networks improves data efficiency, parameter efficiency and robustness to out-of-domain perspective shifts. When equivariance is not designed into a neural network, the network can still learn equivariant functions from the data. We quantify this learned equivariance, by proposing an improved measure for equivariance. We find evidence for a c…

    Submitted 7 April, 2023; v1 submitted 5 April, 2023; originally announced April 2023.

    Comments: Accepted at CVPR workshop L3D-IVU 2023

  34. arXiv:2303.02452  [pdf, other]

    cs.CV cs.LG

    Understanding weight-magnitude hyperparameters in training binary networks

    Authors: Joris Quist, Yunqiang Li, Jan van Gemert

    Abstract: Binary Neural Networks (BNNs) are compact and efficient by using binary weights instead of real-valued weights. Current BNNs use latent real-valued weights during training, where several training hyper-parameters are inherited from real-valued networks. The interpretation of several of these hyperparameters is based on the magnitude of the real-valued weights. For BNNs, however, the magnitude of b…

    Submitted 4 March, 2023; originally announced March 2023.

    Comments: Conference: ICLR 2023

  35. Towards Single Camera Human 3D-Kinematics

    Authors: Marian Bittner, Wei-Tse Yang, Xucong Zhang, Ajay Seth, Jan van Gemert, Frans C. T. van der Helm

    Abstract: Markerless estimation of 3D Kinematics has the great potential to clinically diagnose and monitor movement disorders without referrals to expensive motion capture labs; however, current approaches are limited by performing multiple de-coupled steps to estimate the kinematics of a person from videos. Most current techniques work in a multi-step approach by first detecting the pose of the body and t…

    Submitted 13 January, 2023; originally announced January 2023.

    Comments: Published in the MDPI Sensors special Issue "Sensors and Musculoskeletal Dynamics to Evaluate Human Movement" on December 28, 2022

    Journal ref: Sensors 2023, 23(1), 341

  36. arXiv:2211.14074  [pdf, other]

    cs.CV

    Copy-Pasting Coherent Depth Regions Improves Contrastive Learning for Urban-Scene Segmentation

    Authors: Liang Zeng, Attila Lengyel, Nergis Tömen, Jan van Gemert

    Abstract: In this work, we leverage estimated depth to boost self-supervised contrastive learning for segmentation of urban scenes, where unlabeled videos are readily available for training self-supervised depth estimation. We argue that the semantics of a coherent group of pixels in 3D space is self-contained and invariant to the contexts in which they appear. We group coherent, semantically related pixels…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: BMVC 2022 Best Student Paper Award (Honourable Mention)

  37. arXiv:2210.13858  [pdf, other]

    cs.LG cs.AI cs.CV

    LAB: Learnable Activation Binarizer for Binary Neural Networks

    Authors: Sieger Falkena, Hadi Jamali-Rad, Jan van Gemert

    Abstract: Binary Neural Networks (BNNs) are receiving an upsurge of attention for bringing power-hungry deep learning towards edge devices. The traditional wisdom in this space is to employ sign() for binarizing featuremaps. We argue and illustrate that sign() is a uniqueness bottleneck, limiting information propagation throughout the network. To alleviate this, we propose to dispense sign(), replacing it w…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: This paper is accepted to appear in the proceedings of WACV 2023

  38. arXiv:2208.13555  [pdf, other]

    cs.CV

    Explainability of Deep Learning models for Urban Space perception

    Authors: Ruben Sangers, Jan van Gemert, Sander van Cranenburgh

    Abstract: Deep learning based computer vision models are increasingly used by urban planners to support decision making for shaping urban environments. Such models predict how people perceive the urban environment quality in terms of e.g. its safety or beauty. However, the blackbox nature of deep learning models hampers urban planners to understand what landscape objects contribute to a particularly high qu…

    Submitted 29 August, 2022; originally announced August 2022.

    Comments: 5 pages, 3 figures

  39. arXiv:2208.02509  [pdf, other]

    cs.CV

    Heart rate estimation in intense exercise videos

    Authors: Yeshwanth Napolean, Anwesh Marwade, Nergis Tomen, Puck Alkemade, Thijs Eijsvogels, Jan van Gemert

    Abstract: Estimating heart rate from video allows non-contact health monitoring with applications in patient care, human interaction, and sports. Existing work can robustly measure heart rate under some degree of motion by face tracking. However, this is not always possible in unconstrained settings, as the face might be occluded or even outside the camera. Here, we present IntensePhysio: a challenging vide…

    Submitted 4 August, 2022; originally announced August 2022.

    Comments: 4 pages, 4 figures, accepted at ICIP 2022

  40. arXiv:2207.14221  [pdf, other]

    cs.CV cs.HC

    Humans disagree with the IoU for measuring object detector localization error

    Authors: Ombretta Strafforello, Vanathi Rajasekart, Osman S. Kayhan, Oana Inel, Jan van Gemert

    Abstract: The localization quality of automatic object detectors is typically evaluated by the Intersection over Union (IoU) score. In this work, we show that humans have a different view on localization quality. To evaluate this, we conduct a survey with more than 70 participants. Results show that for localization errors with the exact same IoU score, humans might not consider that these errors are equal,…

    Submitted 28 July, 2022; originally announced July 2022.

    Comments: Published at ICIP 2022. Ombretta Strafforello, Vanathi Rajasekart, Osman S. Kayhan and Oana Inel contributed equally to this work

  41. arXiv:2206.00506  [pdf, other]

    cs.CV cs.AI cs.LG

    Proximally Sensitive Error for Anomaly Detection and Feature Learning

    Authors: Amogh Gudi, Fritjof Büttner, Jan van Gemert

    Abstract: Mean squared error (MSE) is one of the most widely used metrics to express differences between multi-dimensional entities, including images. However, MSE is not locally sensitive as it does not take into account the spatial arrangement of the (pixel) differences, which matters for structured data types like images. Such spatial arrangements carry information about the source of the differences;…

    Submitted 1 June, 2022; originally announced June 2022.

  42. arXiv:2205.02887  [pdf, other]

    cs.CV cs.AI cs.LG

    Evaluating Context for Deep Object Detectors

    Authors: Osman Semih Kayhan, Jan C. van Gemert

    Abstract: Which object detector is suitable for your context sensitive task? Deep object detectors exploit scene context for recognition differently. In this paper, we group object detectors into 3 categories in terms of context use: no context by cropping the input (RCNN), partial context by cropping the featuremap (two-stage methods) and full context without any cropping (single-stage methods). We systema…

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: 4 pages, 5 figures

  43. arXiv:2203.16291  [pdf, other]

    cs.CV

    AmsterTime: A Visual Place Recognition Benchmark Dataset for Severe Domain Shift

    Authors: Burak Yildiz, Seyran Khademi, Ronald Maria Siebes, Jan van Gemert

    Abstract: We introduce AmsterTime: a challenging dataset to benchmark visual place recognition (VPR) in the presence of a severe domain shift. AmsterTime offers a collection of 2,500 well-curated image pairs, each matching a street-view image to historical archival image data of the same scene from the city of Amsterdam. The image pairs capture the same place with different cameras, viewpoints, and appearances. Unlike existi…

    Submitted 26 June, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: Accepted to ICPR 2022 (26th International Conference on Pattern Recognition), Dataset and evaluation code: https://github.com/seyrankhademi/AmsterTime

  44. arXiv:2203.08586  [pdf, other]

    cs.CV

    Deep vanishing point detection: Geometric priors make dataset variations vanish

    Authors: Yancong Lin, Ruben Wiersma, Silvia L. Pintea, Klaus Hildebrandt, Elmar Eisemann, Jan C. van Gemert

    Abstract: Deep learning has improved vanishing point detection in images. Yet, deep networks require expensive annotated datasets trained on costly hardware and do not generalize to even slightly different domains, and minor problem variants. Here, we address these issues by injecting deep vanishing point detection networks with prior knowledge. This prior knowledge no longer needs to be learned from data,…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: CVPR2022, code available at https://github.com/yanconglin/VanishingPoint_HoughTransform_GaussianSphere

  45. arXiv:2201.08625  [pdf, other]

    cs.CV cs.AI

    VIPriors 2: Visual Inductive Priors for Data-Efficient Deep Learning Challenges

    Authors: Attila Lengyel, Robert-Jan Bruintjes, Marcos Baptista Rios, Osman Semih Kayhan, Davide Zambrano, Nergis Tomen, Jan van Gemert

    Abstract: The second edition of the "VIPriors: Visual Inductive Priors for Data-Efficient Deep Learning" challenges featured five data-impaired challenges, where models are trained from scratch on a reduced number of training samples for various key computer vision tasks. To encourage new and creative ideas on incorporating relevant inductive biases to improve the data efficiency of deep learning models, we…

    Submitted 21 January, 2022; originally announced January 2022.

    Comments: 11 pages, 11 figures

  46. arXiv:2112.12579  [pdf, other]

    cs.CV

    NeRD++: Improved 3D-mirror symmetry learning from a single image

    Authors: Yancong Lin, Silvia-Laura Pintea, Jan van Gemert

    Abstract: Many objects are naturally symmetric, and this symmetry can be exploited to infer unseen 3D properties from a single 2D image. Recently, NeRD was proposed for accurate 3D mirror plane estimation from a single image. Despite its unprecedented accuracy, it relies on large annotated datasets for training and suffers from slow inference. Here we aim to improve its data and compute efficiency. We do awa…

    Submitted 7 October, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

    Comments: BMVC 2022

  47. arXiv:2112.03406  [pdf, other]

    cs.LG cs.CV

    Equal Bits: Enforcing Equally Distributed Binary Network Weights

    Authors: Yunqiang Li, Silvia L. Pintea, Jan C. van Gemert

    Abstract: Binary networks are extremely efficient as they use only two symbols to define the network: $\{+1,-1\}$. One can make the prior distribution of these symbols a design choice. The recent IR-Net of Qin et al. argues that imposing a Bernoulli distribution with equal priors (equal bit ratios) over the binary weights leads to maximum entropy and thus minimizes information loss. However, prior work cann…

    Submitted 6 March, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

  48. arXiv:2111.06660  [pdf, other]

    cs.CV

    Frequency learning for structured CNN filters with Gaussian fractional derivatives

    Authors: Nikhil Saldanha, Silvia L. Pintea, Jan C. van Gemert, Nergis Tomen

    Abstract: Frequency information lies at the base of discriminating between textures, and therefore between different objects. Classical CNN architectures limit the frequency learning through fixed filter sizes, and lack a way of explicitly controlling it. Here, we build on the structured receptive field filters with Gaussian derivative basis. Yet, rather than using predetermined derivative orders, which typ…

    Submitted 12 November, 2021; originally announced November 2021.

    Comments: Accepted at BMVC 2021

  49. arXiv:2110.12216  [pdf, other]

    cs.CV cs.LG

    Domain Adaptation for Rare Classes Augmented with Synthetic Samples

    Authors: Tuhin Das, Robert-Jan Bruintjes, Attila Lengyel, Jan van Gemert, Sara Beery

    Abstract: To alleviate lower classification performance on rare classes in imbalanced datasets, a possible solution is to augment the underrepresented classes with synthetic samples. Domain adaptation can be incorporated in a classifier to decrease the domain discrepancy between real and synthetic samples. While domain adaptation is generally applied on completely synthetic source domains and real target do…

    Submitted 23 October, 2021; originally announced October 2021.

    Comments: 14 pages, 6 figures, to be published

  50. arXiv:2110.08059  [pdf, other]

    cs.CV cs.LG

    FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes

    Authors: David W. Romero, Robert-Jan Bruintjes, Jakub M. Tomczak, Erik J. Bekkers, Mark Hoogendoorn, Jan C. van Gemert

    Abstract: When designing Convolutional Neural Networks (CNNs), one must select the size of the convolutional kernels before training. Recent works show CNNs benefit from different kernel sizes at different layers, but exploring all possible combinations is unfeasible in practice. A more efficient approach is to learn the kernel size during training. However, existing works that learn the kernel size h…

    Submitted 17 March, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: First two authors contributed equally to this work