-
Deep Fusion of Ultra-Low-Resolution Thermal Camera and Gyroscope Data for Lighting-Robust and Compute-Efficient Rotational Odometry
Authors:
Farida Mohsen,
Ali Safa
Abstract:
Accurate rotational odometry is crucial for autonomous robotic systems, particularly for small, power-constrained platforms such as drones and mobile robots. This study introduces thermal-gyro fusion, a novel sensor fusion approach that integrates ultra-low-resolution thermal imaging with gyroscope readings for rotational odometry. Unlike RGB cameras, thermal imaging is invariant to lighting condi…
▽ More
Accurate rotational odometry is crucial for autonomous robotic systems, particularly for small, power-constrained platforms such as drones and mobile robots. This study introduces thermal-gyro fusion, a novel sensor fusion approach that integrates ultra-low-resolution thermal imaging with gyroscope readings for rotational odometry. Unlike RGB cameras, thermal imaging is invariant to lighting conditions and, when fused with gyroscopic data, mitigates drift which is a common limitation of inertial sensors. We first develop a multimodal data acquisition system to collect synchronized thermal and gyroscope data, along with rotational speed labels, across diverse environments. Subsequently, we design and train a lightweight Convolutional Neural Network (CNN) that fuses both modalities for rotational speed estimation. Our analysis demonstrates that thermal-gyro fusion enables a significant reduction in thermal camera resolution without significantly compromising accuracy, thereby improving computational efficiency and memory utilization. These advantages make our approach well-suited for real-time deployment in resource-constrained robotic systems. Finally, to facilitate further research, we publicly release our dataset as supplementary material.
△ Less
Submitted 14 June, 2025;
originally announced June 2025.
-
Federated Learning of Low-Rank One-Shot Image Detection Models in Edge Devices with Scalable Accuracy and Compute Complexity
Authors:
Abdul Hannaan,
Zubair Shah,
Aiman Erbad,
Amr Mohamed,
Ali Safa
Abstract:
This paper introduces a novel federated learning framework termed LoRa-FL designed for training low-rank one-shot image detection models deployed on edge devices. By incorporating low-rank adaptation techniques into one-shot detection architectures, our method significantly reduces both computational and communication overhead while maintaining scalable accuracy. The proposed framework leverages f…
▽ More
This paper introduces a novel federated learning framework termed LoRa-FL designed for training low-rank one-shot image detection models deployed on edge devices. By incorporating low-rank adaptation techniques into one-shot detection architectures, our method significantly reduces both computational and communication overhead while maintaining scalable accuracy. The proposed framework leverages federated learning to collaboratively train lightweight image recognition models, enabling rapid adaptation and efficient deployment across heterogeneous, resource-constrained devices. Experimental evaluations on the MNIST and CIFAR10 benchmark datasets, both in an independent-and-identically-distributed (IID) and non-IID setting, demonstrate that our approach achieves competitive detection performance while significantly reducing communication bandwidth and compute complexity. This makes it a promising solution for adaptively reducing the communication and compute power overheads, while not sacrificing model accuracy.
△ Less
Submitted 23 April, 2025;
originally announced April 2025.
-
Benchmarking Online Object Trackers for Underwater Robot Position Locking Applications
Authors:
Ali Safa,
Waqas Aman,
Ali Al-Zawqari,
Saif Al-Kuwari
Abstract:
Autonomously controlling the position of Remotely Operated underwater Vehicles (ROVs) is of crucial importance for a wide range of underwater engineering applications, such as in the inspection and maintenance of underwater industrial structures. Consequently, studying vision-based underwater robot navigation and control has recently gained increasing attention to counter the numerous challenges f…
▽ More
Autonomously controlling the position of Remotely Operated underwater Vehicles (ROVs) is of crucial importance for a wide range of underwater engineering applications, such as in the inspection and maintenance of underwater industrial structures. Consequently, studying vision-based underwater robot navigation and control has recently gained increasing attention to counter the numerous challenges faced in underwater conditions, such as lighting variability, turbidity, camera image distortions (due to bubbles), and ROV positional disturbances (due to underwater currents). In this paper, we propose (to the best of our knowledge) a first rigorous unified benchmarking of more than seven Machine Learning (ML)-based one-shot object tracking algorithms for vision-based position locking of ROV platforms. We propose a position-locking system that processes images of an object of interest in front of which the ROV must be kept stable. Then, our proposed system uses the output result of different object tracking algorithms to automatically correct the position of the ROV against external disturbances. We conducted numerous real-world experiments using a BlueROV2 platform within an indoor pool and provided clear demonstrations of the strengths and weaknesses of each tracking approach. Finally, to help alleviate the scarcity of underwater ROV data, we release our acquired data base as open-source with the hope of benefiting future research.
△ Less
Submitted 23 February, 2025;
originally announced February 2025.
-
Co-Design of a Robot Controller Board and Indoor Positioning System for IoT-Enabled Applications
Authors:
Ali Safa,
Ali Al-Zawqari
Abstract:
This paper describes the development of a cost-effective yet precise indoor robot navigation system composed of a custom robot controller board and an indoor positioning system. First, the proposed robot controller board has been specially designed for emerging IoT-based robot applications and is capable of driving two 6-Amp motor channels. The controller board also embeds an on-board micro-contro…
▽ More
This paper describes the development of a cost-effective yet precise indoor robot navigation system composed of a custom robot controller board and an indoor positioning system. First, the proposed robot controller board has been specially designed for emerging IoT-based robot applications and is capable of driving two 6-Amp motor channels. The controller board also embeds an on-board micro-controller with WIFI connectivity, enabling robot-to-server communications for IoT applications. Then, working together with the robot controller board, the proposed positioning system detects the robot's location using a down-looking webcam and uses the robot's position on the webcam images to estimate the real-world position of the robot in the environment. The positioning system can then send commands via WIFI to the robot in order to steer it to any arbitrary location in the environment. Our experiments show that the proposed system reaches a navigation error smaller or equal to 0.125 meters while being more than two orders of magnitude more cost-effective compared to off-the-shelve motion capture (MOCAP) positioning systems.
△ Less
Submitted 2 January, 2025;
originally announced January 2025.
-
Rotational Odometry using Ultra Low Resolution Thermal Cameras
Authors:
Ali Safa
Abstract:
This letter provides what is, to the best of our knowledge, a first study on the applicability of ultra-low-resolution thermal cameras for providing rotational odometry measurements to navigational devices such as rovers and drones. Our use of an ultra-low-resolution thermal camera instead of other modalities such as an RGB camera is motivated by its robustness to lighting conditions, while being…
▽ More
This letter provides what is, to the best of our knowledge, a first study on the applicability of ultra-low-resolution thermal cameras for providing rotational odometry measurements to navigational devices such as rovers and drones. Our use of an ultra-low-resolution thermal camera instead of other modalities such as an RGB camera is motivated by its robustness to lighting conditions, while being one order of magnitude less cost-expensive compared to higher-resolution thermal cameras. After setting up a custom data acquisition system and acquiring thermal camera data together with its associated rotational speed label, we train a small 4-layer Convolutional Neural Network (CNN) for regressing the rotational speed from the thermal data. Experiments and ablation studies are conducted for determining the impact of thermal camera resolution and the number of successive frames on the CNN estimation precision. Finally, our novel dataset for the study of low-resolution thermal odometry is openly released with the hope of benefiting future research.
△ Less
Submitted 2 November, 2024;
originally announced November 2024.
-
A Systematic Survey on Instructional Text: From Representation Formats to Downstream NLP Tasks
Authors:
Abdulfattah Safa,
Tamta Kapanadze,
Arda Uzunoğlu,
Gözde Gül Şahin
Abstract:
Recent advances in large language models have demonstrated promising capabilities in following simple instructions through instruction tuning. However, real-world tasks often involve complex, multi-step instructions that remain challenging for current NLP systems. Despite growing interest in this area, there lacks a comprehensive survey that systematically analyzes the landscape of complex instruc…
▽ More
Recent advances in large language models have demonstrated promising capabilities in following simple instructions through instruction tuning. However, real-world tasks often involve complex, multi-step instructions that remain challenging for current NLP systems. Despite growing interest in this area, there lacks a comprehensive survey that systematically analyzes the landscape of complex instruction understanding and processing. Through a systematic review of the literature, we analyze available resources, representation schemes, and downstream tasks related to instructional text. Our study examines 177 papers, identifying trends, challenges, and opportunities in this emerging field. We provide AI/NLP researchers with essential background knowledge and a unified view of various approaches to complex instruction understanding, bridging gaps between different research directions and highlighting future research opportunities.
△ Less
Submitted 30 October, 2024; v1 submitted 24 October, 2024;
originally announced October 2024.
-
A Zero-Shot Open-Vocabulary Pipeline for Dialogue Understanding
Authors:
Abdulfattah Safa,
Gözde Gül Şahin
Abstract:
Dialogue State Tracking (DST) is crucial for understanding user needs and executing appropriate system actions in task-oriented dialogues. Majority of existing DST methods are designed to work within predefined ontologies and assume the availability of gold domain labels, struggling with adapting to new slots values. While Large Language Models (LLMs)-based systems show promising zero-shot DST per…
▽ More
Dialogue State Tracking (DST) is crucial for understanding user needs and executing appropriate system actions in task-oriented dialogues. Majority of existing DST methods are designed to work within predefined ontologies and assume the availability of gold domain labels, struggling with adapting to new slots values. While Large Language Models (LLMs)-based systems show promising zero-shot DST performance, they either require extensive computational resources or they underperform existing fully-trained systems, limiting their practicality. To address these limitations, we propose a zero-shot, open-vocabulary system that integrates domain classification and DST in a single pipeline. Our approach includes reformulating DST as a question-answering task for less capable models and employing self-refining prompts for more adaptable ones. Our system does not rely on fixed slot values defined in the ontology allowing the system to adapt dynamically. We compare our approach with existing SOTA, and show that it provides up to 20% better Joint Goal Accuracy (JGA) over previous methods on datasets like Multi-WOZ 2.1, with up to 90% fewer requests to the LLM API.
△ Less
Submitted 7 March, 2025; v1 submitted 24 September, 2024;
originally announced September 2024.
-
Continual Learning with Hebbian Plasticity in Sparse and Predictive Coding Networks: A Survey and Perspective
Authors:
Ali Safa
Abstract:
Recently, the use of bio-inspired learning techniques such as Hebbian learning and its closely-related Spike-Timing-Dependent Plasticity (STDP) variant have drawn significant attention for the design of compute-efficient AI systems that can continuously learn on-line at the edge. A key differentiating factor regarding this emerging class of neuromorphic continual learning system lies in the fact t…
▽ More
Recently, the use of bio-inspired learning techniques such as Hebbian learning and its closely-related Spike-Timing-Dependent Plasticity (STDP) variant have drawn significant attention for the design of compute-efficient AI systems that can continuously learn on-line at the edge. A key differentiating factor regarding this emerging class of neuromorphic continual learning system lies in the fact that learning must be carried using a data stream received in its natural order, as opposed to conventional gradient-based offline training, where a static training dataset is assumed available a priori and randomly shuffled to make the training set independent and identically distributed (i.i.d). In contrast, the emerging class of neuromorphic continual learning systems covered in this survey must learn to integrate new information on the fly in a non-i.i.d manner, which makes these systems subject to catastrophic forgetting. In order to build the next generation of neuromorphic AI systems that can continuously learn at the edge, a growing number of research groups are studying the use of Sparse and Predictive Coding-based Hebbian neural network architectures and the related Spiking Neural Networks (SNNs) equipped with STDP learning. However, since this research field is still emerging, there is a need for providing a holistic view of the different approaches proposed in the literature so far. To this end, this survey covers a number of recent works in the field of neuromorphic continual learning based on state-of-the-art Sparse and Predictive Coding technology; provides background theory to help interested researchers quickly learn the key concepts; and discusses important future research questions in light of the different works covered in this paper. It is hoped that this survey will contribute towards future research in the field of neuromorphic continual learning.
△ Less
Submitted 17 November, 2024; v1 submitted 24 July, 2024;
originally announced July 2024.
-
PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset
Authors:
Arda Uzunoglu,
Abdalfatah Rashid Safa,
Gözde Gül Şahin
Abstract:
Recently, there has been growing interest within the community regarding whether large language models are capable of planning or executing plans. However, most prior studies use LLMs to generate high-level plans for simplified scenarios lacking linguistic complexity and domain diversity, limiting analysis of their planning abilities. These setups constrain evaluation methods (e.g., predefined act…
▽ More
Recently, there has been growing interest within the community regarding whether large language models are capable of planning or executing plans. However, most prior studies use LLMs to generate high-level plans for simplified scenarios lacking linguistic complexity and domain diversity, limiting analysis of their planning abilities. These setups constrain evaluation methods (e.g., predefined action space), architectural choices (e.g., only generative models), and overlook the linguistic nuances essential for realistic analysis. To tackle this, we present PARADISE, an abductive reasoning task using Q\&A format on practical procedural text sourced from wikiHow. It involves warning and tip inference tasks directly associated with goals, excluding intermediary steps, with the aim of testing the ability of the models to infer implicit knowledge of the plan solely from the given goal. Our experiments, utilizing fine-tuned language models and zero-shot prompting, reveal the effectiveness of task-specific small models over large language models in most scenarios. Despite advancements, all models fall short of human performance. Notably, our analysis uncovers intriguing insights, such as variations in model behavior with dropped keywords, struggles of BERT-family and GPT-4 with physical and abstract goals, and the proposed tasks offering valuable prior knowledge for other unseen procedural tasks. The PARADISE dataset and associated resources are publicly available for further research exploration with https://github.com/GGLAB-KU/paradise.
△ Less
Submitted 6 June, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
Towards Chip-in-the-loop Spiking Neural Network Training via Metropolis-Hastings Sampling
Authors:
Ali Safa,
Vikrant Jaltare,
Samira Sebt,
Kameron Gano,
Johannes Leugering,
Georges Gielen,
Gert Cauwenberghs
Abstract:
This paper studies the use of Metropolis-Hastings sampling for training Spiking Neural Network (SNN) hardware subject to strong unknown non-idealities, and compares the proposed approach to the common use of the backpropagation of error (backprop) algorithm and surrogate gradients, widely used to train SNNs in literature. Simulations are conducted within a chip-in-the-loop training context, where…
▽ More
This paper studies the use of Metropolis-Hastings sampling for training Spiking Neural Network (SNN) hardware subject to strong unknown non-idealities, and compares the proposed approach to the common use of the backpropagation of error (backprop) algorithm and surrogate gradients, widely used to train SNNs in literature. Simulations are conducted within a chip-in-the-loop training context, where an SNN subject to unknown distortion must be trained to detect cancer from measurements, within a biomedical application context. Our results show that the proposed approach strongly outperforms the use of backprop by up to $27\%$ higher accuracy when subject to strong hardware non-idealities. Furthermore, our results also show that the proposed approach outperforms backprop in terms of SNN generalization, needing $>10 \times$ less training data for achieving effective accuracy. These findings make the proposed training approach well-suited for SNN implementations in analog subthreshold circuits and other emerging technologies where unknown hardware non-idealities can jeopardize backprop.
△ Less
Submitted 9 February, 2024;
originally announced February 2024.
-
Resource-Efficient Gesture Recognition using Low-Resolution Thermal Camera via Spiking Neural Networks and Sparse Segmentation
Authors:
Ali Safa,
Wout Mommen,
Lars Keuninckx
Abstract:
This work proposes a novel approach for hand gesture recognition using an inexpensive, low-resolution (24 x 32) thermal sensor processed by a Spiking Neural Network (SNN) followed by Sparse Segmentation and feature-based gesture classification via Robust Principal Component Analysis (R-PCA). Compared to the use of standard RGB cameras, the proposed system is insensitive to lighting variations whil…
▽ More
This work proposes a novel approach for hand gesture recognition using an inexpensive, low-resolution (24 x 32) thermal sensor processed by a Spiking Neural Network (SNN) followed by Sparse Segmentation and feature-based gesture classification via Robust Principal Component Analysis (R-PCA). Compared to the use of standard RGB cameras, the proposed system is insensitive to lighting variations while being significantly less expensive compared to high-frequency radars, time-of-flight cameras and high-resolution thermal sensors previously used in literature. Crucially, this paper shows that the innovative use of the recently proposed Monostable Multivibrator (MMV) neural networks as a new class of SNN achieves more than one order of magnitude smaller memory and compute complexity compared to deep learning approaches, while reaching a top gesture recognition accuracy of 93.9% using a 5-class thermal camera dataset acquired in a car cabin, within an automotive context. Our dataset is released for helping future research.
△ Less
Submitted 12 January, 2024;
originally announced January 2024.
-
Active Inference in Hebbian Learning Networks
Authors:
Ali Safa,
Tim Verbelen,
Lars Keuninckx,
Ilja Ocket,
André Bourdoux,
Francky Catthoor,
Georges Gielen,
Gert Cauwenberghs
Abstract:
This work studies how brain-inspired neural ensembles equipped with local Hebbian plasticity can perform active inference (AIF) in order to control dynamical agents. A generative model capturing the environment dynamics is learned by a network composed of two distinct Hebbian ensembles: a posterior network, which infers latent states given the observations, and a state transition network, which pr…
▽ More
This work studies how brain-inspired neural ensembles equipped with local Hebbian plasticity can perform active inference (AIF) in order to control dynamical agents. A generative model capturing the environment dynamics is learned by a network composed of two distinct Hebbian ensembles: a posterior network, which infers latent states given the observations, and a state transition network, which predicts the next expected latent state given current state-action pairs. Experimental studies are conducted using the Mountain Car environment from the OpenAI gym suite, to study the effect of the various Hebbian network parameters on the task performance. It is shown that the proposed Hebbian AIF approach outperforms the use of Q-learning, while not requiring any replay buffer, as in typical reinforcement learning systems. These results motivate further investigations of Hebbian learning for the design of AIF networks that can learn environment dynamics without the need for revisiting past buffered experiences.
△ Less
Submitted 22 June, 2023; v1 submitted 8 June, 2023;
originally announced June 2023.
-
Open the box of digital neuromorphic processor: Towards effective algorithm-hardware co-design
Authors:
Guangzhi Tang,
Ali Safa,
Kevin Shidqi,
Paul Detterer,
Stefano Traferro,
Mario Konijnenburg,
Manolis Sifalakis,
Gert-Jan van Schaik,
Amirreza Yousefzadeh
Abstract:
Sparse and event-driven spiking neural network (SNN) algorithms are the ideal candidate solution for energy-efficient edge computing. Yet, with the growing complexity of SNN algorithms, it isn't easy to properly benchmark and optimize their computational cost without hardware in the loop. Although digital neuromorphic processors have been widely adopted to benchmark SNN algorithms, their black-box…
▽ More
Sparse and event-driven spiking neural network (SNN) algorithms are the ideal candidate solution for energy-efficient edge computing. Yet, with the growing complexity of SNN algorithms, it isn't easy to properly benchmark and optimize their computational cost without hardware in the loop. Although digital neuromorphic processors have been widely adopted to benchmark SNN algorithms, their black-box nature is problematic for algorithm-hardware co-optimization. In this work, we open the black box of the digital neuromorphic processor for algorithm designers by presenting the neuron processing instruction set and detailed energy consumption of the SENeCA neuromorphic architecture. For convenient benchmarking and optimization, we provide the energy cost of the essential neuromorphic components in SENeCA, including neuron models and learning rules. Moreover, we exploit the SENeCA's hierarchical memory and exhibit an advantage over existing neuromorphic processors. We show the energy efficiency of SNN algorithms for video processing and online learning, and demonstrate the potential of our work for optimizing algorithm designs. Overall, we present a practical approach to enable algorithm designers to accurately benchmark SNN algorithms and pave the way towards effective algorithm-hardware co-design.
△ Less
Submitted 27 March, 2023;
originally announced March 2023.
-
FMCW Radar Sensing for Indoor Drones Using Learned Representations
Authors:
Ali Safa,
Tim Verbelen,
Ozan Catal,
Toon Van de Maele,
Matthias Hartmann,
Bart Dhoedt,
André Bourdoux
Abstract:
Frequency-modulated continuous-wave (FMCW) radar is a promising sensor technology for indoor drones as it provides range, angular as well as Doppler-velocity information about obstacles in the environment. Recently, deep learning approaches have been proposed for processing FMCW data, outperforming traditional detection techniques on range-Doppler or range-azimuth maps. However, these techniques c…
▽ More
Frequency-modulated continuous-wave (FMCW) radar is a promising sensor technology for indoor drones as it provides range, angular as well as Doppler-velocity information about obstacles in the environment. Recently, deep learning approaches have been proposed for processing FMCW data, outperforming traditional detection techniques on range-Doppler or range-azimuth maps. However, these techniques come at a cost; for each novel task a deep neural network architecture has to be trained on high-dimensional input data, stressing both data bandwidth and processing budget. In this paper, we investigate unsupervised learning techniques that generate low-dimensional representations from FMCW radar data, and evaluate to what extent these representations can be reused for multiple downstream tasks. To this end, we introduce a novel dataset of raw radar ADC data recorded from a radar mounted on a flying drone platform in an indoor environment, together with ground truth detection targets. We show with real radar data that, utilizing our learned representations, we match the performance of conventional radar processing techniques and that our model can be trained on different input modalities such as raw ADC samples of only two consecutively transmitted chirps.
△ Less
Submitted 6 January, 2023;
originally announced January 2023.
-
Fusing Event-based Camera and Radar for SLAM Using Spiking Neural Networks with Continual STDP Learning
Authors:
Ali Safa,
Tim Verbelen,
Ilja Ocket,
André Bourdoux,
Hichem Sahli,
Francky Catthoor,
Georges Gielen
Abstract:
This work proposes a first-of-its-kind SLAM architecture fusing an event-based camera and a Frequency Modulated Continuous Wave (FMCW) radar for drone navigation. Each sensor is processed by a bio-inspired Spiking Neural Network (SNN) with continual Spike-Timing-Dependent Plasticity (STDP) learning, as observed in the brain. In contrast to most learning-based SLAM systems%, which a) require the ac…
▽ More
This work proposes a first-of-its-kind SLAM architecture fusing an event-based camera and a Frequency Modulated Continuous Wave (FMCW) radar for drone navigation. Each sensor is processed by a bio-inspired Spiking Neural Network (SNN) with continual Spike-Timing-Dependent Plasticity (STDP) learning, as observed in the brain. In contrast to most learning-based SLAM systems%, which a) require the acquisition of a representative dataset of the environment in which navigation must be performed and b) require an off-line training phase, our method does not require any offline training phase, but rather the SNN continuously learns features from the input data on the fly via STDP. At the same time, the SNN outputs are used as feature descriptors for loop closure detection and map correction. We conduct numerous experiments to benchmark our system against state-of-the-art RGB methods and we demonstrate the robustness of our DVS-Radar SLAM approach under strong lighting variations.
△ Less
Submitted 9 October, 2022;
originally announced October 2022.
-
Learning to SLAM on the Fly in Unknown Environments: A Continual Learning Approach for Drones in Visually Ambiguous Scenes
Authors:
Ali Safa,
Tim Verbelen,
Ilja Ocket,
André Bourdoux,
Hichem Sahli,
Francky Catthoor,
Georges Gielen
Abstract:
Learning to safely navigate in unknown environments is an important task for autonomous drones used in surveillance and rescue operations. In recent years, a number of learning-based Simultaneous Localisation and Mapping (SLAM) systems relying on deep neural networks (DNNs) have been proposed for applications where conventional feature descriptors do not perform well. However, such learning-based…
▽ More
Learning to safely navigate in unknown environments is an important task for autonomous drones used in surveillance and rescue operations. In recent years, a number of learning-based Simultaneous Localisation and Mapping (SLAM) systems relying on deep neural networks (DNNs) have been proposed for applications where conventional feature descriptors do not perform well. However, such learning-based SLAM systems rely on DNN feature encoders trained offline in typical deep learning settings. This makes them less suited for drones deployed in environments unseen during training, where continual adaptation is paramount. In this paper, we present a new method for learning to SLAM on the fly in unknown environments, by modulating a low-complexity Dictionary Learning and Sparse Coding (DLSC) pipeline with a newly proposed Quadratic Bayesian Surprise (QBS) factor. We experimentally validate our approach with data collected by a drone in a challenging warehouse scenario, where the high number of ambiguous scenes makes visual disambiguation hard.
△ Less
Submitted 27 August, 2022;
originally announced August 2022.
-
Continuously Learning to Detect People on the Fly: A Bio-inspired Visual System for Drones
Authors:
Ali Safa,
Ilja Ocket,
André Bourdoux,
Hichem Sahli,
Francky Catthoor,
Georges Gielen
Abstract:
This paper demonstrates for the first time that a biologically-plausible spiking neural network (SNN) equipped with Spike-Timing-Dependent Plasticity (STDP) can continuously learn to detect walking people on the fly using retina-inspired, event-based cameras. Our pipeline works as follows. First, a short sequence of event data ($<2$ minutes), capturing a walking human by a flying drone, is forward…
▽ More
This paper demonstrates for the first time that a biologically-plausible spiking neural network (SNN) equipped with Spike-Timing-Dependent Plasticity (STDP) can continuously learn to detect walking people on the fly using retina-inspired, event-based cameras. Our pipeline works as follows. First, a short sequence of event data ($<2$ minutes), capturing a walking human by a flying drone, is forwarded to a convolutional SNNSTDP system which also receives teacher spiking signals from a readout (forming a semi-supervised system). Then, STDP adaptation is stopped and the learned system is assessed on testing sequences. We conduct several experiments to study the effect of key parameters in our system and to compare it against conventionally-trained CNNs. We show that our system reaches a higher peak $F_1$ score (+19%) compared to CNNs with event-based camera frames, while enabling on-line adaptation.
△ Less
Submitted 20 February, 2022; v1 submitted 16 February, 2022;
originally announced February 2022.
-
A New Look at Spike-Timing-Dependent Plasticity Networks for Spatio-Temporal Feature Learning
Authors:
Ali Safa,
Ilja Ocket,
André Bourdoux,
Hichem Sahli,
Francky Catthoor,
Georges Gielen
Abstract:
We present new theoretical foundations for unsupervised Spike-Timing-Dependent Plasticity (STDP) learning in spiking neural networks (SNNs). In contrast to empirical parameter search used in most previous works, we provide novel theoretical grounds for SNN and STDP parameter tuning which considerably reduces design time. Using our generic framework, we propose a class of global, action-based and c…
▽ More
We present new theoretical foundations for unsupervised Spike-Timing-Dependent Plasticity (STDP) learning in spiking neural networks (SNNs). In contrast to empirical parameter search used in most previous works, we provide novel theoretical grounds for SNN and STDP parameter tuning which considerably reduces design time. Using our generic framework, we propose a class of global, action-based and convolutional SNN-STDP architectures for learning spatio-temporal features from event-based cameras. We assess our methods on the N-MNIST, the CIFAR10-DVS and the IBM DVS128 Gesture datasets, all acquired with a real-world event camera. Using our framework, we report significant improvements in classification accuracy compared to both conventional state-of-the-art event-based feature descriptors (+8.2% on CIFAR10-DVS), and compared to state-of-the-art STDP-based systems (+9.3% on N-MNIST, +7.74% on IBM DVS128 Gesture). Our work contributes to both ultra-low-power learning in neuromorphic edge devices, and towards a biologically-plausible, optimization-based theory of cortical vision.
△ Less
Submitted 22 February, 2022; v1 submitted 1 November, 2021;
originally announced November 2021.
-
Fail-Safe Human Detection for Drones Using a Multi-Modal Curriculum Learning Approach
Authors:
Ali Safa,
Tim Verbelen,
Ilja Ocket,
André Bourdoux,
Francky Catthoor,
Georges G. E. Gielen
Abstract:
Drones are currently being explored for safety-critical applications where human agents are expected to evolve in their vicinity. In such applications, robust people avoidance must be provided by fusing a number of sensing modalities in order to avoid collisions. Currently however, people detection systems used on drones are solely based on standard cameras besides an emerging number of works disc…
▽ More
Drones are currently being explored for safety-critical applications where human agents are expected to evolve in their vicinity. In such applications, robust people avoidance must be provided by fusing a number of sensing modalities in order to avoid collisions. Currently however, people detection systems used on drones are solely based on standard cameras besides an emerging number of works discussing the fusion of imaging and event-based cameras. On the other hand, radar-based systems provide up-most robustness towards environmental conditions but do not provide complete information on their own and have mainly been investigated in automotive contexts, not for drones. In order to enable the fusion of radars with both event-based and standard cameras, we present KUL-UAVSAFE, a first-of-its-kind dataset for the study of safety-critical people detection by drones. In addition, we propose a baseline CNN architecture with cross-fusion highways and introduce a curriculum learning strategy for multi-modal data termed SAUL, which greatly enhances the robustness of the system towards hard RGB failures and provides a significant gain of 15% in peak F1 score compared to the use of BlackIn, previously proposed for cross-fusion networks. We demonstrate the real-time performance and feasibility of the approach by implementing the system in an edge-computing unit. We release our dataset and additional material in the project home page.
△ Less
Submitted 28 September, 2021;
originally announced September 2021.
-
A Low-Complexity Radar Detector Outperforming OS-CFAR for Indoor Drone Obstacle Avoidance
Authors:
Ali Safa,
Tim Verbelen,
Lars Keuninckx,
Ilja Ocket,
Matthias Hartmann,
André Bourdoux,
Franky Catthoor,
Georges Gielen
Abstract:
As radar sensors are being miniaturized, there is a growing interest for using them in indoor sensing applications such as indoor drone obstacle avoidance. In those novel scenarios, radars must perform well in dense scenes with a large number of neighboring scatterers. Central to radar performance is the detection algorithm used to separate targets from the background noise and clutter. Traditiona…
▽ More
As radar sensors are being miniaturized, there is a growing interest for using them in indoor sensing applications such as indoor drone obstacle avoidance. In those novel scenarios, radars must perform well in dense scenes with a large number of neighboring scatterers. Central to radar performance is the detection algorithm used to separate targets from the background noise and clutter. Traditionally, most radar systems use conventional CFAR detectors but their performance degrades in indoor scenarios with many reflectors. Inspired by the advances in non-linear target detection, we propose a novel high-performance, yet low-complexity target detector and we experimentally validate our algorithm on a dataset acquired using a radar mounted on a drone. We experimentally show that our proposed algorithm drastically outperforms OS-CFAR (standard detector used in automotive systems) for our specific task of indoor drone navigation with more than 19% higher probability of detection for a given probability of false alarm. We also benchmark our proposed detector against a number of recently proposed multi-target CFAR detectors and show an improvement of 16% in probability of detection compared to CHA-CFAR, with even larger improvements compared to both OR-CFAR and TS-LNCFAR in our particular indoor scenario. To the best of our knowledge, this work improves the state of the art for high-performance yet low-complexity radar detection in critical indoor sensing applications.
△ Less
Submitted 15 July, 2021;
originally announced July 2021.