-
Cascaded Cross-Modal Transformer for Audio-Textual Classification
Authors:
Nicolae-Catalin Ristea,
Andrei Anghel,
Radu Tudor Ionescu
Abstract:
Speech classification tasks often require powerful language understanding models to grasp useful features, which becomes problematic when limited training data is available. To attain superior classification performance, we propose to harness the inherent value of multimodal representations by transcribing speech using automatic speech recognition (ASR) models and translating the transcripts into different languages via pretrained translation models. We thus obtain an audio-textual (multimodal) representation for each data sample. Subsequently, we combine language-specific Bidirectional Encoder Representations from Transformers (BERT) with Wav2Vec2.0 audio features via a novel cascaded cross-modal transformer (CCMT). Our model is based on two cascaded transformer blocks. The first one combines text-specific features from distinct languages, while the second one combines acoustic features with multilingual features previously learned by the first transformer block. We employed our system in the Requests Sub-Challenge of the ACM Multimedia 2023 Computational Paralinguistics Challenge. CCMT was declared the winning solution, obtaining an unweighted average recall (UAR) of 65.41% and 85.87% for complaint and request detection, respectively. Moreover, we applied our framework on the Speech Commands v2 and HarperValleyBank dialog data sets, surpassing previous studies reporting results on these benchmarks. Our code is freely available for download at: https://github.com/ristea/ccmt.
Submitted 24 July, 2024; v1 submitted 15 January, 2024;
originally announced January 2024.
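The abstract above reports results as unweighted average recall (UAR), i.e. the mean of the per-class recalls with every class counted equally. A minimal sketch of that metric follows; the function name and the toy labels are illustrative assumptions and are not taken from the challenge data or from the authors' code.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, with each class weighted equally."""
    hits = defaultdict(int)    # correctly predicted samples per true class
    totals = defaultdict(int)  # number of samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Made-up, imbalanced toy labels: four negatives, two positives.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On this toy input the per-class recalls are 3/4 and 1/2, so the UAR is 0.625 while plain accuracy is about 0.667, which illustrates why UAR is preferred when classes are imbalanced.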
-
Sea Ice Segmentation From SAR Data by Convolutional Transformer Networks
Authors:
Nicolae-Catalin Ristea,
Andrei Anghel,
Mihai Datcu
Abstract:
Sea ice is a crucial component of the Earth's climate system and is highly sensitive to changes in temperature and atmospheric conditions. Accurate and timely measurement of sea ice parameters is important for understanding and predicting the impacts of climate change. Nevertheless, the amount of satellite data acquired over ice areas is huge, making the subjective measurements ineffective. Therefore, automated algorithms must be used in order to fully exploit the continuous data feeds coming from satellites. In this paper, we present a novel approach for sea ice segmentation based on SAR satellite imagery using hybrid convolutional transformer (ConvTr) networks. We show that our approach outperforms classical convolutional networks, while being considerably more efficient than pure transformer models. ConvTr obtained a mean intersection over union (mIoU) of 63.68% on the AI4Arctic data set, assuming an inference time of 120 ms for a 400 km x 400 km product.
Submitted 13 June, 2023;
originally announced June 2023.
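The segmentation result above is stated as mean intersection over union (mIoU). As a rough sketch, the metric can be computed over flattened label maps as below; the function name, the toy label maps, and the choice to skip classes absent from both maps are assumptions made for illustration and may differ from the evaluation protocol used in the paper.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Average the per-class intersection-over-union across classes."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        inter = sum(1 for t, p in pairs if t == c and p == c)
        union = sum(1 for t, p in pairs if t == c or p == c)
        if union > 0:  # assumption: ignore classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Made-up flattened label maps with three classes.
truth = [0, 0, 1, 1, 1, 2]
pred = [0, 1, 1, 1, 2, 2]
miou = mean_iou(truth, pred, num_classes=3)
```

Here each of the three classes has an IoU of 0.5 (1/2, 2/4 and 1/2), so the mean is 0.5.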
-
Explainable, Physics Aware, Trustworthy AI Paradigm Shift for Synthetic Aperture Radar
Authors:
Mihai Datcu,
Zhongling Huang,
Andrei Anghel,
Juanping Zhao,
Remus Cacoveanu
Abstract:
The recognition or understanding of the scenes observed with a SAR system requires a broader range of cues, beyond the spatial context. These encompass but are not limited to: imaging geometry, imaging mode, properties of the Fourier spectrum of the images or the behavior of the polarimetric signatures. In this paper, we propose a change of paradigm for explainability in data science for the case of Synthetic Aperture Radar (SAR) data to ground the explainable AI for SAR. It aims to use explainable data transformations based on well-established models to generate inputs for AI methods, to provide knowledgeable feedback for the training process, and to learn or improve high-complexity unknown or un-formalized models from the data. First, we introduce a representation of the SAR system with physical layers: i) instrument and platform, ii) imaging formation, iii) scattering signatures and objects, that can be integrated with an AI model for hybrid modeling. Subsequently, some illustrative examples are presented to demonstrate how to achieve hybrid modeling for SAR image understanding. The perspective of trustworthy models and supplementary explanations is discussed later. Finally, we draw conclusions and argue that the proposed concept is applicable to the entire class of coherent imaging sensors and other computational imaging systems.
Submitted 9 January, 2023;
originally announced January 2023.
-
Automotive Radar Interference Mitigation with Unfolded Robust PCA based on Residual Overcomplete Auto-Encoder Blocks
Authors:
Nicolae-Cătălin Ristea,
Andrei Anghel,
Radu Tudor Ionescu,
Yonina C. Eldar
Abstract:
In autonomous driving, radar systems play an important role in detecting targets such as other vehicles on the road. Radars mounted on different cars can interfere with each other, degrading the detection performance. Deep learning methods for automotive radar interference mitigation can successfully estimate the amplitude of targets, but fail to recover the phase of the respective targets. In this paper, we propose an efficient and effective technique based on unfolded robust Principal Component Analysis (RPCA) that is able to estimate both amplitude and phase in the presence of interference. Our contribution consists in introducing residual overcomplete auto-encoder (ROC-AE) blocks into the recurrent architecture of unfolded RPCA, which results in a deeper model that significantly outperforms unfolded RPCA as well as other deep learning models.
Submitted 17 April, 2021; v1 submitted 14 October, 2020;
originally announced October 2020.
-
Estimating the Magnitude and Phase of Automotive Radar Signals under Multiple Interference Sources with Fully Convolutional Networks
Authors:
Nicolae-Cătălin Ristea,
Andrei Anghel,
Radu Tudor Ionescu
Abstract:
Radar sensors are gradually becoming widespread equipment for road vehicles, playing a crucial role in autonomous driving and road safety. The broad adoption of radar sensors increases the chance of interference among sensors from different vehicles, generating corrupted range profiles and range-Doppler maps. In order to extract distance and velocity of multiple targets from range-Doppler maps, the interference affecting each range profile needs to be mitigated. In this paper, we propose a fully convolutional neural network for automotive radar interference mitigation. In order to train our network in a real-world scenario, we introduce a new data set of realistic automotive radar signals with multiple targets and multiple interferers. To our knowledge, we are the first to apply weight pruning in the automotive radar domain, obtaining superior results compared to the widely-used dropout. While most previous works successfully estimated the magnitude of automotive radar signals, we propose a deep learning model that can accurately estimate the phase. For instance, our novel approach reduces the phase estimation error with respect to the commonly-adopted zeroing technique by half, from 12.55 degrees to 6.58 degrees. Considering the lack of databases for automotive radar interference mitigation, we release as open source our large-scale data set that closely replicates the real-world automotive scenario for multiple interference cases, allowing others to objectively compare their future work in this domain. Our data set is available for download at: http://github.com/ristea/arim-v2.
Submitted 6 November, 2021; v1 submitted 11 August, 2020;
originally announced August 2020.
-
Fully Convolutional Neural Networks for Automotive Radar Interference Mitigation
Authors:
Nicolae-Cătălin Ristea,
Andrei Anghel,
Radu Tudor Ionescu
Abstract:
The interest of the automotive industry has progressively focused on subjects related to driver assistance systems as well as autonomous cars. Cars combine a variety of sensors to perceive their surroundings robustly. Among them, radar sensors are indispensable because of their independence of lighting conditions and the possibility to directly measure velocity. However, radar interference is an issue that becomes prevalent with the increasing number of radar systems in automotive scenarios. In this paper, we address this issue for frequency modulated continuous wave (FMCW) radars with fully convolutional neural networks (FCNs), a state-of-the-art deep learning technique. We propose two FCNs that take spectrograms of the beat signals as input, and provide the corresponding clean range profiles as output. We propose two architectures for interference mitigation which outperform the classical zeroing technique. Moreover, considering the lack of databases for this task, we release as open source a large-scale data set that closely replicates real-world automotive scenarios for single-interference cases, allowing others to objectively compare their future work in this domain. The data set is available for download at: http://github.com/ristea/arim.
Submitted 21 July, 2020;
originally announced July 2020.
-
Pulse radar with FPGA range compression for real time displacement and vibration monitoring
Authors:
Mihai Tudose,
Andrei Anghel,
Remus Cacoveanu,
Mihai Datcu
Abstract:
This paper aims at presenting the basic functionality of a radar platform for real-time monitoring of displacement and vibration. The real-time capabilities make the radar platform useful when live monitoring of targets is required. The system is based on the RF analog front-end of a USRP, and the range compression (time-domain cross-correlation) is implemented on the FPGA included in the USRP. Further processing is performed on the host computer to plot real-time range profiles, displacements, vibration frequency spectra and spectrograms (waterfall plots) for long-term monitoring. The system is currently in experimental form and the present paper aims at proving its functionality. The precision of this system is estimated at 0.6 mm for displacement measurements and 1.8 mm for vibration amplitude measurements.
Submitted 14 November, 2018;
originally announced November 2018.
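The last abstract describes range compression as a time-domain cross-correlation between the received signal and the transmitted reference. The sketch below shows the idea in plain Python on real-valued samples; the 7-chip Barker code used as the pulse, the echo delay and amplitude, and all names are hypothetical choices for illustration, and an actual system of this kind would typically process complex baseband samples on the FPGA rather than Python lists.

```python
def cross_correlate(received, reference):
    """Slide the reference over the received samples and sum the products."""
    n, m = len(received), len(reference)
    return [
        sum(received[lag + k] * reference[k] for k in range(m))
        for lag in range(n - m + 1)
    ]


# Hypothetical transmitted pulse: 7-chip Barker code (illustrative only).
reference = [1, 1, 1, -1, -1, 1, -1]

# Simulated noise-free echo: the pulse attenuated to 0.5 and delayed by 5 samples.
delay = 5
received = [0.0] * 20
for k, chip in enumerate(reference):
    received[delay + k] += 0.5 * chip

# Compressed range profile; the strongest peak marks the echo delay.
profile = cross_correlate(received, reference)
peak_lag = max(range(len(profile)), key=lambda i: abs(profile[i]))
```

In this toy setup the correlation peak appears at lag 5, matching the simulated delay, and the delay in samples would map to range through the sampling rate and the propagation speed.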