-
Deep Representation Learning for Open Vocabulary Electroencephalography-to-Text Decoding
Authors:
Hamza Amrani,
Daniela Micucci,
Paolo Napoletano
Abstract:
Previous research has demonstrated the potential of using pre-trained language models for decoding open vocabulary Electroencephalography (EEG) signals captured through a non-invasive Brain-Computer Interface (BCI). However, the impact of embedding EEG signals in the context of language models and the effect of subjectivity remain unexplored, leading to uncertainty about the best approach to enhance decoding performance. Additionally, current evaluation metrics used to assess decoding effectiveness are predominantly syntactic and do not provide insights into how comprehensible the decoded output is to human readers. We present an end-to-end deep learning framework for non-invasive brain recordings that brings modern representation learning approaches to neuroscience. Our proposal introduces the following innovations: 1) an end-to-end deep learning architecture for open vocabulary EEG decoding, incorporating a subject-dependent representation learning module for raw EEG encoding, a BART language model, and a GPT-4 sentence refinement module; 2) a more comprehensive sentence-level evaluation metric based on the BERTScore; 3) an ablation study that analyses the contributions of each module within our proposal, providing valuable insights for future research. We evaluate our approach on two publicly available datasets, ZuCo v1.0 and v2.0, comprising EEG recordings of 30 subjects engaged in natural reading tasks. Our model achieves a BLEU-1 score of 42.75%, a ROUGE-1-F of 33.28%, and a BERTScore-F of 53.86%, outperforming the previous state-of-the-art methods by 3.38%, 8.43%, and 6.31%, respectively.
Submitted 15 November, 2023;
originally announced December 2023.
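The abstract above reports BLEU-1 and ROUGE-1-F, both of which are built on unigram overlap between a decoded sentence and its reference. The following is a minimal sketch of how these two scores can be computed for a single sentence pair, assuming whitespace tokenization and one reference per candidate. It is not the authors' evaluation code, and the example sentences and function names are hypothetical.

```python
import math
from collections import Counter


def unigram_overlap(candidate, reference):
    """Clipped count of unigrams shared by candidate and reference."""
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    return sum(min(count, ref_counts[tok]) for tok, count in cand_counts.items())


def bleu1(candidate, reference):
    """BLEU-1: clipped unigram precision times the brevity penalty."""
    if not candidate:
        return 0.0
    precision = unigram_overlap(candidate, reference) / len(candidate)
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        # Candidates shorter than the reference are penalized.
        brevity_penalty = math.exp(1.0 - len(reference) / len(candidate))
    return brevity_penalty * precision


def rouge1_f(candidate, reference):
    """ROUGE-1 F-measure: harmonic mean of unigram precision and recall."""
    overlap = unigram_overlap(candidate, reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentence pair, for illustration only.
reference = "the film was a great success".split()
candidate = "the movie was a success".split()

print(f"BLEU-1:    {bleu1(candidate, reference):.4f}")
print(f"ROUGE-1-F: {rouge1_f(candidate, reference):.4f}")
```

Both scores reward only exact token matches ("movie" earns nothing against "film"), which illustrates why the abstract calls such metrics predominantly syntactic and adds an embedding-based score.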
-
Deep Learning Hyperspectral Pansharpening on large scale PRISMA dataset
Authors:
Simone Zini,
Mirko Paolo Barbato,
Flavio Piccoli,
Paolo Napoletano
Abstract:
In this work, we assess several deep learning strategies for hyperspectral pansharpening. First, we present a new dataset with a greater extent than any other in the state of the art. This dataset, collected using the ASI PRISMA satellite, covers about 262,200 km², and its heterogeneity is ensured by randomly sampling the Earth's soil. Second, we adapted several state-of-the-art deep learning approaches to fit PRISMA hyperspectral data and then assessed, quantitatively and qualitatively, their performance in this new scenario. The investigation included two settings: Reduced Resolution (RR), to evaluate the techniques in a supervised environment, and Full Resolution (FR), for a real-world evaluation. The main purpose is to evaluate the reconstruction fidelity of the considered methods. In both scenarios, for the sake of completeness, we also included machine-learning-free approaches. This extensive analysis shows that data-driven neural network methods outperform machine-learning-free approaches and adapt better to the task of hyperspectral pansharpening, in both the RR and FR protocols.
Submitted 28 July, 2023; v1 submitted 21 July, 2023;
originally announced July 2023.
-
Semi-supervised cross-lingual speech emotion recognition
Authors:
Mirko Agarla,
Simone Bianco,
Luigi Celona,
Paolo Napoletano,
Alexey Petrovsky,
Flavio Piccoli,
Raimondo Schettini,
Ivan Shanin
Abstract:
Performance in Speech Emotion Recognition (SER) on a single language has increased greatly in the last few years thanks to the use of deep learning techniques. However, cross-lingual SER remains a challenge in real-world applications due to two main factors: the first is the large gap between the source and target domain distributions; the second is that, for the new language, unlabeled utterances are far more widely available than labeled ones. Taking these aspects into account, we propose a Semi-Supervised Learning (SSL) method for cross-lingual emotion recognition when only a few labeled examples in the target domain (i.e. the new language) are available. Our method is based on a Transformer and adapts to the new domain by exploiting a pseudo-labeling strategy on the unlabeled utterances. In particular, the use of hard and soft pseudo-labels is investigated. We thoroughly evaluate the performance of the proposed method in a speaker-independent setup on both the source and the new language and show its robustness across five languages belonging to different language families. The experimental findings indicate that the unweighted accuracy is increased by an average of 40% compared to state-of-the-art methods.
Submitted 17 July, 2023; v1 submitted 14 July, 2022;
originally announced July 2022.
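The abstract above distinguishes hard from soft pseudo-labels without giving details. The sketch below shows one common way the two differ, assuming the model outputs a probability vector over emotion classes for each unlabeled utterance. The confidence threshold, the function names, and the example probabilities are all hypothetical and are not taken from the paper.

```python
import math


def hard_pseudo_label(probs, threshold=0.7):
    """One-hot target for the most likely class, or None when the
    prediction is below the (assumed) confidence threshold."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] < threshold:
        return None  # utterance is skipped in this training round
    return [1.0 if i == best else 0.0 for i in range(len(probs))]


def soft_pseudo_label(probs):
    """Use the (renormalized) predicted distribution itself as the target."""
    total = sum(probs)
    return [p / total for p in probs]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


# Hypothetical model outputs for two unlabeled utterances (3 emotion classes).
confident = [0.80, 0.15, 0.05]
uncertain = [0.40, 0.35, 0.25]

print(hard_pseudo_label(confident))   # one-hot target on class 0
print(hard_pseudo_label(uncertain))   # None: below the threshold
print(soft_pseudo_label(uncertain))   # full distribution kept as target
```

Under these assumptions, hard labels discard low-confidence utterances but give a sharp training signal, while soft labels keep every utterance and preserve the model's uncertainty.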
-
Homogenization of Existing Inertial-Based Datasets to Support Human Activity Recognition
Authors:
Hamza Amrani,
Daniela Micucci,
Marco Mobilio,
Paolo Napoletano
Abstract:
Several techniques have been proposed to address the problem of recognizing activities of daily living from signals. Deep learning techniques applied to inertial signals have proven to be effective, achieving significant classification accuracy. Recently, research on human activity recognition (HAR) has been almost entirely model-centric. It has been shown that the number of training samples and their quality are critical for obtaining deep learning models that perform well regardless of their architecture and that are more robust to intraclass variability and interclass similarity. Unfortunately, publicly available datasets do not always contain high-quality data or a sufficiently large and diverse number of samples (e.g., number of subjects, type of activity performed, and duration of trials). Furthermore, the datasets are heterogeneous with respect to one another and therefore cannot be trivially combined to obtain a larger set. The final aim of our work is the definition and implementation of a platform that integrates datasets of inertial signals in order to make available to the scientific community large datasets of homogeneous signals, enriched, when possible, with context information (e.g., characteristics of the subjects and device position). The main focus of our platform is data quality, which is essential for training efficient models.
Submitted 17 January, 2022;
originally announced January 2022.
-
Falls as anomalies? An experimental evaluation using smartphone accelerometer data
Authors:
Daniela Micucci,
Marco Mobilio,
Paolo Napoletano,
Francesco Tisato
Abstract:
Life expectancy keeps growing and, among elderly people, accidental falls occur frequently. A system able to promptly detect falls would help in reducing the injuries that a fall could cause. Such a system should meet the needs of the people for whom it is designed, so that it is actually used. In particular, the system should be minimally invasive and inexpensive. Because most smartphones embed accelerometers and powerful processing units, they are good candidates both as data acquisition devices and as platforms to host fall detection systems. For this reason, in recent years several fall detection methods have been tested on smartphone accelerometer data. Most of them have been tuned with simulated falls because, to date, datasets of real-world falls are not available. This article evaluates the effectiveness of methods that detect falls as anomalies. To this end, we compared traditional approaches with anomaly detectors. In particular, we experimented with the kNN and SVM methods in both one-class and two-class configurations. The comparison involved three different collections of accelerometer data and four different data representations. Empirical results demonstrated that, in most cases, examples of falls are not required to design an effective fall detector.
Submitted 30 October, 2015; v1 submitted 5 July, 2015;
originally announced July 2015.
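The idea of detecting falls as anomalies, as described in the abstract above, can be illustrated with a one-class k-nearest-neighbor detector that is fitted only on samples of normal activity. This is a minimal sketch under stated assumptions, not the authors' method: the two-dimensional feature vectors, the value of k, and the leave-one-out rule for choosing the threshold are all hypothetical choices made for illustration.

```python
import math


def kth_neighbor_distance(sample, training_set, k):
    """Euclidean distance from sample to its k-th nearest training point."""
    distances = sorted(math.dist(sample, other) for other in training_set)
    return distances[k - 1]


def fit_threshold(normal_data, k):
    """Largest leave-one-out k-NN distance among the normal samples.
    Using the maximum is an assumed rule, chosen for simplicity."""
    scores = []
    for i, sample in enumerate(normal_data):
        others = normal_data[:i] + normal_data[i + 1:]
        scores.append(kth_neighbor_distance(sample, others, k))
    return max(scores)


def is_anomaly(sample, normal_data, k, threshold):
    """Flag a sample as a possible fall if it lies far from normal activity."""
    return kth_neighbor_distance(sample, normal_data, k) > threshold


# Toy feature vectors for normal activity (hypothetical values).
normal_activity = [
    (1.00, 0.10),
    (1.10, 0.20),
    (0.90, 0.15),
    (1.05, 0.10),
    (0.95, 0.20),
]
k = 2
threshold = fit_threshold(normal_activity, k)

print(is_anomaly((1.00, 0.15), normal_activity, k, threshold))  # normal-like sample
print(is_anomaly((3.50, 2.00), normal_activity, k, threshold))  # fall-like sample
```

No fall examples are used at any point when fitting the detector, which is the property the abstract's final sentence refers to. A two-class configuration would instead require labeled falls in the training set.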