-
A Large Dataset of Spontaneous Speech with the Accent Spoken in São Paulo for Automatic Speech Recognition Evaluation
Authors:
Rodrigo Lima,
Sidney Evaldo Leal,
Arnaldo Candido Junior,
Sandra Maria Aluísio
Abstract:
We present a freely available spontaneous speech corpus for the Brazilian Portuguese language and report preliminary automatic speech recognition (ASR) results, using both the Wav2Vec2-XLSR-53 and Distil-Whisper models fine-tuned and trained on our corpus. The NURC-SP Audio Corpus comprises 401 different speakers (204 females, 197 males) with a total of 239.30 hours of transcribed audio recordings. To the best of our knowledge, this is the first large Paulistano-accented spontaneous speech corpus dedicated to the ASR task in Portuguese. We first present the design and development procedures of the NURC-SP Audio Corpus, and then describe four ASR experiments in detail. The experiments demonstrated promising results for the applicability of the corpus to ASR. Specifically, we fine-tuned two versions of the Wav2Vec2-XLSR-53 model, trained a Distil-Whisper model using our dataset with labels determined by the Whisper Large-V3 model, and fine-tuned this Distil-Whisper model with our corpus. Our best results were obtained by the Distil-Whisper model fine-tuned on the NURC-SP Audio Corpus, with a WER of 24.22%, followed by a fine-tuned version of the Wav2Vec2-XLSR-53 model with a WER of 33.73%, which is almost 10 percentage points worse than Distil-Whisper's. To enable experiment reproducibility, we share the NURC-SP Audio Corpus dataset, pre-trained models, and training recipes in Hugging Face and GitHub repositories.
Submitted 10 September, 2024;
originally announced September 2024.
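The abstract above reports results as word error rate (WER). As a minimal sketch of how that metric is commonly defined (word-level edit distance divided by the number of reference words), the Python below may help. It is an illustration only, not the authors' evaluation script: the function name `word_error_rate` is hypothetical, and text normalization (casing, punctuation) is assumed to have been done beforehand.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Gap between the two WER figures quoted in the abstract, in percentage points.
gap_points = 33.73 - 24.22
```

Because the gap is a difference of two percentages, it is expressed in percentage points rather than as a relative change.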
-
Certified Vision-based State Estimation for Autonomous Landing Systems using Reachability Analysis
Authors:
Ulices Santa Cruz Leal,
Yasser Shoukry
Abstract:
This paper studies the problem of designing a certified vision-based state estimator for autonomous landing systems. In such a system, a neural network (NN) processes images from a camera to estimate the aircraft's relative position with respect to the runway. We propose an algorithm to design such NNs with certified properties in terms of their ability to detect runways and provide accurate state estimation. At the heart of our approach is the use of geometric models of perspective cameras to obtain a mathematical model that captures the relation between the aircraft states and the inputs. We show that such geometric models enjoy mixed monotonicity properties that can be used to design state estimators with certifiable error bounds. We show the effectiveness of the proposed approach using an experimental testbed on data collected from event-based cameras.
Submitted 10 September, 2023;
originally announced September 2023.
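To illustrate the general idea of mixed monotonicity mentioned in the abstract above, the sketch below bounds a one-dimensional pinhole projection u = focal * x / z over a box of states. This is an assumed toy model, not the paper's camera model or algorithm: the names `decomposition` and `pixel_bounds` are hypothetical, and the bounds hold only under the stated assumption that x is non-negative and z is strictly positive, so that u is non-decreasing in x and non-increasing in z.

```python
def project(focal: float, x: float, z: float) -> float:
    """Toy pinhole projection of a lateral offset x at depth z."""
    return focal * x / z


def decomposition(focal: float, x: float, z_hat: float) -> float:
    """Decomposition function: non-decreasing in x, non-increasing in z_hat.

    Evaluating it with z_hat == z recovers project(focal, x, z).
    """
    return focal * x / z_hat


def pixel_bounds(focal, x_lo, x_hi, z_lo, z_hi):
    """Bound the projection over the box [x_lo, x_hi] x [z_lo, z_hi]."""
    # Assumption: the monotonicity argument needs x >= 0, z > 0, focal > 0.
    if not (focal > 0 and 0 <= x_lo <= x_hi and 0 < z_lo <= z_hi):
        raise ValueError("requires focal > 0, 0 <= x_lo <= x_hi, 0 < z_lo <= z_hi")
    lower = decomposition(focal, x_lo, z_hi)  # smallest x, largest depth
    upper = decomposition(focal, x_hi, z_lo)  # largest x, smallest depth
    return lower, upper
```

Only two function evaluations are needed to bound the output over the whole box, which is what makes this kind of structure attractive for certifying error bounds.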
-
NILC-Metrix: assessing the complexity of written and spoken language in Brazilian Portuguese
Authors:
Sidney Evaldo Leal,
Magali Sanches Duran,
Carolina Evaristo Scarton,
Nathan Siegle Hartmann,
Sandra Maria Aluísio
Abstract:
This paper presents and makes publicly available NILC-Metrix, a computational system comprising 200 metrics proposed in studies on discourse, psycholinguistics, cognitive and computational linguistics, to assess textual complexity in Brazilian Portuguese (BP). These metrics are relevant for descriptive analysis and the creation of computational models and can be used to extract information from various linguistic levels of written and spoken language. The metrics in NILC-Metrix were developed over the last 13 years, starting in 2008 with Coh-Metrix-Port, a tool developed within the scope of the PorSimples project. Coh-Metrix-Port adapted to BP some metrics from the Coh-Metrix tool, which computes metrics related to cohesion and coherence of texts in English. After the end of PorSimples in 2010, new metrics were added to the initial 48 metrics of Coh-Metrix-Port. Given the large number of metrics, we present them following an organisation similar to the metrics of Coh-Metrix v3.0 to facilitate comparisons between metrics in Portuguese and English. In this paper, we illustrate the potential of NILC-Metrix by presenting three applications: (i) a descriptive analysis of the differences between children's film subtitles and texts written for Elementary School I and II (Final Years); (ii) a new predictor of textual complexity for the corpus of original and simplified texts of the PorSimples project; (iii) a complexity prediction model for school grades, using transcripts of children's story narratives told by teenagers. For each application, we evaluate which groups of metrics are more discriminative, showing their contribution to each task.
Submitted 17 December, 2021;
originally announced January 2022.
-
Unity Perception: Generate Synthetic Data for Computer Vision
Authors:
Steve Borkman,
Adam Crespi,
Saurav Dhakad,
Sujoy Ganguly,
Jonathan Hogins,
You-Cyuan Jhang,
Mohsen Kamalzadeh,
Bowen Li,
Steven Leal,
Pete Parisi,
Cesar Romero,
Wesley Smith,
Alex Thaman,
Samuel Warren,
Nupur Yadav
Abstract:
We introduce the Unity Perception package, which aims to simplify and accelerate the process of generating synthetic datasets for computer vision tasks by offering an easy-to-use and highly customizable toolset. This open-source package extends the Unity Editor and engine components to generate perfectly annotated examples for several common computer vision tasks. Additionally, it offers an extensible Randomization framework that lets the user quickly construct and configure randomized simulation parameters in order to introduce variation into the generated datasets. We give an overview of these tools and how they work, and demonstrate the value of the generated synthetic datasets by training a 2D object detection model. The model trained with mostly synthetic data outperforms the model trained using only real data.
Submitted 19 July, 2021; v1 submitted 9 July, 2021;
originally announced July 2021.
-
Unveiling the research landscape of Sustainable Development Goals and their inclusion in Higher Education Institutions and Research Centers: major trends in 2000-2017
Authors:
Nuria Bautista-Puig,
Ana Marta Aleixo,
Susana Leal,
Ulisses Azeiteiro,
Rodrigo Costas
Abstract:
The Sustainable Development Goals (SDGs) are the blueprint to achieve a better and more sustainable future for society. Their legacy is linked with the Millennium Development Goals, set up in 2000. A bibliometric analysis was conducted to 1) measure "core" research output from 2000-2017, with the aim of mapping global research on the sustainability goals, 2) describe thematic specialization based on keyword co-occurrence analysis and strongest citation bursts, and 3) present a methodology to classify scientific output (based on an ad-hoc glossary) and assess SDG interconnections.
Publications on the sustainability goals (core+expand based on direct citations) were identified in the in-house CWTS Web of Science database by using search terms in titles, abstracts, and keywords. 25,299 bibliographic records were analyzed, of which 21,653 (85.59%) are from higher education institutions (HEIs) and research centres (RCs). The purpose of this paper is to analyze the role of these organizations in sustainability research. The findings reveal the increasing participation of these organizations in this research (from 660 institutions in 2000-2005 to 1,744 institutions in 2012-2017). In terms of specialization, some institutions show both higher production and higher specialization on the topic (e.g., London School of Hygiene & Tropical Medicine and World Health Organization), whereas others show lower production but higher specialization (e.g., Stockholm Environment Institute). Regarding the topics, health (especially in developing countries), women, and socio-economic aspects are the most prominent ones. Moreover, the research output shows that some SDGs are interlinked (e.g., SDG11 and SDG3). This study provides important orientation for HEIs and RCs in terms of Research, Development and Innovation (R&D+i) to respond to major societal challenges and could be useful for policymakers in promoting the research agenda on this topic.
Submitted 12 February, 2020;
originally announced February 2020.