-
Microelectrode Signal Dynamics as Biomarkers of Subthalamic Nucleus Entry on Deep Brain Stimulation: A Nonlinear Feature Approach
Authors:
Ana Luiza S. Tavares,
Artur Pedro M. Neto,
Francinaldo L. Gomes,
Paul Rodrigo dos Reis,
Arthur G. da Silva,
Antonio P. Junior,
Bruno D. Gomes
Abstract:
Accurate intraoperative localization of the subthalamic nucleus (STN) is essential for the efficacy of Deep Brain Stimulation (DBS) in patients with Parkinson's disease. While microelectrode recordings (MERs) provide rich electrophysiological information during DBS electrode implantation, current localization practices often rely on subjective interpretation of signal features. In this study, we propose a quantitative framework that leverages nonlinear dynamics and entropy-based metrics to classify neural activity recorded inside versus outside the STN. MER data from three patients were preprocessed using a robust artifact correction pipeline, segmented, and labelled based on surgical annotations. A comprehensive set of recurrence quantification analysis, nonlinear, and entropy features was extracted from each segment. Multiple supervised classifiers were trained on every combination of feature domains using stratified 10-fold cross-validation, followed by statistical comparison using paired Wilcoxon signed-rank tests with Holm-Bonferroni correction. The combination of entropy and nonlinear features yielded the highest discriminative power, and the Extra Trees classifier emerged as the best model with a cross-validated F1-score of 0.902 ± 0.027 and ROC AUC of 0.887 ± 0.055. Final evaluation on a 20% hold-out test set confirmed robust generalization (F1 = 0.922, ROC AUC = 0.941). These results highlight the potential of nonlinear and entropy signal descriptors in supporting real-time, data-driven decision-making during DBS surgeries.
Submitted 14 June, 2025;
originally announced June 2025.
-
UruBots Autonomous Cars Challenge Pro Team Description Paper for FIRA 2025
Authors:
Pablo Moraes,
Mónica Rodríguez,
Sebastian Barcelona,
Angel Da Silva,
Santiago Fernandez,
Hiago Sodre,
Igor Nunes,
Bruna Guterres,
Ricardo Grando
Abstract:
This paper describes the development of an autonomous car by the UruBots team for the 2025 FIRA Autonomous Cars Challenge (Pro). The project involves constructing a compact electric vehicle, approximately the size of an RC car, capable of autonomous navigation through different tracks. The design incorporates mechanical and electronic components and machine learning algorithms that enable the vehicle to make real-time navigation decisions based on visual input from a camera. We use deep learning models to process camera images and control vehicle movements. Using a dataset of over ten thousand images, we trained a Convolutional Neural Network (CNN) to drive the vehicle effectively, through two outputs, steering and throttle. The car completed the track in under 30 seconds, achieving a pace of approximately 0.4 meters per second while avoiding obstacles.
Submitted 8 June, 2025;
originally announced June 2025.
-
Speechless: Speech Instruction Training Without Speech for Low Resource Languages
Authors:
Alan Dao,
Dinh Bach Vu,
Huy Hoang Ha,
Tuan Le Duc Anh,
Shreyas Gopal,
Yue Heng Yeo,
Warren Keng Hoong Low,
Eng Siong Chng,
Jia Qi Yip
Abstract:
The rapid growth of voice assistants powered by large language models (LLMs) has highlighted a need for speech instruction data to train these systems. Despite the abundance of speech recognition data, there is a notable scarcity of speech instruction data, which is essential for fine-tuning models to understand and execute spoken commands. Generating high-quality synthetic speech requires a good text-to-speech (TTS) model, which may not be available for low-resource languages. Our novel approach addresses this challenge by halting synthesis at the semantic representation level, bypassing the need for TTS. We achieve this by aligning synthetic semantic representations with the pre-trained Whisper encoder, enabling an LLM to be fine-tuned on text instructions while maintaining the ability to understand spoken instructions during inference. This simplified training process is a promising approach to building voice assistants for low-resource languages.
Submitted 22 May, 2025;
originally announced May 2025.
-
On Word-of-Mouth and Private-Prior Sequential Social Learning
Authors:
Andrea Da Col,
Cristian R. Rojas,
Vikram Krishnamurthy
Abstract:
Social learning provides a fundamental framework in economics and social sciences for studying interactions among rational agents who observe each other's actions but lack direct access to individual beliefs. This paper investigates a specific social learning paradigm known as Word-of-Mouth (WoM), where a series of agents seeks to estimate the state of a dynamical system. The first agent receives noisy measurements of the state, while each subsequent agent relies solely on a degraded version of her predecessor's estimate. A defining feature of WoM is that the final agent's belief is publicly broadcast and adopted by all agents, in place of their own. We analyze this setting both theoretically and through numerical simulations, showing that some agents benefit from using the public belief broadcast by the last agent, while others suffer from performance deterioration.
Submitted 7 April, 2025; v1 submitted 3 April, 2025;
originally announced April 2025.
-
Experimental evaluation of xApp Conflict Mitigation Framework in O-RAN: Insights from Testbed deployment in OTIC
Authors:
Abida Sultana,
Cezary Adamczyk,
Mayukh Roy Chowdhury,
Adrian Kliks,
Aloizio Da Silva
Abstract:
Conflict Mitigation (CM) in Open Radio Access Network (O-RAN) is a topic that is gaining importance as commercial O-RAN deployments become more complex. Although research on CM is already covered in terms of simulated network scenarios, it lacks validation using real-world deployment and Over The Air (OTA) Radio Frequency (RF) transmission. Our objective is to conduct the first assessment of the Conflict Mitigation Framework (CMF) for O-RAN using a real-world testbed and OTA RF transmission. This paper presents results of an experiment using a dedicated testbed built in an O-RAN Open Test and Integration Center (OTIC) to confirm the validity of one of the Conflict Resolution (CR) schemes proposed by existing research. The results show that the implemented conflict detection and resolution mechanisms allow a significant improvement in network operation stability by reducing the variability of the measured Downlink (DL) throughput by 78%.
Submitted 15 May, 2025; v1 submitted 14 March, 2025;
originally announced March 2025.
-
Machine Learning Strategies for Parkinson Tremor Classification Using Wearable Sensor Data
Authors:
Jesus Paucar-Escalante,
Matheus Alves da Silva,
Bruno De Lima Sanches,
Aurea Soriano-Vargas,
Laura Silveira Moriyama,
Esther Luna Colombini
Abstract:
Parkinson's disease (PD) is a neurological disorder requiring early and accurate diagnosis for effective management. Machine learning (ML) has emerged as a powerful tool to enhance PD classification and diagnostic accuracy, particularly by leveraging wearable sensor data. This survey comprehensively reviews current ML methodologies used in classifying Parkinsonian tremors, evaluating various tremor data acquisition methodologies, signal preprocessing techniques, and feature selection methods across time and frequency domains, highlighting practical approaches for tremor classification. The survey explores ML models utilized in existing studies, ranging from traditional methods such as Support Vector Machines (SVM) and Random Forests to advanced deep learning architectures like Convolutional Neural Networks (CNN) and Long Short-Term Memory networks (LSTM). We assess the efficacy of these models in classifying tremor patterns associated with PD, considering their strengths and limitations. Furthermore, we discuss challenges and discrepancies in current research and broader challenges in applying ML to PD diagnosis using wearable sensor data. We also outline future research directions to advance ML applications in PD diagnostics, providing insights for researchers and practitioners.
Submitted 30 January, 2025;
originally announced January 2025.
-
FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion
Authors:
Alef Iury Siqueira Ferreira,
Lucas Rafael Gris,
Augusto Seben da Rosa,
Frederico Santos de Oliveira,
Edresson Casanova,
Rafael Teixeira Sousa,
Arnaldo Candido Junior,
Anderson da Silva Soares,
Arlindo Galvão Filho
Abstract:
This work presents FreeSVC, a promising multilingual singing voice conversion approach that leverages an enhanced VITS model with Speaker-invariant Clustering (SPIN) for better content representation and the State-of-the-Art (SOTA) speaker encoder ECAPA2. FreeSVC incorporates trainable language embeddings to handle multiple languages and employs an advanced speaker encoder to disentangle speaker characteristics from linguistic content. Designed for zero-shot learning, FreeSVC enables cross-lingual singing voice conversion without extensive language-specific training. We demonstrate that a multilingual content extractor is crucial for optimal cross-language conversion. Our source code and models are publicly available.
Submitted 9 January, 2025;
originally announced January 2025.
-
Additional Tests for TV 3.0
Authors:
Eduardo Peixoto,
Pedro Garcia Freitas,
Mylene Christine Queiroz Farias,
Edil Medeiros,
Gabriel Correia Lima da Cunha e Menezes,
André Henrique Macedo da Costa
Abstract:
In 2023, we conducted extensive experiments on subjective video quality for the TV 3.0 project at the University of Brasília. A full report on these tests is available at the Fórum SBTVD website. These tests evaluated the H.266/VVC codec and a hybrid codec formed by H.266/VVC and LCEVC (Low Complexity Enhancement Video Coding) at different resolutions, ranging from 720p to 4K. This report contains the results of additional tests for TV 3.0 performed at the University of Brasília. This new experiment consists of two new Videos Under Test (VUTs), one with the H.266/VVC codec at 4K resolution, and the other with the H.266/VVC+LCEVC codec at 4K resolution. In this new test, both codecs have the same GOP size (120 frames) and use the same VVC encoder (MainConcept live encoder). This new experiment follows the same experimental protocol as the previous experiments, so that it is fully comparable to the previously reported results. This document details the results of the new experiments.
Submitted 18 November, 2024;
originally announced November 2024.
-
Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant
Authors:
Alan Dao,
Dinh Bach Vu,
Huy Hoang Ha
Abstract:
Large Language Models (LLMs) have revolutionized natural language processing, but their application to speech-based tasks remains challenging due to the complexities of integrating audio and text modalities. This paper introduces Ichigo, a mixed-modal model that seamlessly processes interleaved sequences of speech and text. Utilizing a tokenized early-fusion approach, Ichigo quantizes speech into discrete tokens and employs a uniform transformer-based architecture for both speech and text modalities. This method enables joint reasoning and generation across modalities without the need for separate adapters. We present a comprehensive training methodology, including pre-training on multilingual speech recognition datasets and fine-tuning on a curated instruction dataset. Ichigo demonstrates state-of-the-art performance on speech question-answering benchmarks, outperforming existing open-source speech language models and achieving comparable results to cascaded systems. Notably, Ichigo exhibits a latency of just 111 ms to first token generation, significantly lower than current models. Our approach not only advances the field of multimodal AI but also provides a framework for smaller research teams to contribute effectively to open-source speech-language models.
Submitted 4 April, 2025; v1 submitted 20 October, 2024;
originally announced October 2024.
-
Development of a Digital Front-End for Electrooculography Circuits to Facilitate Digital Communication in Individuals with Communicative and Motor Disabilities
Authors:
Andre Heid Rocha da Costa,
Keiran Robert O'Keeffe
Abstract:
This project developed a cost-effective, digitally viable front-end for electrooculography (EOG) circuits aimed at enabling communication for individuals with Locked-in Syndrome (LIS) and Amyotrophic Lateral Sclerosis (ALS). Using the TL072 operational amplifier, the system amplifies weak EOG signals and processes them through an Arduino Uno for real-time monitoring. The circuit includes preamplification, filtering between 0.1 Hz and 30 Hz, and final amplification stages, achieving accurate eye movement tracking with a 256 Hz sampling rate. The approach is described in detail, and the theoretical expectations of our circuit design are compared with the values actually measured. We aimed to create an interface that optimizes maximum-gaze-angle readings by outputting a maximum reading at values above the theoretical baseline of our amplification circuit. We then measured the latency between the action and the serial output by analyzing video recordings of the readings. The measured latency was around 20 ms, which is within the tolerance for proper communication and did not seriously affect the readings. Challenges such as noise interference remain despite reliable signal amplification: an SNR of 1.07 dB was measured during a test of the circuit's analog functionality. Given these limitations, future improvements will focus on reducing environmental interference, optimizing electrode placement, applying a novel detection algorithm to optimize communication applications, and enhancing signal clarity to make the system more effective for real-world applications.
Submitted 14 October, 2024; v1 submitted 3 October, 2024;
originally announced October 2024.
-
Beam Profiling and Beamforming Modeling for mmWave NextG Networks
Authors:
Efat Samir Fathalla,
Sahar Zargarzadeh,
Chunsheng Xin,
Hongyi Wu,
Peng Jiang,
Joao F. Santos,
Jacek Kibilda,
Aloizio Pereira da
Abstract:
This paper presents an experimental study on mmWave beam profiling on a mmWave testbed, and develops a machine learning model for beamforming based on the experiment data. The datasets we have obtained from the beam profiling and the machine learning model for beamforming are valuable for a broad set of network design problems, such as network topology optimization, user equipment association, power allocation, and beam scheduling, in complex and dynamic mmWave networks. We used two commercial-grade mmWave testbeds, with operating frequencies of 27 GHz and 71 GHz, respectively, for beam profiling. The obtained datasets were used to train the machine learning model to estimate the received downlink signal power and data rate at the receivers (user equipment at different geographical locations within range of a transmitter, i.e., a base station). The results show high prediction accuracy with low mean square error (loss), indicating the model's ability to estimate the received signal power or data rate at each individual receiver covered by a beam. The dataset and the machine learning-based beamforming model can assist researchers in optimizing various network design problems for mmWave networks.
Submitted 23 August, 2024;
originally announced August 2024.
-
Robust Model Predictive Control for Aircraft Intent-Aware Collision Avoidance
Authors:
Arash Bahari Kordabad,
Andrea Da Col,
Arabinda Ghosh,
Sybert Stroeve,
Sadegh Soudjani
Abstract:
This paper presents the use of robust model predictive control for the design of an intent-aware collision avoidance system for multi-agent aircraft engaged in horizontal maneuvering scenarios. We assume that information from other agents is accessible in the form of waypoints or destinations. Consequently, we consider that other agents follow their optimal Dubins path--a trajectory that connects their current state to their intended state--while accounting for potential uncertainties. We propose using scenario tree model predictive control as a robust and computationally efficient approach. We demonstrate that the proposed method can easily integrate intent information and offers a robust scheme that handles different uncertainties. The method is illustrated through simulation results.
Submitted 30 March, 2025; v1 submitted 13 August, 2024;
originally announced August 2024.
-
LiteGPT: Large Vision-Language Model for Joint Chest X-ray Localization and Classification Task
Authors:
Khai Le-Duc,
Ryan Zhang,
Ngoc Son Nguyen,
Tan-Hanh Pham,
Anh Dao,
Ba Hung Ngo,
Anh Totti Nguyen,
Truong-Son Hy
Abstract:
Vision-language models have been extensively explored across a wide range of tasks, achieving satisfactory performance; however, their application in medical imaging remains underexplored. In this work, we propose a unified framework - LiteGPT - for medical imaging. We leverage multiple pre-trained visual encoders to enrich information and enhance the performance of vision-language models. To the best of our knowledge, this is the first study to utilize vision-language models for the novel task of joint localization and classification in medical images. In addition, we are the first to provide baselines for disease localization in chest X-rays. Finally, we set new state-of-the-art performance in the image classification task on the well-benchmarked VinDr-CXR dataset. All code and models are publicly available online: https://github.com/leduckhai/LiteGPT
Submitted 15 July, 2024;
originally announced July 2024.
-
Combining Graph Neural Network and Mamba to Capture Local and Global Tissue Spatial Relationships in Whole Slide Images
Authors:
Ruiwen Ding,
Kha-Dinh Luong,
Erika Rodriguez,
Ana Cristina Araujo Lemos da Silva,
William Hsu
Abstract:
In computational pathology, extracting spatial features from gigapixel whole slide images (WSIs) is a fundamental task, but due to their large size, WSIs are typically segmented into smaller tiles. A critical aspect of this analysis is aggregating information from these tiles to make predictions at the WSI level. We introduce a model that combines a message-passing graph neural network (GNN) with a state space model (Mamba) to capture both local and global spatial relationships among the tiles in WSIs. The model's effectiveness was demonstrated in predicting progression-free survival among patients with early-stage lung adenocarcinomas (LUAD). We compared the model with other state-of-the-art methods for tile-level information aggregation in WSIs, including tile-level information summary statistics-based aggregation, multiple instance learning (MIL)-based aggregation, GNN-based aggregation, and GNN-transformer-based aggregation. Additional experiments showed the impact of different types of node features and different tile sampling strategies on the model performance. This work can be easily extended to any WSI-based analysis. Code: https://github.com/rina-ding/gat-mamba.
Submitted 5 June, 2024;
originally announced June 2024.
-
A Digital Beamforming Receiver Architecture Implemented on a FPGA for Space Applications
Authors:
Eduardo Ortega,
Agustín Martínez,
Antonio Oliva,
Fernando Sanz,
Oscar Rodríguez,
Manuel Prieto,
Pablo Parra,
Antonio Da Silva,
Sebastián Sánchez
Abstract:
The burgeoning interest within the space community in digital beamforming is largely attributable to the superior flexibility that satellites with active antenna systems offer for a wide range of applications, notably in communication services. This paper delves into the analysis and practical implementation of a Digital Beamforming and Digital Down Conversion (DDC) chain, leveraging a high-speed Analog-to-Digital Converter (ADC) certified for space applications alongside a high-performance Field-Programmable Gate Array (FPGA). The proposed design strategy focuses on optimizing resource efficiency and minimizing power consumption by strategically sequencing the beamformer processor ahead of the complex down-conversion operation. This innovative approach entails the application of demodulation and low-pass filtering exclusively to the aggregated beam channel, culminating in a marked reduction in the requisite digital signal processing resources relative to traditional, more resource-intensive digital beamforming and DDC architectures. In the experimental validation, an evaluation board integrating a high-speed ADC and an FPGA was utilized. This setup facilitated the empirical validation of the design's efficacy by applying various RF input signals to the digital beamforming receiver system. The ADC employed is capable of high-resolution signal processing, while the FPGA provides the necessary computational flexibility and speed for real-time digital signal processing tasks. The findings underscore the potential of this design to significantly enhance the efficiency and performance of digital beamforming systems in space applications.
Submitted 29 May, 2024;
originally announced May 2024.
-
Improving the Design of Linear Controllers for Homogeneous Platooning under Disturbances
Authors:
Emerson A. da Silva,
Leonardo A. Mozelli,
Armando A. Neto,
Fernando O. Souza
Abstract:
This paper addresses the problem of longitudinal platooning control of homogeneous vehicles subject to external disturbances, such as wind gusts, road slopes, and parametric uncertainties. Our control objective is to maintain the relative distance between each car and its nearby teammates in a decentralized manner. To this end, we propose a novel control law to compute the acceleration commands of each vehicle that includes the integral of the spacing error, which endows the controller with the capability to mitigate external disturbances in steady-state conditions. We adopt a constant distance spacing policy and employ generalized look-ahead and bidirectional network topologies. We provide formal conditions for the controller synthesis that ensure the internal stability of the platoon under the proposed control law in the presence of constant and bounded disturbances affecting multiple vehicles. Experiments considering nonlinear vehicle models in the high-fidelity CARLA simulator environment under different disturbances, parametric uncertainties, and several network topologies demonstrate the effectiveness of our approach.
Submitted 22 December, 2023;
originally announced December 2023.
-
Image-Based Soil Organic Carbon Remote Sensing from Satellite Images with Fourier Neural Operator and Structural Similarity
Authors:
Ken C. L. Wong,
Levente Klein,
Ademir Ferreira da Silva,
Hongzhi Wang,
Jitendra Singh,
Tanveer Syeda-Mahmood
Abstract:
Soil organic carbon (SOC) sequestration is the transfer and storage of atmospheric carbon dioxide in soils, which plays an important role in climate change mitigation. SOC concentration can be improved by proper land use, so it is beneficial if SOC can be estimated at a regional or global scale. As multispectral satellite data can provide SOC-related information such as vegetation and soil properties at a global scale, estimation of SOC through satellite data has been explored as an alternative to manual soil sampling. Although existing studies show promising results, they are mainly based on pixel-based approaches with traditional machine learning methods, and convolutional neural networks (CNNs) are uncommon. To study the use of CNNs on SOC remote sensing, here we propose the FNO-DenseNet based on the Fourier neural operator (FNO). By combining the advantages of the FNO and DenseNet, the FNO-DenseNet outperformed the FNO in our experiments with hundreds of times fewer parameters. The FNO-DenseNet also outperformed a pixel-based random forest by 18% in the mean absolute percentage error.
Submitted 21 November, 2023;
originally announced November 2023.
-
AnuraSet: A dataset for benchmarking Neotropical anuran calls identification in passive acoustic monitoring
Authors:
Juan Sebastián Cañas,
Maria Paula Toro-Gómez,
Larissa Sayuri Moreira Sugai,
Hernán Darío Benítez Restrepo,
Jorge Rudas,
Breyner Posso Bautista,
Luís Felipe Toledo,
Simone Dena,
Adão Henrique Rosa Domingos,
Franco Leandro de Souza,
Selvino Neckel-Oliveira,
Anderson da Rosa,
Vítor Carvalho-Rocha,
José Vinícius Bernardy,
José Luiz Massao Moreira Sugai,
Carolina Emília dos Santos,
Rogério Pereira Bastos,
Diego Llusia,
Juan Sebastián Ulloa
Abstract:
Global change is predicted to induce shifts in anuran acoustic behavior, which can be studied through passive acoustic monitoring (PAM). Understanding changes in calling behavior requires the identification of anuran species, which is challenging due to the particular characteristics of neotropical soundscapes. In this paper, we introduce a large-scale multi-species dataset of anuran amphibians calls recorded by PAM, that comprises 27 hours of expert annotations for 42 different species from two Brazilian biomes. We provide open access to the dataset, including the raw recordings, experimental setup code, and a benchmark with a baseline model of the fine-grained categorization problem. Additionally, we highlight the challenges of the dataset to encourage machine learning researchers to solve the problem of anuran call identification towards conservation policy. All our experiments and resources can be found on our GitHub repository https://github.com/soundclim/anuraset.
△ Less
Submitted 11 July, 2023;
originally announced July 2023.
-
eXplainable Artificial Intelligence on Medical Images: A Survey
Authors:
Matteus Vargas Simão da Silva,
Rodrigo Reis Arrais,
Jhessica Victoria Santos da Silva,
Felipe Souza Tânios,
Mateus Antonio Chinelatto,
Natalia Backhaus Pereira,
Renata De Paris,
Lucas Cesar Ferreira Domingos,
Rodrigo Dória Villaça,
Vitor Lopes Fabris,
Nayara Rossi Brito da Silva,
Ana Claudia Akemi Matsuki de Faria,
Jose Victor Nogueira Alves da Silva,
Fabiana Cristina Queiroz de Oliveira Marucci,
Francisco Alves de Souza Neto,
Danilo Xavier Silva,
Vitor Yukio Kondo,
Claudio Filipi Gonçalves dos Santos
Abstract:
Over the last few years, the number of works about deep learning applied to the medical field has increased enormously. A rigorous assessment of these models is required to explain their results to all people involved in medical exams. A recent field in the machine learning area is explainable artificial intelligence, also known as XAI, which aims to explain the results of such black box models to permit the desired assessment. This survey analyses several recent studies in the XAI field applied to medical diagnosis research, allowing some explainability of the machine learning results in several different diseases, such as cancers and COVID-19.
Submitted 12 May, 2023;
originally announced May 2023.
-
A Two-Dimensional FFT Precoded Filter Bank Scheme
Authors:
R. Pereira Junior,
C. A. F. da Rocha,
B. S. Chang,
D. Le Ruyet
Abstract:
This work proposes a new precoded filter bank (FB) system via a two-dimensional (2D) fast Fourier transform (2D-FFT). Its structure is similar to Orthogonal Time Frequency Space (OTFS) systems, where the OFDM transmitter is changed to a filter bank multi-carrier (FBMC) one, thus obtaining a lower out-of-band emission. The complex orthogonality of the FBMC transmission is guaranteed by using precoding based on a discrete Fourier transform, which is also used to implement the two-dimensional fast Fourier transform. Through the use of a global transmission matrix, we propose a hybrid receiver for the new system. First, a frequency domain equalization is performed, followed by an interference cancellation on the delay-Doppler domain. The simulation results show that the proposed system obtains an error performance similar to other OTFS systems, and superior performance as compared to other precoded FBMC systems.
Submitted 11 June, 2022;
originally announced June 2022.
-
STEAM++ An Extensible End-To-End Framework for Developing IoT Data Processing Applications in the Fog
Authors:
Márcio Miguel Gomes,
Rodrigo da Rosa Righi,
Cristiano André da Costa,
Dalvan Griebler
Abstract:
IoT applications usually rely on cloud computing services to perform data analysis such as filtering, aggregation, classification, pattern detection, and prediction. When applied to specific domains, the IoT needs to deal with unique constraints. Besides hostile environmental conditions such as vibration and electromagnetic interference, which result in malfunction, noise, and data loss, industrial plants often have restricted or unavailable Internet access, forcing us to design stand-alone fog and edge computing solutions. In this context, we present STEAM++, a lightweight and extensible framework for real-time data stream processing and decision-making at the network edge, targeting hardware-limited devices, and we propose a micro-benchmark methodology for assessing embedded IoT applications. In real-case experiments in a semiconductor industry setting, we processed an entire data flow, from sensing values, processing and analyzing data, and detecting relevant events to, finally, publishing results to a dashboard. On average, the application consumed less than 500kb of RAM and 1.0% of CPU, processing up to 239 data packets per second and reducing the output data size to 14% of the raw input data size when notifying events.
Submitted 7 April, 2022;
originally announced May 2022.
-
Evaluation of Convolutional Neural Networks for COVID-19 Classification on Chest X-Rays
Authors:
Felipe André Zeiser,
Cristiano André da Costa,
Gabriel de Oliveira Ramos,
Henrique Bohn,
Ismael Santos,
Rodrigo da Rosa Righi
Abstract:
Early identification of patients with COVID-19 is essential to enable adequate treatment and to reduce the burden on the health system. The gold standard for COVID-19 detection is the use of RT-PCR tests. However, due to the high demand for tests, these can take days or even weeks in some regions of Brazil. Thus, an alternative for detecting COVID-19 is the analysis of Digital Chest X-rays (XR). Changes due to COVID-19 can be detected in XR, even in asymptomatic patients. In this context, models based on deep learning have great potential to be used as support systems for diagnosis or as screening tools. In this paper, we propose the evaluation of convolutional neural networks to identify pneumonia due to COVID-19 in XR. The proposed methodology consists of a preprocessing step of the XR, data augmentation, and classification by the convolutional architectures DenseNet121, InceptionResNetV2, InceptionV3, MobileNetV2, ResNet50, and VGG16, pre-trained with the ImageNet dataset. The results obtained with our methodology demonstrate that the VGG16 architecture achieved superior performance in the classification of XR, with an Accuracy of 85.11%, Sensitivity of 85.25%, Specificity of 85.16%, F1-score of 85.03%, and an AUC of 0.9758.
Submitted 6 September, 2021;
originally announced September 2021.
-
The Behavior of Internet Traffic for Internet Services during COVID-19 Pandemic Scenario
Authors:
Carlos Alexandre Gouvea da Silva,
Allan Christian Krainski Ferrari,
Cristiano Osinski,
Douglas Antonio Firmino Pelacini
Abstract:
Since the end of 2019, the SARS-CoV-2 virus, which causes COVID-19, has spread rapidly around the world, forcing many governments to impose restrictions or lockdowns to combat the pandemic. With restrictions on people's movement in almost all countries of the world, workers and students needed to carry on their activities at home. As a result, people's behavior, habits, and the way they used the Internet changed significantly. Like office professionals, younger people played an important role in this change, especially in the type of resources they used. As a result, the characterization and traffic of communication networks were affected in some way. In this perspective article, we draw on many available studies of the effect of COVID-19 on networks and investigate the effects on Internet traffic of using services such as video streaming, video conferencing, and gaming during the pandemic months of 2020.
Submitted 9 May, 2021;
originally announced May 2021.
-
Binary Segmentation of Seismic Facies Using Encoder-Decoder Neural Networks
Authors:
Gefersom Lima,
Gabriel Ramos,
Sandro Rigo,
Felipe Zeiser,
Ariane da Silveira
Abstract:
The interpretation of seismic data is vital for characterizing sediments' shape in areas of geological study. In seismic interpretation, deep learning becomes useful for reducing the dependence on handcrafted facies segmentation geometry and the time required to study geological areas. This work presents a Deep Neural Network for Facies Segmentation (DNFS) to obtain state-of-the-art results for seismic facies segmentation. DNFS is trained using a combination of cross-entropy and Jaccard loss functions. Our results show that DNFS obtains highly detailed predictions for seismic facies segmentation using fewer parameters than StNet and U-Net.
Submitted 14 November, 2020;
originally announced December 2020.
-
A Workbench for Testing and Simulation Faults in Three-phase Electric Motors with Intelligent Electronic Device and Microcontrolled System
Authors:
Giovanni Faria,
Michel Fernandes Peres,
Osmar Moreira da Silva Neto,
Jefferson Rodrigo Schuertz,
Edson Leonardo dos Santos,
Carlos Alexandre Gouvea da Silva
Abstract:
Electric motors can be damaged or operate improperly as a result of a range of possible failures. Such failures are related to high or very low voltage and current levels, phase loss, or a blocked rotor. Therefore, it is important to protect this equipment through appropriate mechanisms. Alternatively, a workbench can simulate detectable motor failures and allow parameters to be changed, so that maintenance operators are able to identify the results of these changes. This work presents the development of a workbench as a tool for testing electrical machines and drives. The workbench is based on the Arduino programming platform (microcontroller system), which checks the functioning of electric motors under the failure conditions that may occur in the motor. Motor protection is carried out through an Intelligent Electronic Device (IED), popularly known as an intelligent relay. The results show the development of a workbench that can test and identify several faults in a small three-phase motor.
Submitted 21 November, 2020;
originally announced November 2020.
-
Improving Solar and PV Power Prediction with Ensemble Methods
Authors:
L. A. Dao,
L. Ferrarini,
D. La Carrubba
Abstract:
Estimation of the generated power of renewable energy resources is in general important for planning operations as well as demand balance and power quality. This paper addresses the problem of the estimation of the short-term (3-hour ahead) and medium-term (1-day ahead) generated power of a photovoltaic plant. Firstly, the design of day-ahead solar radiation predictors is investigated with different setups of time series models, and with their combinations with weather forecast services using ensemble methods. Support Vector Machine methods are also adopted in this stage, to cluster data. Secondly, under a similar ensemble framework, the generated power prediction is investigated. The whole generated power and solar radiation prediction tasks are then implemented on a low-cost, embedded mini PC module, the Raspberry Pi 3. As an application, the prediction is employed in the control system of a typical microgrid setting, focusing on the energy management problem. The impact of the quality of generated power prediction on the performance of the controller is also evaluated in this paper.
Submitted 19 November, 2020;
originally announced November 2020.
-
Video Quality Enhancement Using Deep Learning-Based Prediction Models for Quantized DCT Coefficients in MPEG I-frames
Authors:
Antonio J G Busson,
Paulo R C Mendes,
Daniel de S Moraes,
Álvaro M da Veiga,
Álan L V Guedes,
Sérgio Colcher
Abstract:
Recent works have successfully applied some types of Convolutional Neural Networks (CNNs) to reduce the noticeable distortion resulting from the lossy JPEG/MPEG compression technique. Most of them are built upon processing in the spatial domain. In this work, we propose an MPEG video decoder that operates purely in the frequency-to-frequency domain: it reads the quantized DCT coefficients received from a low-quality I-frame bitstream and, using a deep learning-based model, predicts the missing coefficients in order to recompose the same frames with enhanced quality. In experiments with a video dataset, our best model was able to improve frames with quantized DCT coefficients corresponding to a Quality Factor (QF) of 10 to enhanced-quality frames with a QF close to 20.
Submitted 9 October, 2020;
originally announced October 2020.
-
An Embedded System for Monitoring Industrial Air Dehumidifiers using a Mobile Android Application for IEEE 802.11 Networks
Authors:
Erik de Oliveira Rosa,
Lincoln Cezar Grabarski,
Marcos Fernando Fragoso,
Allan Cristian Krainski Ferrari,
Jefferson Rodrigo Schuertz,
Carlos Alexandre Gouvea da Silva
Abstract:
Constant technological evolution has allowed significant advances and improvements in industrial processes, mainly in areas that demand greater control and environmental air efficiency. In this way, Embedded Systems allow the development of products and services that aim to solve or propose solutions in these industrial environments. This article presents the development of an Embedded System with a Programmable Logic Controller (PLC) and Arduino for an industrial air dehumidifier, which allows failures to be monitored remotely through reliable data communication with a mobile application for the Android operating system (OS) over an IEEE 802.11 wireless network. As a result, a prototype of the test bench for the Embedded System is presented, in which the main parameters of the temperature sensors and the operating conditions of the dehumidifiers are checked.
Submitted 9 August, 2020;
originally announced August 2020.
-
Automatic Detection of Aedes aegypti Breeding Grounds Based on Deep Networks with Spatio-Temporal Consistency
Authors:
Wesley L. Passos,
Gabriel M. Araujo,
Amaro A. de Lima,
Sergio L. Netto,
Eduardo A. B. da Silva
Abstract:
Every year, the Aedes aegypti mosquito infects millions of people with diseases such as dengue, zika, chikungunya, and urban yellow fever. The main way to combat these diseases is to avoid mosquito reproduction by searching for and eliminating the potential mosquito breeding grounds. In this work, we introduce a comprehensive dataset of aerial videos, acquired with an unmanned aerial vehicle, containing possible mosquito breeding sites. All frames of the video dataset were manually annotated with bounding boxes identifying all objects of interest. This dataset was employed to develop an automatic detection system of such objects based on deep convolutional networks. We propose the exploitation of the temporal information contained in the videos by the incorporation, in the object detection pipeline, of a spatio-temporal consistency module that can register the detected objects, minimizing most false-positive and false-negative occurrences. Also, we experimentally show that using videos is more beneficial than only composing a mosaic using the frames. Using the ResNet-50-FPN as a backbone, we achieve F$_1$-scores of 0.65 and 0.77 on the object-level detection of 'tires' and 'water tanks', respectively, illustrating the system's capability to properly locate potential mosquito breeding objects.
Submitted 27 November, 2021; v1 submitted 29 July, 2020;
originally announced July 2020.
-
A hierarchical distributed predictive control approach for microgrids energy management
Authors:
Le Anh Dao,
Alireza Dehghani-Pilehvarani,
Achilleas Markou,
Luca Ferrarini
Abstract:
This paper addresses the problem of management and coordination of energy resources in a typical microgrid, including smart buildings as flexible loads, energy storages, and renewables. The overall goal is to provide a comprehensive and innovative framework to maximize the overall benefit, while still accounting for possible requests from the grid to change the load profile and leaving every single building or user to balance between servicing those requests and satisfying their own comfort levels. User involvement in the decision-making process is granted by a management and control solution exploiting an innovative distributed model predictive control approach with coordination. In addition, a hierarchical structure is proposed to integrate the user-side distributed MPC with the microgrid control, which is also implemented with an MPC technique. The proposed overall approach has been implemented and tested in several experiments in the laboratory facility for distributed energy systems (Smart RUE) at NTUA, Athens, Greece. Simulation analysis and results complement the testing, showing the accuracy and the potential of the method, also from the perspective of implementation.
Submitted 1 June, 2020;
originally announced June 2020.
-
Validation of the rapid detection approach for enhancing the electronic nose systems performance, using different deep learning models and support vector machines
Authors:
Juan C. Rodriguez Gamboa,
Adenilton J. da Silva,
Ismael C. S. Araujo,
Eva Susana Albarracin E.,
Cristhian M. Duran A
Abstract:
Real-time gas classification is an essential issue and challenge in applications such as food and beverage quality control and accident prevention in industrial environments. In recent years, Deep Learning (DL) models have shown great potential to classify and forecast data in diverse problems, even in the electronic nose (E-Nose) field. In this work, we used a Support Vector Machine (SVM) algorithm and three different DL models to validate the rapid detection approach (based on processing an early portion of raw signals and a rising window protocol) over different measurement conditions. We performed a set of trials with five different E-Nose databases that include fifteen datasets. Based on the results, we concluded that the proposed approach has high potential and can be suitable for E-Nose technologies, reducing the time needed for making forecasts and accelerating the response time, because in most cases it achieved reliable estimates using only the first 30% or less of the measurement data (counted after the gas injection starts). The findings suggest that the rapid detection approach generates reliable forecasting models using different classification methods. Still, SVM seems to obtain the best accuracy, right window size, and better training time.
Submitted 4 May, 2020;
originally announced May 2020.
-
Review of LoRaWAN Applications
Authors:
Lucas R. de Oliveira,
Poliana de Moraes,
Lauro P. S. Neto,
Arlindo F. da Conceição
Abstract:
This paper presents a systematic review of LoRaWAN applications. We analyzed 71 cases of application, with a focus on deployment and the challenges faced. The review summarizes the characteristics of the network protocol and shows applications in the context of smart cities, smart grids, smart farms, health, location, industry, and military. Finally, this article analyzes some security issues.
Submitted 13 April, 2020;
originally announced April 2020.
-
Wine quality rapid detection using a compact electronic nose system: application focused on spoilage thresholds by acetic acid
Authors:
Juan C. Rodriguez Gamboa,
Eva Susana Albarracin E.,
Adenilton J. da Silva,
Luciana Leite,
Tiago A. E. Ferreira
Abstract:
It is crucial for the wine industry to have methods like electronic nose systems (E-Noses) for real-time monitoring of acetic acid thresholds in wines, preventing their spoilage or determining their quality. In this paper, we prove that the portable and compact self-developed E-Nose, based on thin film semiconductor (SnO2) sensors and trained with an approach that uses a deep Multilayer Perceptron (MLP) neural network, can perform early detection of wine spoilage thresholds in routine tasks of wine quality control. To obtain rapid and online detection, we propose a rising-window method focused on raw data processing to find an early portion of the sensor signals with the best recognition performance. Our approach was compared with the conventional approach employed in E-Noses for gas recognition, which involves feature extraction and selection techniques for preprocessing data, followed by a Support Vector Machine (SVM) classifier. The results show that it is possible to classify three wine spoilage levels in 2.7 seconds after the gas injection point, implying a methodology 63 times faster than the results obtained with the conventional approach in our experimental setup.
Submitted 16 January, 2020;
originally announced January 2020.
-
High Accuracy Tumor Diagnoses and Benchmarking of Hematoxylin and Eosin Stained Prostate Core Biopsy Images Generated by Explainable Deep Neural Networks
Authors:
Aman Rana,
Alarice Lowe,
Marie Lithgow,
Katharine Horback,
Tyler Janovitz,
Annacarolina Da Silva,
Harrison Tsai,
Vignesh Shanmugam,
Hyung-Jin Yoon,
Pratik Shah
Abstract:
Histopathological diagnosis of tumors in tissue biopsy after Hematoxylin and Eosin (H&E) staining is the gold standard for oncology care. H&E staining is slow and uses dyes, reagents and precious tissue samples that cannot be reused. Thousands of native nonstained RGB Whole Slide Image (RWSI) patches of prostate core tissue biopsies were registered with their H&E stained versions. Conditional Generative Adversarial Neural Networks (cGANs) that automate conversion of native nonstained RWSI to computational H&E stained images were then trained. High similarities between computational and H&E dye stained images with Structural Similarity Index (SSIM) 0.902, Pearson's Correlation Coefficient (CC) 0.962 and Peak Signal to Noise Ratio (PSNR) 22.821 dB were calculated. A second cGAN performed accurate computational destaining of H&E dye stained images back to their native nonstained form with SSIM 0.9, CC 0.963 and PSNR 25.646 dB. A single-blind study computed more than 95% pixel-by-pixel overlap between prostate tumor annotations on computationally stained images, provided by five board-certified MD pathologists, with those on H&E dye stained counterparts. We report the first visualization and explanation of neural network kernel activation maps during H&E staining and destaining of RGB images by cGANs. High similarities between kernel activation maps of computational and H&E stained images (Mean-Squared Errors <0.0005) provide additional mathematical and mechanistic validation of the staining system. Our neural network framework thus is automated, explainable and performs high precision H&E staining and destaining of low cost native RGB images, and is computer vision and physician authenticated for rapid and accurate tumor diagnoses.
Submitted 2 August, 2019;
originally announced August 2019.
-
A Music Classification Model based on Metric Learning and Feature Extraction from MP3 Audio Files
Authors:
Angelo C. Mendes da Silva,
Mauricio A. Nunes,
Raul Fonseca Neto
Abstract:
The development of models for learning music similarity and feature extraction from audio media files is an increasingly important task for the entertainment industry. This work proposes a novel music classification model based on metric learning and feature extraction from MP3 audio files. The metric learning process considers the learning of a set of parameterized distances employing a structured prediction approach from a set of MP3 audio files containing several music genres. The main objective of this work is to make it possible to learn a personalized metric for each customer. To extract the acoustic information, we use the Mel-Frequency Cepstral Coefficient (MFCC) and perform dimensionality reduction using Principal Components Analysis. We attest to the model's validity by performing a set of experiments and comparing the training and testing results with baseline algorithms, such as K-means and Soft Margin Linear Support Vector Machine (SVM). Experiments show promising results and encourage the future development of an online version of the learning model.
Submitted 17 September, 2019; v1 submitted 29 May, 2019;
originally announced May 2019.
-
Reconstruction of Electrical Impedance Tomography Using Fish School Search, Non-Blind Search, and Genetic Algorithm
Authors:
Valter Augusto de Freitas Barbosa,
Reiga Ramalho Ribeiro,
Allan Rivalles Souza Feitosa,
Victor Luiz Bezerra Araújo da Silva,
Arthur Diego Dias Rocha,
Rafaela Covello de Freitas,
Ricardo Emmanuel de Souza,
Wellington Pinheiro dos Santos
Abstract:
Electrical Impedance Tomography (EIT) is a noninvasive imaging technique that does not use ionizing radiation, with application both in environmental sciences and in health. Image reconstruction is performed by solving an ill-posed inverse problem. Evolutionary Computation and Swarm Intelligence have become a source of methods for solving inverse problems. Fish School Search (FSS) is a promising search and optimization method, based on the dynamics of schools of fish. In this article the authors present a method for reconstruction of EIT images based on FSS and Non-Blind Search (NBS). The method was evaluated using numerical phantoms consisting of electrical conductivity images with subjects in the center, between the center and the edge, and on the edge of a circular section, with meshes of 415 finite elements. The authors performed 20 simulations for each configuration. Results showed that both FSS and FSS-NBS were able to converge faster than genetic algorithms.
Submitted 3 December, 2017;
originally announced December 2017.