-
Automatic detection of abnormal clinical EEG: comparison of a finetuned foundation model with two deep learning models
Authors:
Aurore Bussalb,
François Le Gac,
Guillaume Jubien,
Mohamed Rahmouni,
Ruggero G. Bettinardi,
Pedro Marinho R. de Oliveira,
Phillipe Derambure,
Nicolas Gaspard,
Jacques Jonas,
Louis Maillard,
Laurent Vercueil,
Hervé Vespignani,
Philippe Laval,
Laurent Koessler,
Ulysse Gimenez
Abstract:
Electroencephalography (EEG) is commonly used by physicians for the diagnosis of numerous neurological disorders. Due to the large volume of EEGs requiring interpretation and the specific expertise involved, artificial intelligence-based tools are being developed to assist in their visual analysis. In this paper, we compare two deep learning models (CNN-LSTM and Transformer-based) with BioSerenity-E1, a recently proposed foundation model, in the task of classifying entire EEG recordings as normal or abnormal. The three models were trained or finetuned on 2,500 EEG recordings and their performances were evaluated on two private datasets and one public dataset: a large multicenter dataset annotated by a single specialist (dataset A, n = 4,480 recordings), a small multicenter dataset annotated by three specialists (dataset B, n = 198), and the Temple University Abnormal (TUAB) EEG corpus evaluation dataset (n = 276). On dataset A, the three models achieved at least 86% balanced accuracy, with the finetuned BioSerenity-E1 achieving the highest balanced accuracy (89.19% [88.36-90.41]). The finetuned BioSerenity-E1 also achieved the best performance on dataset B, with 94.63% [92.32-98.12] balanced accuracy. The models were then validated on the TUAB evaluation dataset, whose corresponding training set was not used during training, where they achieved at least 76% accuracy. Specifically, the finetuned BioSerenity-E1 outperformed the other two models, reaching an accuracy of 82.25% [78.27-87.48]. Our results highlight the usefulness of leveraging pre-trained models for automatic EEG classification, enabling robust and efficient interpretation of EEG data with fewer resources and broader applicability.
Submitted 13 May, 2025;
originally announced May 2025.
-
InfoQuest: Evaluating Multi-Turn Dialogue Agents for Open-Ended Conversations with Hidden Context
Authors:
Bryan L. M. de Oliveira,
Luana G. B. Martins,
Bruno Brandão,
Luckeciano C. Melo
Abstract:
Large language models excel at following explicit instructions, but they often struggle with ambiguous or incomplete user requests, defaulting to verbose, generic responses instead of seeking clarification. We introduce InfoQuest, a multi-turn chat benchmark designed to evaluate how dialogue agents handle hidden context in open-ended user requests. This benchmark presents intentionally ambiguous scenarios that require models to engage in information-seeking dialogue by asking clarifying questions before providing appropriate responses. Our evaluation of both open and closed models reveals that, while proprietary models generally perform better, all current assistants struggle to gather critical information effectively. They often require multiple turns to infer user intent and frequently default to generic responses without proper clarification. We provide a systematic methodology for generating diverse scenarios and evaluating models' information-seeking capabilities, which can be leveraged to automatically generate data for self-improvement. We also offer insights into the current limitations of language models in handling ambiguous requests through multi-turn interactions.
Submitted 25 April, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
-
CARROT: A Cost Aware Rate Optimal Router
Authors:
Seamus Somerstep,
Felipe Maia Polo,
Allysson Flavio Melo de Oliveira,
Prattyush Mangal,
Mírian Silva,
Onkar Bhardwaj,
Mikhail Yurochkin,
Subha Maity
Abstract:
With the rapid growth in the number of Large Language Models (LLMs), there has been a recent interest in LLM routing, or directing queries to the cheapest LLM that can deliver a suitable response. We conduct a minimax analysis of the routing problem, providing a lower bound and finding that a simple router that predicts both cost and accuracy for each question can be minimax optimal. Inspired by this, we introduce CARROT, a Cost AwaRe Rate Optimal rouTer that selects a model based on estimates of the models' cost and performance. Alongside CARROT, we also introduce the Smart Price-aware ROUTing (SPROUT) dataset to facilitate routing on a wide spectrum of queries with the latest state-of-the-art LLMs. Using SPROUT and prior benchmarks such as Routerbench and open-LLM-leaderboard-v2 we empirically validate CARROT's performance against several alternative routers.
Submitted 19 May, 2025; v1 submitted 5 February, 2025;
originally announced February 2025.
-
Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning
Authors:
Bryan L. M. de Oliveira,
Murilo L. da Luz,
Bruno Brandão,
Luana G. B. Martins,
Telma W. de L. Soares,
Luckeciano C. Melo
Abstract:
Learning effective visual representations enables agents to extract meaningful information from raw sensory inputs, which is essential for generalizing across different tasks. However, evaluating representation learning separately from policy learning remains a challenge with most reinforcement learning (RL) benchmarks. To address this gap, we introduce the Sliding Puzzles Gym (SPGym), a novel benchmark that reimagines the classic 8-tile puzzle with a visual observation space of images sourced from arbitrarily large datasets. SPGym provides precise control over representation complexity through visual diversity, allowing researchers to systematically scale the representation learning challenge while maintaining consistent environment dynamics. Despite the apparent simplicity of the task, our experiments with both model-free and model-based RL algorithms reveal fundamental limitations in current methods. As we increase visual diversity by expanding the pool of possible images, all tested algorithms show significant performance degradation, with even state-of-the-art methods struggling to generalize across different visual inputs while maintaining consistent puzzle-solving capabilities. These results highlight critical gaps in visual representation learning for RL and provide clear directions for improving robustness and generalization in decision-making systems.
Submitted 13 February, 2025; v1 submitted 17 October, 2024;
originally announced October 2024.
-
Real-time design of architectural structures with differentiable mechanics and neural networks
Authors:
Rafael Pastrana,
Eder Medina,
Isabel M. de Oliveira,
Sigrid Adriaenssens,
Ryan P. Adams
Abstract:
Designing mechanically efficient geometry for architectural structures like shells, towers, and bridges is an expensive iterative process. Existing techniques for solving such inverse problems rely on traditional optimization methods, which are slow and computationally expensive, limiting iteration speed and design exploration. Neural networks would seem to offer a solution via data-driven amortized optimization, but they often require extensive fine-tuning and cannot ensure that important design criteria, such as mechanical integrity, are met. In this work, we combine neural networks with a differentiable mechanics simulator to develop a model that accelerates the solution of shape approximation problems for architectural structures represented as bar systems. This model explicitly guarantees compliance with mechanical constraints while generating designs that closely match target geometries. We validate our approach in two tasks: the design of masonry shells and cable-net towers. Our model achieves better accuracy and generalization than fully neural alternatives, and comparable accuracy to direct optimization but in real time, enabling fast and reliable design exploration. We further demonstrate its advantages by integrating it into 3D modeling software and fabricating a physical prototype. Our work opens up new opportunities for accelerated mechanical design enhanced by neural networks for the built environment.
Submitted 17 March, 2025; v1 submitted 4 September, 2024;
originally announced September 2024.
-
Unconditionally separating noisy $\mathsf{QNC}^0$ from bounded polynomial threshold circuits of constant depth
Authors:
Min-Hsiu Hsieh,
Leandro Mendes,
Michael de Oliveira,
Sathyawageeswar Subramanian
Abstract:
We study classes of constant-depth circuits with gates that compute restricted polynomial threshold functions, recently introduced by [Kum23] as a family that strictly generalizes $\mathsf{AC}^0$. Denoting these circuit families $\mathsf{bPTFC}^0[k]$ for $\textit{bounded polynomial threshold circuits}$ parameterized by an integer-valued degree-bound $k$, we prove three hardness results separating these classes from constant-depth quantum circuits ($\mathsf{QNC}^0$).
- We prove that the parity halving problem [WKS+19], which $\mathsf{QNC}^0$ over qubits can solve with certainty, remains average-case hard for polynomial size $\mathsf{bPTFC}^0[k]$ circuits for all $k=\mathcal{O}(n^{1/(5d)})$.
- We construct a new family of relation problems based on computing $\mathsf{mod}\ p$ for each prime $p>2$, and prove a separation of $\mathsf{QNC}^0$ circuits over higher dimensional quantum systems ('qupits') against $\mathsf{bPTFC}^0[k]$ circuits for the same degree-bound parameter as above.
- We prove that both foregoing results are noise-robust under the local stochastic noise model, by introducing fault-tolerant implementations of non-Clifford $\mathsf{QNC}^0/|\overline{T^{1/p}}\rangle$ circuits that use logical magic states as advice.
$\mathsf{bPTFC}^0[k]$ circuits can compute certain classes of Polynomial Threshold Functions (PTFs), which in turn serve as a natural model for neural networks and exhibit enhanced expressivity and computational capabilities. Furthermore, for large enough values of $k$, $\mathsf{bPTFC}^0[k]$ contains $\mathsf{TC}^0$ as a subclass. The main challenges we overcome include establishing classical average-case lower bounds, designing non-local games with quantum-classical gaps in winning probabilities and developing noise-resilient non-Clifford quantum circuits necessary to extend beyond qubits to higher dimensions.
Submitted 29 August, 2024;
originally announced August 2024.
-
Implementation and Applications of WakeWords Integrated with Speaker Recognition: A Case Study
Authors:
Alexandre Costa Ferro Filho,
Elisa Ayumi Masasi de Oliveira,
Iago Alves Brito,
Pedro Martins Bittencourt
Abstract:
This paper explores the application of artificial intelligence techniques in audio and voice processing, focusing on the integration of wake words and speaker recognition for secure access in embedded systems. With the growing prevalence of voice-activated devices such as Amazon Alexa, ensuring secure and user-specific interactions has become paramount. Our study aims to enhance the security framework of these systems by leveraging wake words for initial activation and speaker recognition to validate user permissions. By incorporating these AI-driven methodologies, we propose a robust solution that restricts system usage to authorized individuals, thereby mitigating unauthorized access risks. This research delves into the algorithms and technologies underpinning wake word detection and speaker recognition, evaluates their effectiveness in real-world applications, and discusses the potential for their implementation in various embedded systems, emphasizing security and user convenience. The findings underscore the feasibility and advantages of employing these AI techniques to create secure, user-friendly voice-activated systems.
Submitted 24 July, 2024;
originally announced July 2024.
-
Efficient multi-prompt evaluation of LLMs
Authors:
Felipe Maia Polo,
Ronald Xu,
Lucas Weber,
Mírian Silva,
Onkar Bhardwaj,
Leshem Choshen,
Allysson Flavio Melo de Oliveira,
Yuekai Sun,
Mikhail Yurochkin
Abstract:
Most popular benchmarks for comparing LLMs rely on a limited set of prompt templates, which may not fully capture the LLMs' abilities and can affect the reproducibility of results on leaderboards. Many recent works empirically verify prompt sensitivity and advocate for changes in LLM evaluation. In this paper, we consider the problem of estimating the performance distribution across many prompt variants instead of finding a single prompt to evaluate with. We introduce PromptEval, a method for estimating performance across a large set of prompts borrowing strength across prompts and examples to produce accurate estimates under practical evaluation budgets. The resulting distribution can be used to obtain performance quantiles to construct various robust performance metrics (e.g., top 95% quantile or median). We prove that PromptEval consistently estimates the performance distribution and demonstrate its efficacy empirically on three prominent LLM benchmarks: MMLU, BIG-bench Hard, and LMentry; for example, PromptEval can accurately estimate performance quantiles across 100 prompt templates on MMLU with a budget equivalent to two single-prompt evaluations. Moreover, we show how PromptEval can be useful in LLM-as-a-judge and best prompt identification applications.
Submitted 30 October, 2024; v1 submitted 27 May, 2024;
originally announced May 2024.
-
The power of shallow-depth Toffoli and qudit quantum circuits
Authors:
Alex Bredariol Grilo,
Elham Kashefi,
Damian Markham,
Michael de Oliveira
Abstract:
The relevance of shallow-depth quantum circuits has recently increased, mainly due to their applicability to near-term devices. In this context, one of the main goals of quantum circuit complexity is to find problems that can be solved by quantum shallow circuits but require more computational resources classically.
Our first contribution in this work is to prove new separations between classical and quantum constant-depth circuits. Firstly, we show a separation between constant-depth quantum circuits with quantum advice $\mathsf{QNC}^0/\mathsf{qpoly}$, and $\mathsf{AC}^0[p]$, which is the class of classical constant-depth circuits with unbounded fan-in and $\pmod{p}$ gates. In addition, we show a separation between $\mathsf{QAC}^0$, which additionally has Toffoli gates with unbounded control, and $\mathsf{AC}^0[p]$. This establishes the first such separation for a shallow-depth quantum class that does not involve quantum fan-out gates.
Secondly, we consider $\mathsf{QNC}^0$ circuits with infinite-size gate sets. We show that these circuits, along with (classical or quantum) prime modular gates, can implement threshold gates, showing that $\mathsf{QNC}^0[p]=\mathsf{QTC}^0$. Finally, we also show that in the infinite-size gateset case, these quantum circuit classes for higher-dimensional Hilbert spaces do not offer any advantage to standard qubit implementations.
Submitted 28 April, 2024;
originally announced April 2024.
-
Learning Input Constrained Control Barrier Functions for Guaranteed Safety of Car-Like Robots
Authors:
Sven Brüggemann,
Dominic Nightingale,
Jack Silberman,
Maurício de Oliveira
Abstract:
We propose a design method for a robust safety filter based on Input Constrained Control Barrier Functions (ICCBF) for car-like robots moving in complex environments. A robust ICCBF that can be efficiently implemented is obtained by learning a smooth function of the environment using Support Vector Machine regression. The method takes into account steering constraints and is validated in simulation and a real experiment.
Submitted 19 February, 2024;
originally announced February 2024.
-
Detecting Semantic Conflicts using Static Analysis
Authors:
Galileu Santos de Jesus,
Paulo Borba,
Rodrigo Bonifácio,
Matheus Barbosa de Oliveira
Abstract:
Version control system tools empower developers to independently work on their development tasks. These tools also facilitate the integration of changes through merging operations, and report textual conflicts. However, when developers integrate their changes, they might encounter other types of conflicts that are not detected by current merge tools. In this paper, we focus on dynamic semantic conflicts, which occur when merging reports no textual conflicts but results in undesired interference, causing unexpected program behavior at runtime. To address this issue, we propose a technique that explores the use of static analysis to detect interference when merging contributions from two developers. We evaluate our technique using a dataset of 99 experimental units extracted from merge scenarios. The results provide evidence that our technique presents significant interference detection capability. It outperforms, in terms of F1 score and recall, previous methods that rely on dynamic analysis for detecting semantic conflicts, but these show better precision. Our technique's precision is comparable to that observed in other studies that also leverage static analysis or use theorem proving techniques to detect semantic conflicts, albeit with significantly improved overall performance.
Submitted 6 October, 2023;
originally announced October 2023.
-
Calculating and Visualizing Counterfactual Feature Importance Values
Authors:
Bjorge Meulemeester,
Raphael Mazzine Barbosa De Oliveira,
David Martens
Abstract:
Despite the success of complex machine learning algorithms, mostly justified by an outstanding performance in prediction tasks, their inherent opaque nature still represents a challenge to their responsible application. Counterfactual explanations surged as one potential solution to explain individual decision results. However, two major drawbacks directly impact their usability: (1) the isonomic view of feature changes, in which it is not possible to observe \textit{how much} each modified feature influences the prediction, and (2) the lack of graphical resources to visualize the counterfactual explanation. We introduce Counterfactual Feature (change) Importance (CFI) values as a solution: a way of assigning an importance value to each feature change in a given counterfactual explanation. To calculate these values, we propose two potential CFI methods. One is simple, fast, and has a greedy nature. The other, coined CounterShapley, provides a way to calculate Shapley values between the factual-counterfactual pair. Using these importance values, we additionally introduce three chart types to visualize the counterfactual explanations: (a) the Greedy chart, which shows a greedy sequential path for prediction score increase up to predicted class change, (b) the CounterShapley chart, depicting its respective score in a simple and one-dimensional chart, and finally (c) the Constellation chart, which shows all possible combinations of feature changes, and their impact on the model's prediction score. For each of our proposed CFI methods and visualization schemes, we show how they can provide more information on counterfactual explanations. Finally, an open-source implementation is offered, compatible with any counterfactual explanation generator algorithm. Code repository at: https://github.com/ADMAntwerp/CounterPlots
Submitted 10 June, 2023;
originally announced June 2023.
-
Unveiling the Potential of Counterfactuals Explanations in Employability
Authors:
Raphael Mazzine Barbosa de Oliveira,
Sofie Goethals,
Dieter Brughmans,
David Martens
Abstract:
In eXplainable Artificial Intelligence (XAI), counterfactual explanations are known to give simple, short, and comprehensible justifications for complex model decisions. However, we have yet to see more studies in which they are applied to real-world cases. To fill this gap, this study focuses on showing how counterfactuals are applied to employability-related problems which involve complex machine learning algorithms. For these use cases, we use real data obtained from a public Belgian employment institution (VDAB). The use cases presented go beyond the mere application of counterfactuals as explanations, showing how they can enhance decision support, comply with legal requirements, guide controlled changes, and analyze novel insights.
Submitted 17 May, 2023;
originally announced May 2023.
-
An Evidence-based Roadmap for IoT Software Systems Engineering
Authors:
Rebeca C. Motta,
Káthia M. de Oliveira,
Guilherme H. Travassos
Abstract:
Context: The Internet of Things (IoT) has brought expectations for software inclusion in everyday objects. However, it has challenges and requires multidisciplinary technical knowledge involving different areas that should be combined to enable IoT software systems engineering. Goal: To present an evidence-based roadmap for IoT development to support developers in specifying, designing, and implementing IoT systems. Method: An iterative approach based on experimental studies to acquire evidence to define the IoT Roadmap. Next, the Systems Engineering Body of Knowledge life cycle was used to organize the roadmap and set temporal dimensions for IoT software systems engineering. Results: The studies revealed seven IoT Facets influencing IoT development. The IoT Roadmap comprises 117 items organized into 29 categories representing different concerns for each Facet. In addition, an experimental study was conducted observing a real case of a healthcare IoT project, indicating the roadmap applicability. Conclusions: The IoT Roadmap can be a feasible instrument to assist IoT software systems engineering because it can (a) support researchers and practitioners in understanding and characterizing the IoT and (b) provide a checklist to identify the applicable recommendations for engineering IoT software systems.
Submitted 14 March, 2023;
originally announced March 2023.
-
Biomedical image analysis competitions: The state of current participation practice
Authors:
Matthias Eisenmann,
Annika Reinke,
Vivienn Weru,
Minu Dietlinde Tizabi,
Fabian Isensee,
Tim J. Adler,
Patrick Godau,
Veronika Cheplygina,
Michal Kozubek,
Sharib Ali,
Anubha Gupta,
Jan Kybic,
Alison Noble,
Carlos Ortiz de Solórzano,
Samiksha Pachade,
Caroline Petitjean,
Daniel Sage,
Donglai Wei,
Elizabeth Wilden,
Deepak Alapatt,
Vincent Andrearczyk,
Ujjwal Baid,
Spyridon Bakas,
Niranjan Balu,
Sophia Bano
, et al. (331 additional authors not shown)
Abstract:
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Submitted 12 September, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Morphological Classification of Galaxies in S-PLUS using an Ensemble of Convolutional Networks
Authors:
N. M. Cardoso,
G. B. O. Schwarz,
L. O. Dias,
C. R. Bom,
L. Sodré Jr.,
C. Mendes de Oliveira
Abstract:
The universe is composed of galaxies that have diverse shapes. Once the structure of a galaxy is determined, it is possible to obtain important information about its formation and evolution. Morphologically classifying galaxies means cataloging them according to their visual appearance and the classification is linked to the physical properties of the galaxy. A morphological classification made through visual inspection is subject to biases introduced by subjective observations made by human volunteers. For this reason, systematic, objective and easily reproducible classification of galaxies has been gaining importance since the astronomer Edwin Hubble created his famous classification method. In this work, we combine accurate visual classifications of the Galaxy Zoo project with \emph{Deep Learning} methods. The goal is to find an efficient technique that classifies elliptical and spiral galaxies at human-level performance, but in a systematic and automatic way. For this, a neural network model was created through an Ensemble of four other convolutional models, allowing a greater accuracy in the classification than what would be obtained with any individual model. Details of the individual models and improvements made are also described. The present work is entirely based on the analysis of images (not parameter tables) from DR1 (www.datalab.noao.edu) of the Southern Photometric Local Universe Survey (S-PLUS). In terms of classification, we achieved, with the Ensemble, an accuracy of $\approx 99 \%$ in the test sample (using pre-trained networks).
Submitted 5 July, 2021;
originally announced July 2021.
-
Put your money where your mouth is: Using deep learning to identify consumer tribes from word usage
Authors:
P. Gloor,
A. Fronzetti Colladon,
J. M. de Oliveira,
P. Rovelli
Abstract:
The Internet and social media offer firms novel ways of managing their marketing strategy and gaining competitive advantage. The groups of users expressing themselves on the Internet about a particular topic, product, or brand are frequently called a virtual tribe or E-tribe. However, there are no automatic tools for identifying and studying the characteristics of these virtual tribes. Towards this aim, this paper presents Tribefinder, a system to reveal Twitter users' tribal affiliations, by analyzing their tweets and language use. To show the potential of this instrument, we provide an example considering three specific tribal macro-categories: alternative realities, lifestyle, and recreation. In addition, we discuss the different characteristics of each identified tribe, in terms of use of language and social interaction metrics. Tribefinder illustrates the importance of adopting a new lens for studying virtual tribes, which is crucial for firms to properly design their marketing strategy, and for scholars to extend prior marketing research.
Submitted 27 May, 2021;
originally announced May 2021.
-
Software Development During COVID-19 Pandemic: an Analysis of Stack Overflow and GitHub
Authors:
Pedro Almir Martins de Oliveira,
Pedro de Alcântara dos Santos Neto,
Gleison Silva,
Irvayne Ibiapina,
Werney Lira,
Rossana Maria de Castro Andrade
Abstract:
The new coronavirus became a severe health issue for the world. This situation has motivated studies of different areas to combat this pandemic. In software engineering, we point out data visualization projects to follow the disease evolution, machine learning to estimate the pandemic behavior, and computer vision processing radiologic images. Most of these projects are stored in version control systems, and there are discussions about them in Question & Answer websites. In this work, we conducted a Mining Software Repository on a large number of questions and projects aiming to find trends that could help researchers and practitioners to fight against the coronavirus. We analyzed 1,190 questions from Stack Overflow and Data Science Q&A and 60,352 GitHub projects. We identified a correlation between the questions and projects throughout the pandemic. The main questions about coronavirus are how-to, related to web scraping and data visualization, using Python, JavaScript, and R. The most recurrent GitHub projects are machine learning projects, using JavaScript, Python, and Java.
Submitted 9 March, 2021;
originally announced March 2021.
-
Geometrical Representation for Number-theoretic Transforms
Authors:
H. M. de Oliveira,
R. J. Cintra
Abstract:
This short note introduces a geometric representation for binary (or ternary) sequences. The proposed representation is linked to multivariate data plotting according to the radar chart. As an illustrative example, the binary Hamming transform recently proposed is geometrically interpreted. It is shown that codewords of the standard Hamming code $\mathcal{H}(N=7,k=4,d=3)$ are invariant vectors under the Hamming transform. These invariants are eigenvectors of the binary Hamming transform. The images are always inscribed in a regular polygon of unity side, resembling triangular rose petals and/or "thorns". A geometric representation of the ternary Golay transform, based on the extended Golay $\mathcal{G}(N=12, k=6, d=6)$ code over $\operatorname{GF}(3)$, is also shown. This approach is offered as an alternative representation of finite-length sequences over finite prime fields.
Submitted 8 March, 2021;
originally announced March 2021.
-
Quantum Bayesian decision-making*
Authors:
Michael de Oliveira,
Luis Soares Barbosa
Abstract:
As a compact representation of joint probability distributions over a dependence graph of random variables, and a tool for modelling and reasoning in the presence of uncertainty, Bayesian networks are of great importance for artificial intelligence to combine domain knowledge, capture causal relationships, or learn from incomplete datasets. Known as an NP-hard problem in a classical setting, Bayesian inference pops up as a class of algorithms worth exploring in a quantum framework. This paper explores such a research direction and improves on previous proposals by a judicious use of the utility function in an entangled configuration. It proposes a completely quantum mechanical decision-making process with a proven computational advantage. A prototype implementation in Qiskit (a Python-based program development kit for the IBM Q machine) is discussed as a proof-of-concept.
Submitted 5 October, 2020;
originally announced October 2020.
-
Rounded Hartley Transform: A Quasi-involution
Authors:
R. J. Cintra,
H. M. de Oliveira,
C. O. Cintra
Abstract:
A new multiplication-free transform derived from the DHT is introduced: the RHT. Investigations on the properties of the RHT led us to the concept of weak-inversion. Using new constructs, we show that the RHT is not involutional like the DHT, but exhibits a quasi-involutional property, a new definition derived from the periodicity of matrices. Thus, instead of using the actual inverse transform, the RHT is viewed as an involutional transform, allowing the use of the direct (multiplication-free) transform to evaluate the inverse. A fast algorithm to compute the RHT is presented. This algorithm shows embedded properties. We also extended the RHT to the two-dimensional case. This permitted us to perform a preliminary analysis on the effects of the RHT on images. Despite some SNR loss, the RHT can be very interesting for applications involving image monitoring associated with decision making, such as military applications or medical imaging.
Submitted 7 August, 2020;
originally announced August 2020.
-
Quantum One-class Classification With a Distance-based Classifier
Authors:
Nicolas M. de Oliveira,
Lucas P. de Albuquerque,
Wilson R. de Oliveira,
Teresa B. Ludermir,
Adenilton J. da Silva
Abstract:
The advancement of technology in Quantum Computing has brought possibilities for the execution of algorithms in real quantum devices. However, the existing errors in the current quantum hardware and the low number of available qubits make it necessary to use solutions that use fewer qubits and fewer operations, mitigating such obstacles. Hadamard Classifier (HC) is a distance-based quantum machine learning model for pattern recognition. We present a new classifier based on HC named Quantum One-class Classifier (QOCC) that consists of a minimal quantum machine learning model with fewer operations and qubits, thus being able to mitigate errors from NISQ (Noisy Intermediate-Scale Quantum) computers. Experimental results were obtained by running the proposed classifier on a quantum device and show that QOCC has advantages over HC.
Submitted 6 May, 2021; v1 submitted 31 July, 2020;
originally announced July 2020.
-
Infinite Sequences, Series Convergence and the Discrete Time Fourier Transform over Finite Fields
Authors:
R. M. Campello de Souza,
M. M. Campello de Souza,
H. M. de Oliveira,
M. M. Vasconcelos
Abstract:
Digital Transforms have important applications on subjects such as channel coding, cryptography and digital signal processing. In this paper, two Fourier Transforms are considered, the discrete time Fourier transform (DTFT) and the finite field Fourier transform (FFFT). A finite field version of the DTFT is introduced and the FFFT is redefined with a complex kernel, which makes it a more appropriate finite field version of the Discrete Fourier Transform. These transforms can handle FIR and IIR filters defined over finite algebraic structures.
Submitted 17 July, 2020;
originally announced July 2020.
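As a rough illustration of the finite field Fourier transform mentioned in this abstract, the sketch below uses the classical definition over a prime field GF(p), $X_k = \sum_n x_n \alpha^{nk} \bmod p$, where $\alpha$ has multiplicative order $N$. This is not the complex-kernel redefinition the paper proposes; the function names `ffft` and `inverse_ffft` and the example values (p = 7, alpha = 2, N = 3) are assumptions chosen only for illustration.

```python
def ffft(x, alpha, p):
    """Classical finite field Fourier transform over GF(p).

    Assumes p is prime and alpha has multiplicative order len(x) modulo p.
    """
    n_points = len(x)
    return [
        sum(x[n] * pow(alpha, n * k, p) for n in range(n_points)) % p
        for k in range(n_points)
    ]


def inverse_ffft(X, alpha, p):
    """Inverse transform: x_n = N^{-1} * sum_k X_k * alpha^{-nk} (mod p)."""
    n_points = len(X)
    n_inv = pow(n_points, -1, p)      # modular inverse of N (Python 3.8+)
    alpha_inv = pow(alpha, -1, p)     # modular inverse of alpha
    return [
        (n_inv * sum(X[k] * pow(alpha_inv, n * k, p) for k in range(n_points))) % p
        for n in range(n_points)
    ]


# Hypothetical example: 2 has order 3 in GF(7), since 2**3 = 8 = 1 (mod 7).
spectrum = ffft([1, 2, 3], alpha=2, p=7)
recovered = inverse_ffft(spectrum, alpha=2, p=7)
```

All arithmetic stays in GF(p), so the round trip is exact, with none of the rounding error of a floating-point DFT.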
-
Multidimensional Wavelets for Scalable Image Decomposition: Orbital Wavelets
Authors:
H. M. de Oliveira,
V. V. Vermehren,
R. J. Cintra
Abstract:
Wavelets are closely related to Schrödinger's wave functions and the interpretation of Born. Similarly to the appearance of atomic orbitals, it is proposed to combine anti-symmetric wavelets into orbital wavelets. The proposed approach allows the increase of the dimension of wavelets through this process. New orbital 2D-wavelets are introduced for the decomposition of still images, showing that it is possible to perform a simultaneous analysis at two distinct scales. An example of such an image analysis is shown.
Submitted 14 June, 2020;
originally announced June 2020.
-
An Overview of Self-Similar Traffic: Its Implications in the Network Design
Authors:
Ernande F. Melo,
H. M. de Oliveira
Abstract:
Knowledge about the true nature of the traffic in computer networking is a key requirement in the design of such networks. The phenomenon of self-similarity is a characteristic of the traffic of current client/server packet networks in LAN/WAN environments dominated by network technologies such as Ethernet and the TCP/IP protocol stack. The development of network traffic simulators, which take into account this attribute, is necessary for a more realistic description of the traffic on these networks and their use in the design of resources (contention elements) and protocols of flow control and network congestion. In this scenario it is recommended not to adopt standard traffic models of the Poisson type.
Submitted 6 May, 2020;
originally announced May 2020.
-
Fog Computing on Constrained Devices: Paving the Way for the Future IoT
Authors:
Flavia Pisani,
Fabiola M. C. de Oliveira,
Eduardo S. Gama,
Roger Immich,
Luiz F. Bittencourt,
Edson Borin
Abstract:
In the long term, the Internet of Things (IoT) is expected to become an integral part of people's daily lives. In light of this technological advancement, an ever-growing number of objects with limited hardware may become connected to the Internet. In this chapter, we explore the importance of these constrained devices as well as how we can use them in conjunction with fog computing to change the future of the IoT. First, we present an overview of the concepts of constrained devices, IoT, and fog and mist computing, and then we present a classification of applications according to the amount of resources they require (e.g., processing power and memory). After that, we tie in these topics with a discussion of what can be expected in a future where constrained devices and fog computing are used to push the IoT to new limits. Lastly, we discuss some challenges and opportunities that these technologies may bring.
Submitted 4 March, 2020; v1 submitted 12 February, 2020;
originally announced February 2020.
-
Experimental quantum secret sharing with spin-orbit structured photons
Authors:
Michael de Oliveira,
Isaac Nape,
Jonathan Pinnell,
Najmeh TabeBordbar,
Andrew Forbes
Abstract:
Secret sharing allows three or more parties to share secret information which can only be decrypted through collaboration. It complements quantum key distribution as a valuable resource for securely distributing information. Here we take advantage of hybrid spin and orbital angular momentum states to access a high dimensional encoding space, demonstrating a protocol that is easily scalable in both dimension and participants. To illustrate the versatility of our approach, we first demonstrate the protocol in two dimensions, extending the number of participants to ten, and then demonstrate the protocol in three dimensions with three participants, the highest realisation of participants and dimensions thus far. We reconstruct secrets depicted as images with a fidelity of up to 0.979. Moreover, our scheme exploits the use of conventional linear optics to emulate the quantum gates needed for transitions between basis modes on a high dimensional Hilbert space with the potential of up to 1.225 bits of encoding capacity per transmitted photon. Our work offers a practical approach for sharing information across multiple parties, a crucial element of any quantum network.
Submitted 26 September, 2019; v1 submitted 29 August, 2019;
originally announced September 2019.
-
A survey on Big Data and Machine Learning for Chemistry
Authors:
Jose F Rodrigues Jr,
Larisa Florea,
Maria C F de Oliveira,
Dermot Diamond,
Osvaldo N Oliveira Jr
Abstract:
Herein we review aspects of leading-edge research and innovation in chemistry which exploits big data and machine learning (ML), two computer science fields that combine to yield machine intelligence. ML can accelerate the solution of intricate chemical problems and even solve problems that otherwise would not be tractable. But the potential benefits of ML come at the cost of big data production; that is, the algorithms, in order to learn, demand large volumes of data of various natures and from different sources, from materials properties to sensor data. In the survey, we propose a roadmap for future developments, with emphasis on materials discovery and chemical sensing, and within the context of the Internet of Things (IoT), both prominent research fields for ML in the context of big data. In addition to providing an overview of recent advances, we elaborate upon the conceptual and practical limitations of big data and ML applied to chemistry, outlining processes, discussing pitfalls, and reviewing cases of success and failure.
Submitted 23 April, 2019;
originally announced April 2019.
-
A Note on the Shannon Entropy of Short Sequences
Authors:
H. M. de Oliveira,
Raydonal Ospina
Abstract:
For source sequences of length L symbols, we propose a more realistic value than the usual benchmark for the number of code letters per source letter. Our idea is based on a quantifier of information fluctuation of a source, F(U), which corresponds to the second central moment of the random variable that measures the information content of a source symbol. An alternative interpretation of typical sequences is additionally provided through this approach.
Submitted 6 July, 2018;
originally announced July 2018.
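The quantity F(U) described in this abstract can be sketched numerically. The snippet below assumes a discrete memoryless source with symbol probabilities `probs`, takes the information content of a symbol as $-\log_2 p$, and computes the entropy H(U) as its mean and F(U) as its second central moment. The function name `entropy_and_fluctuation` is hypothetical, and the snippet does not reproduce the paper's benchmark for short sequences.

```python
import math


def entropy_and_fluctuation(probs):
    """Return (H, F) in bits and bits^2 for a discrete source.

    H is the mean of the information content -log2(p);
    F is its second central moment (variance).
    Zero-probability symbols are skipped, as they contribute nothing.
    """
    support = [p for p in probs if p > 0]
    info = [-math.log2(p) for p in support]
    entropy = sum(p * i for p, i in zip(support, info))
    fluctuation = sum(p * (i - entropy) ** 2 for p, i in zip(support, info))
    return entropy, fluctuation


# Hypothetical example source with three symbols.
H, F = entropy_and_fluctuation([0.5, 0.25, 0.25])
```

For a uniform source every symbol carries the same information, so F(U) is zero; a skewed source has F(U) > 0, which is the regime where a per-sequence code length can deviate noticeably from H(U) at small L.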
-
Modeling, comprehending and summarizing textual content by graphs
Authors:
Vinicius Woloszyn,
Guilherme Medeiros Machado,
Leandro Krug Wives,
José Palazzo Moreira de Oliveira
Abstract:
Automatic Text Summarization strategies have been successfully employed to digest text collections and extract their essential content. Usually, summaries are generated using textual corpora that belong to the same domain area where the summary will be used. Nonetheless, there are special cases where not enough textual sources are found, and one possible alternative is to generate a summary from a different domain. One manner to summarize texts consists of using a graph model. This model allows giving more importance to words corresponding to the main concepts from the target domain found in the summarized text. This gives the reader an overview of the main text concepts as well as their relationships. However, this kind of summarization presents a significant number of repeated terms when compared to human-generated summaries. In this paper, we present an approach to produce graph-model extractive summaries of texts, meeting the target domain requirements and treating the term-repetition problem. To evaluate the proposition, we performed a series of experiments showing that the proposed approach statistically improves the performance of a model based on Graph Centrality, achieving better coverage, accuracy, and recall.
Submitted 1 July, 2018;
originally announced July 2018.
-
The Hamming and Golay Number-Theoretic Transforms
Authors:
A. J. A. Paschoal,
R. M. Campello de Souza,
H. M. de Oliveira
Abstract:
New number-theoretic transforms are derived from known linear block codes over finite fields. In particular, two new such transforms are built from perfect codes, namely the Hamming number-theoretic transform and the Golay number-theoretic transform. A few properties of these new transforms are presented.
Submitted 25 September, 2018; v1 submitted 25 June, 2018;
originally announced June 2018.
-
Audiovisual Analytics Vocabulary and Ontology (AAVO): initial core and example expansion
Authors:
Renato Fabbri,
Maria Cristina Ferreira de Oliveira
Abstract:
Visual Analytics might be defined as data mining assisted by interactive visual interfaces. The field has been receiving prominent consideration by researchers, developers and the industry. The literature, however, is complex because it involves multiple fields of knowledge and is considerably recent. In this article we describe an initial tentative organization of the knowledge in the field as an OWL ontology and a SKOS vocabulary. This effort might be useful in many ways that include conceptual considerations and software implementations. Within the results and discussions, we expose a core and an example expansion of the conceptualization, and incorporate design issues that enhance the expressive power of the abstraction.
Submitted 26 October, 2017;
originally announced October 2017.
-
Linear Computer-Music through Sequences over Galois Fields
Authors:
H. M. de Oliveira,
R. C. de Oliveira
Abstract:
It is shown how binary sequences can be associated with automatic composition of monophonic pieces. We are concerned with the composition of e-music from finite field structures. The information at the input may be either random or information from a black-and-white, grayscale or color picture. New e-compositions and music scores are made available, including a new piece from the famous Lenna picture: the score of the e-music "Between Lenna's eyes in C major." The corresponding stretch of music score is presented. Some particular structures, including clock arithmetic (mod 12), GF(7), GF(8), GF(13) and GF(17), are addressed. Further, multilevel block codes are also used in a new approach to e-music composition, engendering a particular style as an e-composer. As an example, the recently introduced Pascal multilevel block codes are handled to generate a new style of electronic music over GF(13).
Submitted 19 September, 2017;
originally announced September 2017.
-
Understanding MIDI: A Painless Tutorial on Midi Format
Authors:
H. M. de Oliveira,
R. C. de Oliveira
Abstract:
A short overview demystifying the MIDI audio format is presented. The goal is to explain the file structure and how the instructions are used to produce a music signal, for both monophonic and polyphonic signals.
Submitted 15 May, 2017;
originally announced May 2017.
-
Multiuser Communication Based on the DFT Eigenstructure
Authors:
R. M. Campello de Souza,
H. M. de Oliveira,
R. J. Cintra
Abstract:
The eigenstructure of the discrete Fourier transform (DFT) is examined and new systematic procedures to generate eigenvectors of the unitary DFT are proposed. DFT eigenvectors are suggested as user signatures for data communication over the real adder channel (RAC). The proposed multiuser communication system over the 2-user RAC is detailed.
Submitted 6 February, 2017;
originally announced February 2017.
-
Performance Assessment of WhatsApp and IMO on Android Operating System (Lollipop and KitKat) during VoIP calls using 3G or WiFi
Authors:
R. C. de Oliveira,
H. M. de Oliveira,
R. A. Ramalho,
L. P. S. Viana
Abstract:
This paper assesses the performance of mobile messaging and VoIP connections. We investigate the CPU usage of WhatsApp and IMO under different scenarios. This analysis also enabled a comparison of the performance of these applications on two Android operating system (OS) versions: KitKat and Lollipop. Two models of smartphones were considered, viz. Galaxy Note 4 and Galaxy S4. The applications' behavior was statistically investigated for both sending and receiving VoIP calls. Connections have been examined over 3G and WiFi. The handset model plays a decisive role in the CPU usage of the application. t-tests showed that IMO performs better than WhatsApp on Galaxy Note 4, regardless of the Android version, at a 1% significance level. In contrast, WhatsApp requires less CPU than IMO on Galaxy S4, regardless of the OS and access type (3G/WiFi). Galaxy Note 4 using WiFi always outperformed S4 in terms of processing efficiency.
Submitted 14 May, 2016; v1 submitted 3 March, 2016;
originally announced March 2016.
-
A New Information Theoretical Concept: Information-Weighted Heavy-tailed Distributions
Authors:
H. M. de Oliveira,
R. J. Cintra
Abstract:
Given an arbitrary continuous probability density function, a conjugated probability density is introduced, which is defined through the Shannon information associated with its cumulative distribution function. These new densities are computed from a number of standard distributions, including uniform, normal, exponential, Pareto, logistic, Kumaraswamy, Rayleigh, Cauchy, Weibull, and Maxwell-Boltzmann. The case of joint information-weighted probability distribution is assessed. An additive property is derived in the case of independent variables. One-sided and two-sided information-weighting are considered. The asymptotic behavior of the tail of the new distributions is examined. It is proved that all probability densities proposed here define heavy-tailed distributions. It is shown that the weighting of distributions regularly varying with extreme-value index $\alpha > 0$ still results in a regular variation distribution with the same index. This approach can be particularly valuable in applications where the tails of the distribution play a major role.
Submitted 24 January, 2016;
originally announced January 2016.
-
Reviewing Data Visualization: an Analytical Taxonomical Study
Authors:
Jose F. Rodrigues Jr.,
Agma J. M. Traina,
Maria Cristina F. de Oliveira,
Caetano Traina Jr
Abstract:
This paper presents an analytical taxonomy that can suitably describe, rather than simply classify, techniques for data presentation. Unlike previous works, we do not consider particular aspects of visualization techniques, but their mechanisms and foundational vision perception. Instead of just adjusting visualization research to a classification system, our aim is to better understand its process. For doing so, we depart from elementary concepts to reach a model that can describe how visualization techniques work and how they convey meaning.
Submitted 9 June, 2015;
originally announced June 2015.
-
Inflexibility and independence: Phase transitions in the majority-rule model
Authors:
Nuno Crokidakis,
Paulo Murilo Castro de Oliveira
Abstract:
In this work we study opinion formation in a population participating in a public debate with two distinct choices. We considered three distinct mechanisms of social interactions and individuals' behavior: conformity, nonconformity and inflexibility. The conformity is ruled by the majority-rule dynamics, whereas the nonconformity is introduced in the population as an independent behavior, implying the failure of attempted group influence. Finally, the inflexible agents are introduced in the population with a given density. These individuals present a singular behavior, in a way that their stubbornness makes them reluctant to change their opinions. We consider these effects separately and all together, with the aim to analyze the critical behavior of the system. We performed numerical simulations in some lattice structures and for distinct population sizes, and our results suggest that the different formulations of the model undergo order-disorder phase transitions in the same universality class of the Ising model. Some of our results are complemented by analytical calculations.
Submitted 19 November, 2015; v1 submitted 19 May, 2015;
originally announced May 2015.
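The three mechanisms named in the abstract above (majority rule, independence, inflexibility) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' model: it assumes a fully connected population rather than a lattice, groups of three agents, an independence probability `q` with which an agent ignores the group and picks an opinion at random, and a fraction `d` of inflexible agents who never update. The function name and all parameter names are hypothetical.

```python
import random


def simulate_majority_rule(n_agents=300, q=0.1, d=0.1, n_steps=20000, seed=0):
    """Toy majority-rule dynamics with independence (q) and inflexibility (d).

    Assumptions (not taken from the paper): fully connected population,
    groups of 3, independent agents choose +1 or -1 with equal probability.
    """
    rng = random.Random(seed)
    opinions = [rng.choice((-1, 1)) for _ in range(n_agents)]
    initial = list(opinions)

    # A fixed fraction d of agents never changes opinion.
    inflexible = set(rng.sample(range(n_agents), int(d * n_agents)))

    for _ in range(n_steps):
        group = rng.sample(range(n_agents), 3)
        # The sum of three +/-1 values is odd, so a majority always exists.
        majority = 1 if sum(opinions[i] for i in group) > 0 else -1
        for i in group:
            if i in inflexible:
                continue  # stubborn agent keeps its opinion
            if rng.random() < q:
                opinions[i] = rng.choice((-1, 1))  # independent behavior
            else:
                opinions[i] = majority  # conformity

    # Order parameter: absolute value of the mean opinion.
    order_parameter = abs(sum(opinions)) / n_agents
    return opinions, inflexible, initial, order_parameter
```

Sweeping `q` or `d` and plotting the order parameter is the usual way to look for an order-disorder transition in models of this kind; the sketch makes no claim about where such a transition occurs.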
-
Efficient Multiplex for Band-Limited Channels: Galois-Field Division Multiple Access
Authors:
H. M. de Oliveira,
R. M. Campello de Souza,
A. N. Kauffman
Abstract:
A new bandwidth-efficient code-division multiple-access (CDMA) scheme for band-limited channels, based on finite field transforms, is introduced. A multilevel code division multiplex exploits orthogonality properties of nonbinary sequences defined over a complex finite field. Galois-Fourier transforms contain some redundancy, and only cyclotomic coefficients need to be transmitted, yielding compact spectrum requirements. The primary advantage of such schemes over classical multiplexing is their better spectral efficiency. This paper estimates the bandwidth compactness factor relative to Time Division Multiple Access (TDMA), showing that it strongly depends on the alphabet extension. These multiplex schemes, termed Galois Division Multiplex (GDM), are based on transforms for which fast algorithms exist. They are also convenient from the implementation viewpoint, since they can be implemented on a Digital Signal Processor.
Submitted 15 May, 2015;
originally announced May 2015.
-
Application of Enhanced-2D-CWT in Topographic Images for Mapping Landslide Risk Areas
Authors:
V. V. Vermehren Valenzuela,
R. D. Lins,
H. M. de Oliveira
Abstract:
There have lately been a number of catastrophic landslides and mudslides in the mountainous region of Rio de Janeiro, Brazil. These were caused by intense rain in localities with unplanned occupation of the slopes of hills and mountains. Thus, it became imperative to create an inventory of landslide risk areas in densely populated cities. This work presents a way of demarcating risk areas by applying the bidimensional Continuous Wavelet Transform (2D-CWT) to high-resolution topographic images of the mountainous region of Rio de Janeiro.
Submitted 15 April, 2015;
originally announced April 2015.
-
Spread-Spectrum Based on Finite Field Fourier Transforms
Authors:
H. M. de Oliveira,
J. P. C. L. Miranda,
R. M. Campello de Souza
Abstract:
Spread-spectrum systems based on Finite Field Fourier Transforms are presented. Orthogonal spreading sequences defined over a finite field are derived. New digital multiplex schemes based on such spread-spectrum systems, which are multilevel Code Division Multiplex schemes, are also introduced. These schemes, termed Galois-field Division Multiplex (GDM), offer compact bandwidth requirements because only the leaders of cyclotomic cosets need to be transmitted.
Submitted 12 February, 2015;
originally announced March 2015.
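The abstract above refers to leaders of cyclotomic cosets. As a minimal sketch of that notion only (not of the GDM scheme itself), the following computes the q-cyclotomic cosets modulo n, that is, the orbits of the map s -> s*q mod n, and takes the smallest element of each as its leader. For a sequence over GF(q), the finite field Fourier spectrum satisfies a conjugacy constraint that ties together the components within one coset, which is why one component per coset can suffice. The function names are hypothetical.

```python
from math import gcd


def cyclotomic_cosets(q, n):
    """Return the q-cyclotomic cosets modulo n.

    Each coset is the orbit {s, s*q, s*q^2, ...} mod n and is listed starting
    from its smallest element. Requires gcd(q, n) == 1 so that multiplication
    by q permutes the residues modulo n.
    """
    if gcd(q, n) != 1:
        raise ValueError("q and n must be coprime")
    seen = set()
    cosets = []
    for s in range(n):
        if s in seen:
            continue
        coset = []
        x = s
        while x not in seen:
            seen.add(x)
            coset.append(x)
            x = (x * q) % n
        cosets.append(coset)
    return cosets


def coset_leaders(q, n):
    """Smallest element of each q-cyclotomic coset modulo n."""
    return [coset[0] for coset in cyclotomic_cosets(q, n)]
```

For q = 2 and n = 7 the cosets are {0}, {1, 2, 4} and {3, 6, 5}, so only three of the seven spectral indices are leaders.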
-
A Low-throughput Wavelet-based Steganography Audio Scheme
Authors:
P. Carrion,
H. M. de Oliveira,
R. M. Campello de Souza
Abstract:
This paper presents preliminary work on a novel steganography scheme and introduces the idea of combining two secret keys in the operation. The first secret key encrypts the text using a standard cryptographic scheme (e.g. IDEA, SAFER+, etc.) prior to the wavelet audio decomposition. Embedding the cipher text in the file requires another key, namely a stego-key, which is associated with features of the audio wavelet analysis.
Submitted 4 February, 2015;
originally announced March 2015.
-
Quantum Decoding with Venn Diagrams
Authors:
C. M. F. Barros,
Francisco Marcos de Assis,
H. M. de Oliveira
Abstract:
Quantum error correction theory is, as a rule, formulated in a rather convoluted way in comparison to classical algebraic theory. This work revisits error correction in a noisy quantum channel so as to make it intelligible to engineers. An illustrative example is presented of a naive perfect quantum code (Hamming-like code) with five qubits for transmitting a single qubit of information. The (9,1) Shor code is also addressed.
Submitted 14 March, 2015;
originally announced March 2015.
-
Radix-2 Fast Hartley Transform Revisited
Authors:
H. M. de Oliveira,
V. L. Sousa,
H. A. N.,
R. M. Campello de Souza
Abstract:
A fast algorithm for the Discrete Hartley Transform (DHT) is presented, which resembles the radix-2 fast Fourier transform (FFT). Although fast DHTs are already known, this new approach sheds some light on the deep relationship between fast DHT algorithms and a multiplication-free fast algorithm for the Hadamard Transform.
Submitted 12 March, 2015;
originally announced March 2015.
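For readers unfamiliar with the transform named in the abstract above, the DHT of a real sequence x of length N is H[k] = sum over n of x[n] * cas(2*pi*n*k/N), with cas(t) = cos(t) + sin(t). The sketch below gives the direct definition and the textbook radix-2 decimation-in-time recursion, which follows from the identity cas(a + b) = cos(b) cas(a) + sin(b) cas(-a). It is an illustration of a standard fast DHT, not the algorithm proposed in the paper, and the function names are hypothetical.

```python
import math


def dht_direct(x):
    """O(N^2) Discrete Hartley Transform straight from the definition."""
    n = len(x)
    return [
        sum(
            x[t] * (math.cos(2 * math.pi * k * t / n) + math.sin(2 * math.pi * k * t / n))
            for t in range(n)
        )
        for k in range(n)
    ]


def fht_radix2(x):
    """Textbook radix-2 decimation-in-time fast Hartley transform.

    The length must be a power of two. With E and O the half-length DHTs of
    the even- and odd-indexed samples (period M = N/2):
        H[k] = E[k] + cos(2*pi*k/N) * O[k] + sin(2*pi*k/N) * O[-k]
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [float(x[0])]
    m = n // 2
    even = fht_radix2(x[0::2])
    odd = fht_radix2(x[1::2])
    out = []
    for k in range(n):
        angle = 2 * math.pi * k / n
        out.append(
            even[k % m]
            + math.cos(angle) * odd[k % m]
            + math.sin(angle) * odd[(-k) % m]
        )
    return out
```

The DHT is its own inverse up to a factor of 1/N, so applying `fht_radix2` twice and dividing by the length recovers the input.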
-
The Discrete Cosine Transform over Prime Finite Fields
Authors:
M. M. Campello de Souza,
H. M. de Oliveira,
R. M. Campello de Souza,
M. M. Vasconcelos
Abstract:
This paper examines finite field trigonometry as a tool to construct trigonometric digital transforms. In particular, by using properties of the k-cosine function over GF(p), the Finite Field Discrete Cosine Transform (FFDCT) is introduced. The FFDCT pair in GF(p) is defined, having blocklengths that are divisors of (p+1)/2. A special case is the Mersenne FFDCT, defined when p is a Mersenne prime. In this instance blocklengths that are powers of two are possible and radix-2 fast algorithms can be used to compute the transform.
Submitted 12 March, 2015;
originally announced March 2015.
-
Fourier Codes
Authors:
R. M. Campello de Souza,
E. S. V. Freire,
H. M. de Oliveira
Abstract:
A new family of error-correcting codes, called Fourier codes, is introduced. The code parity-check matrix, dimension and an upper bound on its minimum distance are obtained from the eigenstructure of the Fourier number theoretic transform. A decoding technique for such codes is proposed.
Submitted 11 March, 2015;
originally announced March 2015.
-
New Algorithms for Computing a Single Component of the Discrete Fourier Transform
Authors:
G. Jerônimo da Silva Jr.,
R. M. Campello de Souza,
H. M. de Oliveira
Abstract:
This paper introduces the theory and hardware implementation of two new algorithms for computing a single component of the discrete Fourier transform. In terms of multiplicative complexity, both algorithms are, in general, more efficient than the well-known Goertzel algorithm.
Submitted 9 March, 2015;
originally announced March 2015.
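The abstract above uses the Goertzel algorithm as its baseline. As a point of reference, here is a minimal sketch of that baseline, not of the two new algorithms the paper proposes. It computes the single DFT component X[k] with the second-order recursion s[n] = x[n] + 2 cos(w) s[n-1] - s[n-2], w = 2*pi*k/N, followed by one complex correction X[k] = exp(jw) s[N-1] - s[N-2]. The function names are hypothetical.

```python
import cmath
import math


def goertzel(x, k):
    """Single DFT component X[k] of the sequence x via the Goertzel recursion.

    For real input the loop needs one real multiplication per sample; the
    complex arithmetic is confined to the final step.
    """
    n = len(x)
    if n == 0:
        raise ValueError("input must be non-empty")
    w = 2 * math.pi * k / n
    coeff = 2 * math.cos(w)
    s1 = 0.0  # s[n-1]
    s2 = 0.0  # s[n-2]
    for sample in x:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    return cmath.exp(1j * w) * s1 - s2


def dft_component(x, k):
    """Direct evaluation of X[k], used only to check the recursion."""
    n = len(x)
    return sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
```

Comparing `goertzel(x, k)` with `dft_component(x, k)` for every k is a quick sanity check on the recursion.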
-
The Genetic Code revisited: Inner-to-outer map, 2D-Gray map, and World-map Genetic Representations
Authors:
H. M. de Oliveira,
N. S. Santos-Magalhaes
Abstract:
How should the genetic code be represented? Although it is extensively known, the mapping of DNA into proteins remains one of the relevant discoveries of genetics. However, modern genomic signal processing usually requires converting symbolic DNA strings into complex-valued signals in order to take full advantage of a broad variety of digital processing techniques. The genetic code is revisited in this paper, which addresses alternative representations of it that can be useful for genomic signal processing. Three original representations are discussed. The inner-to-outer map builds on the unbalanced role of the nucleotides of a 'codon' and seems suitable for handling information-theory-based matters. The two-dimensional Gray map representation is offered as a mathematically structured map that can help in interpreting spectrograms or scalograms. Finally, the world-map representation of the genetic code is investigated, which can be particularly valuable for educational purposes, besides furnishing plenty of room for the application of distance-based algorithms.
Submitted 5 March, 2015;
originally announced March 2015.
-
Genomic Imaging Based on Codongrams and a^2grams
Authors:
E. A. Bouton,
H. M. de Oliveira,
R. M. Campello de Souza,
N. S. Santos-Magalhaes
Abstract:
This paper introduces new tools for genomic signal processing, which can assist in extracting genomic attributes or describing biologically meaningful features embedded in DNA. Codongrams and a^2grams are offered as an alternative to spectrograms and scalograms. Twenty different a^2grams are defined for a genome, one for each amino acid (a valgram is an a^2gram for valine, an alagram is an a^2gram for alanine, and so on). They provide information about the distribution and occurrence of the investigated amino acid. In particular, the metgram can be used to find potential start positions of genes within a genome. This approach can help in implementing a new diagnostic test for genetic diseases by providing a type of DNA-medical imaging.
Submitted 5 March, 2015;
originally announced March 2015.