-
MorphGuard: Morph Specific Margin Loss for Enhancing Robustness to Face Morphing Attacks
Authors:
Iurii Medvedev,
Nuno Goncalves
Abstract:
Face recognition has evolved significantly with the advancement of deep learning techniques, enabling its widespread adoption in various applications requiring secure authentication. However, this progress has also increased its exposure to presentation attacks, including face morphing, which poses a serious security threat by allowing one identity to impersonate another. Therefore, modern face recognition systems must be robust against such attacks.
In this work, we propose a novel approach for training deep networks for face recognition with enhanced robustness to face morphing attacks. Our method modifies the classification task by introducing a dual-branch classification strategy that effectively handles the ambiguity in the labeling of face morphs. This adaptation allows the model to incorporate morph images into the training process, improving its ability to distinguish them from bona fide samples.
Our strategy has been validated on public benchmarks, demonstrating its effectiveness in enhancing robustness against face morphing attacks. Furthermore, our approach is universally applicable and can be integrated into existing face recognition training pipelines to improve classification-based recognition methods.
Submitted 15 May, 2025;
originally announced May 2025.
-
AdaSplash: Adaptive Sparse Flash Attention
Authors:
Nuno Gonçalves,
Marcos Treviso,
André F. T. Martins
Abstract:
The computational cost of softmax-based attention in transformers limits their applicability to long-context tasks. Adaptive sparsity, of which $α$-entmax attention is an example, offers a flexible data-dependent alternative, but existing implementations are inefficient and do not leverage the sparsity to obtain runtime and memory gains. In this work, we propose AdaSplash, which combines the efficiency of GPU-optimized algorithms with the sparsity benefits of $α$-entmax. We first introduce a hybrid Halley-bisection algorithm, resulting in a 7-fold reduction in the number of iterations needed to compute the $α$-entmax transformation. Then, we implement custom Triton kernels to efficiently handle adaptive sparsity. Experiments with RoBERTa and ModernBERT for text classification and single-vector retrieval, along with GPT-2 for language modeling, show that our method achieves substantial improvements in runtime and memory efficiency compared to existing $α$-entmax implementations. It approaches -- and in some cases surpasses -- the efficiency of highly optimized softmax implementations like FlashAttention-2, enabling long-context training while maintaining strong task performance.
Submitted 17 February, 2025;
originally announced February 2025.
-
Quadruplet Loss For Improving the Robustness to Face Morphing Attacks
Authors:
Iurii Medvedev,
Nuno Gonçalves
Abstract:
Recent advancements in deep learning have revolutionized technology and security measures, necessitating robust identification methods. Biometric approaches, leveraging personalized characteristics, offer a promising solution. However, Face Recognition Systems are vulnerable to sophisticated attacks, notably face morphing techniques, enabling the creation of fraudulent documents. In this study, we introduce a novel quadruplet loss function for increasing the robustness of face recognition systems against morphing attacks. Our approach involves specific sampling of face image quadruplets, combined with face morphs, for network training. Experimental results demonstrate the efficiency of our strategy in improving the robustness of face recognition networks against morphing attacks.
Submitted 22 February, 2024;
originally announced February 2024.
-
Fused Classification For Differential Face Morphing Detection
Authors:
Iurii Medvedev,
Joana Pimenta,
Nuno Gonçalves
Abstract:
Face morphing, a sophisticated presentation attack technique, poses significant security risks to face recognition systems. Traditional methods struggle to detect morphing attacks, which involve blending multiple face images to create a synthetic image that can match different individuals. In this paper, we focus on the differential detection of face morphing and propose an extended approach based on a fused classification method for the no-reference scenario. We introduce a public face morphing detection benchmark for the differential scenario and utilize a specific data mining technique to enhance the performance of our approach. Experimental results demonstrate the effectiveness of our method in detecting morphing attacks.
Submitted 1 September, 2023;
originally announced September 2023.
-
Impact of Image Context for Single Deep Learning Face Morphing Attack Detection
Authors:
Joana Pimenta,
Iurii Medvedev,
Nuno Gonçalves
Abstract:
The increase in security concerns due to technological advancements has led to the popularity of biometric approaches that utilize physiological or behavioral characteristics for enhanced recognition. Face recognition systems (FRSs) have become prevalent, but they are still vulnerable to image manipulation techniques such as face morphing attacks. This study investigates the impact of the alignment settings of input images on deep learning face morphing detection performance. We analyze the interconnections between the face contour and image context and suggest optimal alignment conditions for face morphing detection.
Submitted 1 September, 2023;
originally announced September 2023.
-
Neural Implicit Morphing of Face Images
Authors:
Guilherme Schardong,
Tiago Novello,
Hallison Paz,
Iurii Medvedev,
Vinícius da Silva,
Luiz Velho,
Nuno Gonçalves
Abstract:
Face morphing is a problem in computer graphics with numerous artistic and forensic applications. It is challenging due to variations in pose, lighting, gender, and ethnicity. This task consists of a warping for feature alignment and a blending for a seamless transition between the warped images. We propose to leverage coord-based neural networks to represent such warpings and blendings of face images. During training, we exploit the smoothness and flexibility of such networks by combining energy functionals employed in classical approaches without discretizations. Additionally, our method is time-dependent, allowing a continuous warping/blending of the images. During morphing inference, we need both direct and inverse transformations of the time-dependent warping. The first (second) is responsible for warping the target (source) image into the source (target) image. Our neural warping stores those maps in a single network dismissing the need for inverting them. The results of our experiments indicate that our method is competitive with both classical and generative models under the lens of image quality and face-morphing detectors. Aesthetically, the resulting images present a seamless blending of diverse faces not yet usual in the literature.
Submitted 13 June, 2024; v1 submitted 26 August, 2023;
originally announced August 2023.
-
The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot
Authors:
Lucas Prado Osco,
Qiusheng Wu,
Eduardo Lopes de Lemos,
Wesley Nunes Gonçalves,
Ana Paula Marques Ramos,
Jonathan Li,
José Marcato Junior
Abstract:
Segmentation is an essential step for remote sensing image processing. This study aims to advance the application of the Segment Anything Model (SAM), an innovative image segmentation model by Meta AI, in the field of remote sensing image analysis. SAM is known for its exceptional generalization capabilities and zero-shot learning, making it a promising approach to processing aerial and orbital images from diverse geographical contexts. Our exploration involved testing SAM across multi-scale datasets using various input prompts, such as bounding boxes, individual points, and text descriptors. To enhance the model's performance, we implemented a novel automated technique that combines a text-prompt-derived general example with one-shot training. This adjustment resulted in an improvement in accuracy, underscoring SAM's potential for deployment in remote sensing imagery and reducing the need for manual annotation. Despite the limitations encountered with lower spatial resolution images, SAM exhibits promising adaptability to remote sensing data analysis. We recommend future research to enhance the model's proficiency through integration with supplementary fine-tuning techniques and other networks. Furthermore, we provide the open-source code of our modifications on online repositories, encouraging further and broader adaptations of SAM to the remote sensing domain.
Submitted 31 October, 2023; v1 submitted 28 June, 2023;
originally announced June 2023.
-
MTLSegFormer: Multi-task Learning with Transformers for Semantic Segmentation in Precision Agriculture
Authors:
Diogo Nunes Goncalves,
Jose Marcato Junior,
Pedro Zamboni,
Hemerson Pistori,
Jonathan Li,
Keiller Nogueira,
Wesley Nunes Goncalves
Abstract:
Multi-task learning has proven to be effective in improving the performance of correlated tasks. Most of the existing methods use a backbone to extract initial features with independent branches for each task, and the exchange of information between the branches usually occurs through the concatenation or sum of the feature maps of the branches. However, this type of information exchange does not directly consider the local characteristics of the image nor the level of importance or correlation between the tasks. In this paper, we propose a semantic segmentation method, MTLSegFormer, which combines multi-task learning and attention mechanisms. After the backbone feature extraction, two feature maps are learned for each task. The first map is proposed to learn features related to its task, while the second map is obtained by applying learned visual attention to locally re-weigh the feature maps of the other tasks. In this way, weights are assigned to local regions of the image of other tasks that have greater importance for the specific task. Finally, the two maps are combined and used to solve a task. We tested the performance in two challenging problems with correlated tasks and observed a significant improvement in accuracy, mainly in tasks with high dependence on the others.
Submitted 4 May, 2023;
originally announced May 2023.
-
The Potential of Visual ChatGPT For Remote Sensing
Authors:
Lucas Prado Osco,
Eduardo Lopes de Lemos,
Wesley Nunes Gonçalves,
Ana Paula Marques Ramos,
José Marcato Junior
Abstract:
Recent advancements in Natural Language Processing (NLP), particularly in Large Language Models (LLMs), associated with deep learning-based computer vision techniques, have shown substantial potential for automating a variety of tasks. One notable model is Visual ChatGPT, which combines ChatGPT's LLM capabilities with visual computation to enable effective image analysis. The model's ability to process images based on textual inputs can revolutionize diverse fields. However, its application in the remote sensing domain remains unexplored. This is the first paper to examine the potential of Visual ChatGPT, a cutting-edge LLM founded on the GPT architecture, to tackle the aspects of image processing related to the remote sensing domain. Among its current capabilities, Visual ChatGPT can generate textual descriptions of images, perform canny edge and straight line detection, and conduct image segmentation. These offer valuable insights into image content and facilitate the interpretation and extraction of information. By exploring the applicability of these techniques within publicly available datasets of satellite images, we demonstrate the current model's limitations in dealing with remote sensing images, highlighting its challenges and future prospects. Although still in early development, we believe that the combination of LLMs and visual models holds a significant potential to transform remote sensing image processing, creating accessible and practical application opportunities in the field.
Submitted 5 July, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
RADAM: Texture Recognition through Randomized Aggregated Encoding of Deep Activation Maps
Authors:
Leonardo Scabini,
Kallil M. Zielinski,
Lucas C. Ribas,
Wesley N. Gonçalves,
Bernard De Baets,
Odemir M. Bruno
Abstract:
Texture analysis is a classical yet challenging task in computer vision for which deep neural networks are actively being applied. Most approaches are based on building feature aggregation modules around a pre-trained backbone and then fine-tuning the new architecture on specific texture recognition tasks. Here we propose a new method named Random encoding of Aggregated Deep Activation Maps (RADAM) which extracts rich texture representations without ever changing the backbone. The technique consists of encoding the output at different depths of a pre-trained deep convolutional network using a Randomized Autoencoder (RAE). The RAE is trained locally to each image using a closed-form solution, and its decoder weights are used to compose a 1-dimensional texture representation that is fed into a linear SVM. This means that no fine-tuning or backpropagation is needed. We explore RADAM on several texture benchmarks and achieve state-of-the-art results with different computational budgets. Our results suggest that pre-trained backbones may not require additional fine-tuning for texture recognition if their learned representations are better encoded.
Submitted 8 March, 2023;
originally announced March 2023.
-
Young Labeled Faces in the Wild (YLFW): A Dataset for Children Faces Recognition
Authors:
Iurii Medvedev,
Farhad Shadmand,
Nuno Gonçalves
Abstract:
Face recognition has achieved outstanding performance in the last decade with the development of deep learning techniques.
Nowadays, the challenges in face recognition are related to specific scenarios, for instance, the performance under diverse image quality, the robustness for aging and edge cases of person age (children and elders), distinguishing of related identities.
In this set of problems, recognizing children's faces is one of the most sensitive and important. One of the reasons for this problem is the existing bias towards adults in existing face datasets.
In this work, we present a benchmark dataset for children's face recognition, which is compiled similarly to the famous face recognition benchmarks LFW, CALFW, CPLFW, XQLFW and AgeDB.
We also present a development dataset (separated into train and test parts) for adapting face recognition models for face images of children.
The proposed data is balanced for African, Asian, Caucasian, and Indian races. To the best of our knowledge, this is the first standardized data tool set for benchmarking and the largest collection for development for children's face recognition. Several face recognition experiments are presented to demonstrate the performance of the proposed data tool set.
Submitted 13 January, 2023;
originally announced January 2023.
-
MorDeephy: Face Morphing Detection Via Fused Classification
Authors:
Iurii Medvedev,
Farhad Shadmand,
Nuno Gonçalves
Abstract:
Face morphing attack detection (MAD) is one of the most challenging tasks in the field of face recognition nowadays. In this work, we introduce a novel deep learning strategy for a single image face morphing detection, which implies the discrimination of morphed face images along with a sophisticated face recognition task in a complex classification scheme. It is directed onto learning the deep facial features, which carry information about the authenticity of these features. Our work also introduces several additional contributions: the public and easy-to-use face morphing detection benchmark and the results of our wild datasets filtering strategy. Our method, which we call MorDeephy, achieved the state of the art performance and demonstrated a prominent ability for generalising the task of morphing detection to unseen scenarios.
Submitted 5 August, 2022;
originally announced August 2022.
-
Energy Efficiency of Web Browsers in the Android Ecosystem
Authors:
Nélson Gonçalves,
Rui Rua,
Jácome Cunha,
Rui Pereira,
João Saraiva
Abstract:
This paper presents an empirical study regarding the energy consumption of the most used web browsers on the Android ecosystem. In order to properly compare the web browsers in terms of energy consumption, we defined a set of typical usage scenarios to be replicated in the different browsers, executed in the same testing environment and conditions. The results of our study show that there are significant differences in terms of energy consumption among the considered browsers. Furthermore, we conclude that some browsers are energy efficient in several user actions, but energy greedy in other ones, allowing us to conclude that no browser is universally more efficient for all usage scenarios.
Submitted 23 May, 2022;
originally announced May 2022.
-
Reducing Overconfidence Predictions for Autonomous Driving Perception
Authors:
Gledson Melotti,
Cristiano Premebida,
Jordan J. Bird,
Diego R. Faria,
Nuno Gonçalves
Abstract:
In state-of-the-art deep learning for object recognition, SoftMax and Sigmoid functions are most commonly employed as the predictor outputs. Such layers often produce overconfident predictions rather than proper probabilistic scores, which can thus harm the decision-making of 'critical' perception systems applied in autonomous driving and robotics. Given this, the experiments in this work propose a probabilistic approach based on distributions calculated out of the Logit layer scores of pre-trained networks. We demonstrate that Maximum Likelihood (ML) and Maximum a-Posteriori (MAP) functions are more suitable for probabilistic interpretations than SoftMax and Sigmoid-based predictions for object recognition. We explore distinct sensor modalities via RGB images and LiDARs (RV: range-view) data from the KITTI and Lyft Level-5 datasets, where our approach shows promising performance compared to the usual SoftMax and Sigmoid layers, with the benefit of enabling interpretable probabilistic predictions. Another advantage of the approach introduced in this paper is that the ML and MAP functions can be implemented in existing trained networks, that is, the approach benefits from the output of the Logit layer of pre-trained networks. Thus, there is no need to carry out a new training phase since the ML and MAP functions are used in the test/prediction phase.
Submitted 11 May, 2022; v1 submitted 15 February, 2022;
originally announced February 2022.
-
Stepwise Migration of a Monolith to a Microservices Architecture: Performance and Migration Effort Evaluation
Authors:
Diogo Faustino,
Nuno Gonçalves,
Manuel Portela,
António Rito Silva
Abstract:
The agility inherent to today's business promotes the definition of software architectures where the business entities are decoupled into modules and/or services. However, there are advantages in having a rich domain model, where domain entities are tightly connected, because it fosters an initial quick development. On the other hand, the split of the business logic into modules and/or services, its encapsulation through well-defined interfaces and the introduction of inter-service communication introduce a cost in terms of performance. In this paper we analyze the stepwise migration of a monolith, using a rich domain object, into a microservice architecture, where a modular monolith architecture is used as an intermediate step. The impact on the migration effort and on performance is measured for both steps. The current state of the art analyzes the migration of monolith systems to a microservices architecture, but we observed that migration effort and performance issues are already significant in the migration to a modular monolith. Therefore, a clear distinction is established for each one of the steps, which may inform software architects on the planning of the migration of monolith systems, in particular on the trade-offs between carrying out the whole migration process and migrating only to a modular monolith.
Submitted 17 January, 2022;
originally announced January 2022.
-
Probabilistic Approach for Road-Users Detection
Authors:
G. Melotti,
W. Lu,
P. Conde,
D. Zhao,
A. Asvadi,
N. Gonçalves,
C. Premebida
Abstract:
Object detection in autonomous driving applications implies the detection and tracking of semantic objects that are commonly native to urban driving environments, such as pedestrians and vehicles. One of the major challenges in state-of-the-art deep-learning based object detection is false positives that occur with overconfident scores. This is highly undesirable in autonomous driving and other critical robotic-perception domains because of safety concerns. This paper proposes an approach to alleviate the problem of overconfident predictions by introducing a novel probabilistic layer to deep object detection networks in testing. The suggested approach avoids the traditional Sigmoid or Softmax prediction layer which often produces overconfident predictions. It is demonstrated that the proposed technique reduces overconfidence in the false positives without degrading the performance on the true positives. The approach is validated on 2D-KITTI object detection through YOLOV4 and SECOND (a Lidar-based detector). The proposed approach enables interpretable probabilistic predictions without the requirement of re-training the network and therefore is very practical.
Submitted 21 April, 2023; v1 submitted 2 December, 2021;
originally announced December 2021.
-
Abordagem probabilística para análise de confiabilidade de dados gerados em sequenciamentos multiplex na plataforma ABI SOLiD
Authors:
Fabio M. F. Lobato,
Carlos D. N. Damasceno,
Péricles L. Machado,
Nandamudi L. Vijaykumar,
André R. dos Santos,
Sylvain H. Darnet,
André N. A. Gonçalves,
Dayse O. de Alencar,
Ádamo L. de Santana
Abstract:
Next-generation sequencers such as the Illumina and SOLiD platforms generate a large amount of data, commonly above 10 Gigabytes of text files. In particular, the SOLiD platform allows the sequencing of multiple samples in a single run, called a multiplex run, through a tagging system called Barcode. This feature requires a computational process to separate the data by sample, because the sequencer provides a mixture of all samples in a single output. This process must be reliable to avoid errors that may compromise further analysis. In this context, we identified the need to develop a probabilistic model capable of assigning a degree of confidence to the marking system used in multiplex sequencing. The results confirmed the adequacy of the model obtained, which can, among other things, guide the filtering of the data and the evaluation of the sequencing protocol used.
Submitted 11 August, 2021; v1 submitted 27 July, 2021;
originally announced July 2021.
-
Semantic Segmentation with Labeling Uncertainty and Class Imbalance
Authors:
Patrik Olã Bressan,
José Marcato Junior,
José Augusto Correa Martins,
Diogo Nunes Gonçalves,
Daniel Matte Freitas,
Lucas Prado Osco,
Jonathan de Andrade Silva,
Zhipeng Luo,
Jonathan Li,
Raymundo Cordero Garcia,
Wesley Nunes Gonçalves
Abstract:
Recently, methods based on Convolutional Neural Networks (CNN) achieved impressive success in semantic segmentation tasks. However, challenges such as the class imbalance and the uncertainty in the pixel-labeling process are not completely addressed. As such, we present a new approach that calculates a weight for each pixel considering its class and uncertainty during the labeling process. The pixel-wise weights are used during training to increase or decrease the importance of the pixels. Experimental results show that the proposed approach leads to significant improvements in three challenging segmentation tasks in comparison to baseline methods. It was also proved to be more invariant to noise. The approach presented here may be used within a wide range of semantic segmentation methods to improve their robustness.
Submitted 8 February, 2021;
originally announced February 2021.
-
Counting and Locating High-Density Objects Using Convolutional Neural Network
Authors:
Mauro dos Santos de Arruda,
Lucas Prado Osco,
Plabiany Rodrigo Acosta,
Diogo Nunes Gonçalves,
José Marcato Junior,
Ana Paula Marques Ramos,
Edson Takashi Matsubara,
Zhipeng Luo,
Jonathan Li,
Jonathan de Andrade Silva,
Wesley Nunes Gonçalves
Abstract:
This paper presents a Convolutional Neural Network (CNN) approach for counting and locating objects in high-density imagery. To the best of our knowledge, this is the first object counting and locating method based on a feature map enhancement and a Multi-Stage Refinement of the confidence map. The proposed method was evaluated on two counting datasets: tree and car. For the tree dataset, our method returned a mean absolute error (MAE) of 2.05, a root-mean-squared error (RMSE) of 2.87 and a coefficient of determination (R$^2$) of 0.986. For the car datasets (CARPK and PUCPR+), our method was superior to state-of-the-art methods. On these datasets, our approach achieved an MAE of 4.45 and 3.16, an RMSE of 6.18 and 4.39, and an R$^2$ of 0.975 and 0.999, respectively. The proposed method is suitable for dealing with high object density, returning state-of-the-art performance for counting and locating objects.
Submitted 8 February, 2021;
originally announced February 2021.
-
A Deep Learning Approach Based on Graphs to Detect Plantation Lines
Authors:
Diogo Nunes Gonçalves,
Mauro dos Santos de Arruda,
Hemerson Pistori,
Vanessa Jordão Marcato Fernandes,
Ana Paula Marques Ramos,
Danielle Elis Garcia Furuya,
Lucas Prado Osco,
Hongjie He,
Jonathan Li,
José Marcato Junior,
Wesley Nunes Gonçalves
Abstract:
Deep learning-based networks are among the most prominent methods to learn linear patterns and extract this type of information from diverse imagery conditions. Here, we propose a deep learning approach based on graphs to detect plantation lines in UAV-based RGB imagery presenting a challenging scenario containing spaced plants. The first module of our method extracts a feature map throughout the backbone, which consists of the initial layers of the VGG16. This feature map is used as an input to the Knowledge Estimation Module (KEM), organized in three concatenated branches for detecting 1) the plant positions, 2) the plantation lines, and 3) for the displacement vectors between the plants. A graph modeling is applied considering each plant position on the image as vertices, and edges are formed between two vertices (i.e. plants). Finally, the edge is classified as pertaining to a certain plantation line based on three probabilities (higher than 0.5): i) in visual features obtained from the backbone; ii) a chance that the edge pixels belong to a line, from the KEM step; and iii) an alignment of the displacement vectors with the edge, also from KEM. Experiments were conducted in corn plantations with different growth stages and patterns with aerial RGB imagery. A total of 564 patches with 256 x 256 pixels were used and randomly divided into training, validation, and testing sets in a proportion of 60\%, 20\%, and 20\%, respectively. The proposed method was compared against state-of-the-art deep learning methods, and achieved superior performance with a significant margin, returning precision, recall, and F1-score of 98.7\%, 91.9\%, and 95.1\%, respectively. This approach is useful in extracting lines with spaced plantation patterns and could be implemented in scenarios where plantation gaps occur, generating lines with few-to-none interruptions.
Submitted 5 February, 2021;
originally announced February 2021.
-
A Review on Deep Learning in UAV Remote Sensing
Authors:
Lucas Prado Osco,
José Marcato Junior,
Ana Paula Marques Ramos,
Lúcio André de Castro Jorge,
Sarah Narges Fatholahi,
Jonathan de Andrade Silva,
Edson Takashi Matsubara,
Hemerson Pistori,
Wesley Nunes Gonçalves,
Jonathan Li
Abstract:
Deep Neural Networks (DNNs) learn representations from data with an impressive capability and have brought important breakthroughs in processing images, time series, natural language, audio, video, and many other data types. In the remote sensing field, surveys and literature reviews specifically covering applications of DNN algorithms have been conducted in an attempt to summarize the amount of information produced in its subfields. Recently, Unmanned Aerial Vehicle (UAV) based applications have dominated aerial sensing research. However, a literature review that combines both the "deep learning" and "UAV remote sensing" themes has not yet been conducted. The motivation for our work was to present a comprehensive review of the fundamentals of Deep Learning (DL) applied in UAV-based imagery. We focused mainly on describing classification and regression techniques used in recent applications with UAV-acquired data. For that, a total of 232 papers published in international scientific journal databases were examined. We gathered the published material and evaluated its characteristics regarding application, sensor, and technique used. We relate how DL presents promising results and has the potential for processing tasks associated with UAV-based image data. Lastly, we project future perspectives, commenting on prominent DL paths to be explored in the UAV remote sensing field. Our review offers a friendly approach to introducing, commenting on, and summarizing the state of the art in UAV-based image applications with DNN algorithms in diverse subfields of remote sensing, grouping it into the environmental, urban, and agricultural contexts.
Submitted 20 August, 2023; v1 submitted 22 January, 2021;
originally announced January 2021.
-
A CNN Approach to Simultaneously Count Plants and Detect Plantation-Rows from UAV Imagery
Authors:
Lucas Prado Osco,
Mauro dos Santos de Arruda,
Diogo Nunes Gonçalves,
Alexandre Dias,
Juliana Batistoti,
Mauricio de Souza,
Felipe David Georges Gomes,
Ana Paula Marques Ramos,
Lúcio André de Castro Jorge,
Veraldo Liesenberg,
Jonathan Li,
Lingfei Ma,
José Marcato Junior,
Wesley Nunes Gonçalves
Abstract:
In this paper, we propose a novel deep learning method based on a Convolutional Neural Network (CNN) that simultaneously detects and geolocates plantation-rows while counting its plants considering highly-dense plantation configurations. The experimental setup was evaluated in a cornfield with different growth stages and in a Citrus orchard. Both datasets characterize different plant density scenarios, locations, types of crops, sensors, and dates. A two-branch architecture was implemented in our CNN method, where the information obtained within the plantation-row is updated into the plant detection branch and retro-feed to the row branch; which are then refined by a Multi-Stage Refinement method. In the corn plantation datasets (with both growth phases, young and mature), our approach returned a mean absolute error (MAE) of 6.224 plants per image patch, a mean relative error (MRE) of 0.1038, precision and recall values of 0.856, and 0.905, respectively, and an F-measure equal to 0.876. These results were superior to the results from other deep networks (HRNet, Faster R-CNN, and RetinaNet) evaluated with the same task and dataset. For the plantation-row detection, our approach returned precision, recall, and F-measure scores of 0.913, 0.941, and 0.925, respectively. To test the robustness of our model with a different type of agriculture, we performed the same task in the citrus orchard dataset. It returned an MAE equal to 1.409 citrus-trees per patch, MRE of 0.0615, precision of 0.922, recall of 0.911, and F-measure of 0.965. For citrus plantation-row detection, our approach resulted in precision, recall, and F-measure scores equal to 0.965, 0.970, and 0.964, respectively. The proposed method achieved state-of-the-art performance for counting and geolocating plants and plant-rows in UAV images from different types of crops.
Submitted 14 February, 2021; v1 submitted 31 December, 2020;
originally announced December 2020.
-
From Time Series to Euclidean Spaces: On Spatial Transformations for Temporal Clustering
Authors:
Nuno Mota Goncalves,
Ioana Giurgiu,
Anika Schumann
Abstract:
Unsupervised clustering of temporal data is both challenging and crucial in machine learning. In this paper, we show that neither traditional clustering methods, nor time series specific methods, nor even deep learning-based alternatives generalise well when both varying sampling rates and high dimensionality are present in the input data. We propose a novel approach to temporal clustering, in which we (1) transform the input time series into a distance-based projected representation by using similarity measures suitable for dealing with temporal data, (2) feed these projections into a multi-layer CNN-GRU autoencoder to generate meaningful domain-aware latent representations, which ultimately (3) allow for a natural separation of clusters that benefits the most important traditional clustering algorithms. We evaluate our approach on time series datasets from various domains and show that it not only outperforms existing methods in all cases, by up to 32%, but is also robust and incurs negligible computation overheads.
Submitted 2 October, 2020;
originally announced October 2020.
-
Probabilistic Object Classification using CNN ML-MAP layers
Authors:
G. Melotti,
C. Premebida,
J. J. Bird,
D. R. Faria,
N. Gonçalves
Abstract:
Deep networks are currently the state-of-the-art for sensory perception in autonomous driving and robotics. However, deep models often generate overconfident predictions precluding proper probabilistic interpretation which we argue is due to the nature of the SoftMax layer. To reduce the overconfidence without compromising the classification performance, we introduce a CNN probabilistic approach based on distributions calculated in the network's Logit layer. The approach enables Bayesian inference by means of ML and MAP layers. Experiments with calibrated and the proposed prediction layers are carried out on object classification using data from the KITTI database. Results are reported for camera ($RGB$) and LiDAR (range-view) modalities, where the new approach shows promising performance compared to SoftMax.
Submitted 24 August, 2020; v1 submitted 29 May, 2020;
originally announced May 2020.
-
Data-driven surrogate modelling and benchmarking for process equipment
Authors:
Gabriel F. N. Gonçalves,
Assen Batchvarov,
Yuyi Liu,
Yuxin Liu,
Lachlan Mason,
Indranil Pan,
Omar K. Matar
Abstract:
In chemical process engineering, surrogate models of complex systems are often necessary for tasks of domain exploration, sensitivity analysis of the design parameters, and optimization. A suite of computational fluid dynamics (CFD) simulations geared toward chemical process equipment modeling has been developed and validated with experimental results from the literature. Various regression-based active learning strategies are explored with these CFD simulators in-the-loop under the constraints of a limited function evaluation budget. Specifically, five different sampling strategies and five regression techniques are compared, considering a set of four test cases of industrial significance and varying complexity. Gaussian process regression was observed to have a consistently good performance for these applications. The present quantitative study outlines the pros and cons of the different available techniques and highlights the best practices for their adoption. The test cases and tools are available with an open-source license to ensure reproducibility and engage the wider research community in contributing to both the CFD models and developing and benchmarking new improved algorithms tailored to this field.
Submitted 8 September, 2020; v1 submitted 13 March, 2020;
originally announced March 2020.
-
Bio-Inspired Modality Fusion for Active Speaker Detection
Authors:
Gustavo Assunção,
Nuno Gonçalves,
Paulo Menezes
Abstract:
Human beings have developed fantastic abilities to integrate information from various sensory sources exploring their inherent complementarity. Perceptual capabilities are therefore heightened, enabling, for instance, the well-known "cocktail party" and McGurk effects, i.e., speech disambiguation from a panoply of sound signals. This fusion ability is also key in refining the perception of sound source location, as in distinguishing whose voice is being heard in a group conversation. Furthermore, neuroscience has successfully identified the superior colliculus region in the brain as the one responsible for this modality fusion, with a handful of biological models having been proposed to approach its underlying neurophysiological process. Deriving inspiration from one of these models, this paper presents a methodology for effectively fusing correlated auditory and visual information for active speaker detection. Such an ability can have a wide range of applications, from teleconferencing systems to social robotics. The detection approach initially routes auditory and visual information through two specialized neural network structures. The resulting embeddings are fused via a novel layer based on the superior colliculus, whose topological structure emulates spatial neuron cross-mapping of unimodal perceptual fields. The validation process employed two publicly available datasets, with achieved results confirming and greatly surpassing initial expectations.
Submitted 13 April, 2021; v1 submitted 28 February, 2020;
originally announced March 2020.
-
Dynamic texture analysis with diffusion in networks
Authors:
Lucas C. Ribas,
Wesley N. Goncalves,
Odemir M. Bruno
Abstract:
Dynamic texture is a field of research that has gained considerable interest from the computer vision community due to the explosive growth of multimedia databases. In addition, dynamic texture is present in a wide range of videos, which makes it very important in video-based expert systems such as medical systems, traffic monitoring systems, and forest fire detection systems, among others. In this paper, a new method for dynamic texture characterization based on diffusion in directed networks is proposed. The dynamic texture is modeled as a directed network. The method consists of analyzing the dynamics of this network after a series of graph cut transformations based on the edge weights. For each network transformation, the activity of each vertex is estimated. The activity is the relative frequency with which a vertex is visited by random walks in balance. Then, a texture descriptor is constructed by concatenating the activity histograms. The main contributions of this paper are the use of directed network modeling and diffusion in networks for dynamic texture characterization. These tend to provide better performance in dynamic texture classification. Experiments with rotation and interference of the motion pattern were conducted in order to demonstrate the robustness of the method. The proposed approach is compared to other dynamic texture methods on two very well-known dynamic texture databases and on traffic condition classification, and outperforms them in most cases.
Submitted 27 June, 2018;
originally announced June 2018.
-
Multilayer Complex Network Descriptors for Color-Texture Characterization
Authors:
Leonardo F S Scabini,
Rayner H M Condori,
Wesley N Gonçalves,
Odemir M Bruno
Abstract:
A new method based on complex networks is proposed for color-texture analysis. The proposal consists of modeling the image as a multilayer complex network where each color channel is a layer, and each pixel (in each color channel) is represented as a network vertex. The network dynamic evolution is assessed using a set of modeling parameters (radii and thresholds), and new characterization techniques are introduced to capture information regarding spatial interaction within and between color channels. An automatic and adaptive approach for threshold selection is also proposed. We conduct classification experiments on 5 well-known datasets: Vistex, Usptex, Outex13, CURet and MBT. Results are compared against various methods from the literature, including deep convolutional neural networks with pre-trained architectures. The proposed method presented the highest overall performance over the 5 datasets, with a mean accuracy of 97.7 against 97.0 achieved by the ResNet convolutional neural network with 50 layers.
Submitted 2 April, 2018;
originally announced April 2018.
-
A smartphone application to measure the quality of pest control spraying machines via image analysis
Authors:
Bruno B. Machado,
Gabriel Spadon,
Mauro S. Arruda,
Wesley N. Goncalves,
Andre C. P. L. F. Carvalho,
Jose F. Rodrigues-Jr
Abstract:
The need for higher agricultural productivity has demanded the intensive use of pesticides. However, their correct use depends on assessment methods that can accurately predict how well the pesticides' spraying covered the intended crop region. Some methods have been proposed in the literature, but their high cost and low portability harm their widespread use. This paper proposes and experimentally evaluates a new methodology based on the use of a smartphone-based mobile application, named DropLeaf. Experiments performed using DropLeaf showed that, in addition to its versatility, it can predict with high accuracy the pesticide spraying. DropLeaf is a five-fold image-processing methodology based on: (i) color space conversion, (ii) threshold noise removal, (iii) convolutional operations of dilation and erosion, (iv) detection of contour markers in the water-sensitive card, and, (v) identification of droplets via the marker-controlled watershed transformation. The authors performed successful experiments over two case studies, the first using a set of synthetic cards and the second using a real-world crop. The proposed tool can be broadly used by farmers equipped with conventional mobile phones, improving the use of pesticides with health, environmental and financial benefits.
Submitted 16 December, 2017; v1 submitted 21 November, 2017;
originally announced November 2017.
-
Texture analysis using deterministic partially self-avoiding walk with thresholds
Authors:
Lucas Correia Ribas,
Wesley Nunes Gonçalves,
Odemir Martinez Bruno
Abstract:
In this paper, we propose a new texture analysis method using the deterministic partially self-avoiding walk performed on maps modified with thresholds. In this method, two pixels of the map are neighbors if the Euclidean distance between them is less than $\sqrt{2}$ and the weight (difference between their intensities) is less than a given threshold. The maps obtained by using different thresholds highlight several properties of the image that are extracted by the deterministic walk. To compose the feature vector, deterministic walks are performed with different thresholds and their statistics are concatenated. Thus, this approach can be considered a multi-scale analysis. We validate our method on the Brodatz database, which is a very well-known public image database widely used by texture analysis methods. Experimental results indicate that the proposed method achieves good texture discrimination, outperforming traditional texture methods.
Submitted 25 November, 2016;
originally announced November 2016.
-
The 12 prophets dataset
Authors:
J. Rodrigues,
M. Gazziro,
N. Goncalves,
O. Neto,
Y. Fernandes,
A. Gimenes,
C. Alegre,
R. Assis
Abstract:
The "Aleijadinho 3D" project is an initiative supported by the University of São Paulo (Museum of Science and Dean of Culture and Extension), which involves the 3D digitization of art works of Brazilian sculptor Antonio Francisco Lisboa, better known as Aleijadinho. The project made use of advanced acquisition and processing of 3D meshes for preservation and dissemination of the cultural heritage. The dissemination occurs through a Web portal, so that the population has the opportunity to see the art works in detail using 3D visualization and interaction. The portal address is http://www.aleijadinho3d.icmc.usp.br. The 3D acquisitions were conducted over a week at the end of July 2013 in the cities of Ouro Preto, MG, Brazil and Congonhas do Campo, MG, Brazil. The scanning was done with special equipment supplied by the company Leica Geosystems, which allowed the work to take place at distances between 10 and 30 meters, making the procedure non-invasive, simplifying logistics, and removing the need to prepare or isolate the sites. In Ouro Preto, we digitized the churches of Francisco of Assis, Our Lady of Carmo, and Our Lady of Mercy; in Congonhas do Campo we scanned the entire Sanctuary of Bom Jesus de Matosinhos and its 12 prophets. Once scanned, the art works went through a long process of preparation, which required careful handling of meshes done by experts from the University of São Paulo in partnership with the company Imprimate.
Submitted 19 June, 2015;
originally announced June 2015.
-
Texture descriptor combining fractal dimension and artificial crawlers
Authors:
Wesley Nunes Gonçalves,
Bruno Brandoli Machado,
Odemir Martinez Bruno
Abstract:
Texture is an important visual attribute used to describe images. There are many methods available for texture analysis. However, they do not capture the richness of detail of the image surface. In this paper, we propose a new method to describe textures using the artificial crawler model. This model assumes that each agent can interact with the environment and with each other. Since this swarm system alone does not achieve good discrimination, we developed a new method that increases the discriminatory power of artificial crawlers by combining them with fractal dimension theory. Here, we estimated the fractal dimension by the Bouligand-Minkowski method due to its precision in quantifying structural properties of images. We validate our method on two texture datasets and the experimental results reveal that our method leads to highly discriminative textural features. The results indicate that our method can be used in different texture applications.
Submitted 20 November, 2013;
originally announced November 2013.
-
Multi-q Pattern Classification of Polarization Curves
Authors:
Ricardo Fabbri,
Ivan N. Bastos,
Francisco D. Moura Neto,
Francisco J. P. Lopes,
Wesley N. Goncalves,
Odemir M. Bruno
Abstract:
Several experimental measurements are expressed in the form of one-dimensional profiles, for which there is a scarcity of methodologies able to classify the pertinence of a given result to a specific group. The polarization curves that evaluate the corrosion kinetics of electrodes in corrosive media are an application where the behavior is chiefly analyzed from profiles. Polarization curves are indeed a classic method to determine the global kinetics of metallic electrodes, but the strong nonlinearity from different metals and alloys can overlap and the discrimination becomes a challenging problem. Moreover, even finding a typical curve from replicated tests requires subjective judgement. In this paper we used the so-called multi-q approach based on the Tsallis statistics in a classification engine to separate multiple polarization curve profiles of two stainless steels. We collected 48 experimental polarization curves in aqueous chloride medium of two stainless steel types, with different resistance against localized corrosion. Multi-q pattern analysis was then carried out on a wide potential range, from cathodic up to anodic regions. An excellent classification rate was obtained, at a success rate of 90%, 80%, and 83% for low (cathodic), high (anodic), and both potential ranges, respectively, using only 2% of the original profile data. These results show the potential of the proposed approach towards efficient, robust, systematic and automatic classification of highly non-linear profile curves.
Submitted 10 May, 2013;
originally announced May 2013.
-
Material quality assessment of silk nanofibers based on swarm intelligence
Authors:
Bruno Brandoli Machado,
Wesley Nunes Gonçalves,
Odemir Martinez Bruno
Abstract:
In this paper, we propose a novel approach for texture analysis based on the artificial crawler model. Our method assumes that each agent can interact with the environment and with each other. The evolution process converges to an equilibrium state according to the set of rules. For each textured image, the feature vector is composed of signatures of the live agents curve at each time. Experimental results revealed that combining the minimum and maximum signatures into one increases the classification rate. In addition, we pioneer the use of autonomous agents for characterizing silk fibroin scaffolds. The results strongly suggest that our approach can be successfully employed for texture analysis.
Submitted 13 March, 2013;
originally announced March 2013.
-
Image decomposition with anisotropic diffusion applied to leaf-texture analysis
Authors:
Bruno Brandoli Machado,
Wesley Nunes Gonçalves,
Odemir Martinez Bruno
Abstract:
Texture analysis is an important field of investigation that has received a great deal of interest from the computer vision community. In this paper, we propose a novel approach for texture modeling based on a partial differential equation (PDE). Each image $f$ is decomposed into a family of derived sub-images. $f$ is split into the $u$ component, obtained with anisotropic diffusion, and the $v$ component, which is calculated as the difference between the original image and the $u$ component. After enhancing the texture attribute $v$ of the image, Gabor features are computed as descriptors. We validate the proposed approach on two texture datasets with high variability. We also evaluate our approach on an important real-world application: leaf-texture analysis. Experimental results indicate that our approach can be used to produce higher classification rates and can be successfully employed for different texture applications.
Submitted 19 January, 2012;
originally announced January 2012.
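The $f = u + v$ split described in the abstract above can be illustrated with a minimal sketch. The abstract does not say which anisotropic diffusion scheme is used, so the Perona-Malik scheme with an exponential conduction function is assumed here; the function names and the parameters `iterations`, `kappa`, and `step` are hypothetical choices for illustration, not the authors' settings.

```python
import math

def perona_malik(image, iterations=10, kappa=20.0, step=0.2):
    """Smooth a 2-D grayscale image (list of rows) with Perona-Malik diffusion.

    Assumed scheme: 4-neighbour explicit update with conduction
    g(d) = exp(-(d / kappa)^2); step <= 0.25 keeps the update stable.
    """
    h, w = len(image), len(image[0])
    u = [[float(x) for x in row] for row in image]
    for _ in range(iterations):
        nxt = [row[:] for row in u]
        for i in range(h):
            for j in range(w):
                flux = 0.0
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        grad = u[ni][nj] - u[i][j]
                        # Strong edges (large |grad|) diffuse less.
                        flux += math.exp(-(grad / kappa) ** 2) * grad
                nxt[i][j] = u[i][j] + step * flux
        u = nxt
    return u

def decompose(image, **kwargs):
    """Return (u, v): u is the diffused image, v = f - u is the residual texture."""
    u = perona_malik(image, **kwargs)
    v = [[float(image[i][j]) - u[i][j] for j in range(len(image[0]))]
         for i in range(len(image))]
    return u, v
```

In this sketch the residual $v$ would then be passed to a texture descriptor; the abstract names Gabor features for that step, which are not implemented here.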
-
Spatiotemporal Gabor filters: a new method for dynamic texture recognition
Authors:
Wesley Nunes Gonçalves,
Bruno Brandoli Machado,
Odemir Martinez Bruno
Abstract:
This paper presents a new method for dynamic texture recognition based on spatiotemporal Gabor filters. Dynamic textures have emerged as a new field of investigation that extends the concept of self-similarity of texture images to the spatiotemporal domain. To model a dynamic texture, we convolve the sequence of images with a bank of spatiotemporal Gabor filters. For each response, a feature vector is built by calculating the energy statistic. As far as the authors know, this paper is the first to report an effective method for dynamic texture recognition using spatiotemporal Gabor filters. We evaluate the proposed method on two challenging databases, and the experimental results indicate that the proposed method is a robust approach for dynamic texture recognition.
Submitted 17 January, 2012;
originally announced January 2012.
-
Automatic system for counting cells with elliptical shape
Authors:
Wesley Nunes Gonçalves,
Odemir Martinez Bruno
Abstract:
This paper presents a new method for automatic quantification of ellipse-like cells in images, an important and challenging problem that has been studied by the computer vision community. The proposed method consists of two main steps. First, image segmentation based on the k-means algorithm is performed to separate different types of cells from the background. Then, a robust and efficient strategy is applied to the blob contour to split touching cells. Owing to the contour processing, the method achieves excellent detection results compared to manual detection performed by specialists.
Submitted 15 January, 2012;
originally announced January 2012.
-
Multi-q Analysis of Image Patterns
Authors:
Ricardo Fabbri,
Wesley N. Gonçalves,
Francisco J. P. Lopes,
Odemir M. Bruno
Abstract:
This paper studies the use of the Tsallis entropy versus the classic Boltzmann-Gibbs-Shannon entropy for classifying image patterns. Given a database of 40 pattern classes, the goal is to determine the class of a given image sample. Our experiments show that the Tsallis entropy, encoded in a feature vector for different $q$ indices, has a great advantage over the Boltzmann-Gibbs-Shannon entropy for pattern classification, boosting recognition rates by a factor of 3. We discuss the reasons behind this success, shedding light on the usefulness of the Tsallis entropy.
Submitted 29 December, 2011;
originally announced December 2011.
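As a rough illustration of the multi-$q$ idea in the abstract above, the sketch below computes the standard Tsallis entropy $S_q = (1 - \sum_i p_i^q)/(q - 1)$ of a gray-level histogram for several $q$ values and stacks the results into a feature vector. Using the gray-level histogram as the probability distribution, and the function names themselves, are assumptions made for illustration; the abstract does not specify how the authors build the distribution or choose the $q$ indices.

```python
import math

def tsallis_entropy(probs, q):
    """Tsallis entropy S_q = (1 - sum p_i^q) / (q - 1).

    As q approaches 1 this reduces to the Boltzmann-Gibbs-Shannon
    entropy -sum p_i ln p_i, which is returned directly for q == 1.
    """
    probs = [p for p in probs if p > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)

def multi_q_features(pixels, q_values, levels=256):
    """Build one entropy value per q from a flat list of integer gray levels."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    probs = [count / total for count in hist]
    return [tsallis_entropy(probs, q) for q in q_values]
```

A call such as `multi_q_features(pixels, [0.5, 1.0, 2.0])` would yield a three-element vector, with the middle entry being the classic Shannon entropy, so the single-entropy baseline is a special case of the multi-$q$ descriptor.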
-
Complex network classification using partially self-avoiding deterministic walks
Authors:
Wesley Nunes Gonçalves,
Alexandre Souto Martinez,
Odemir Martinez Bruno
Abstract:
Complex networks have attracted increasing interest from various fields of science. It has been demonstrated that each complex network model presents specific topological structures which characterize its connectivity and dynamics. Complex network classification relies on the use of representative measurements that model topological structures. Although there are a large number of measurements, most of them are correlated. To overcome this limitation, this paper presents a new measurement for complex network classification based on partially self-avoiding walks. We validate the measurement on a data set composed of 40,000 complex networks of four well-known models. Our results indicate that the proposed measurement improves the correct classification of networks compared to traditional measurements.
Submitted 16 February, 2012; v1 submitted 23 December, 2011;
originally announced December 2011.