-
ProtoAL: Interpretable Deep Active Learning with prototypes for medical imaging
Authors:
Iury B. de A. Santos,
André C. P. L. F. de Carvalho
Abstract:
The adoption of Deep Learning algorithms in the medical imaging field is a prominent area of research, with high potential for advancing AI-based Computer-aided diagnosis (AI-CAD) solutions. However, current solutions face challenges due to a lack of interpretability features and high data demands, prompting recent efforts to address these issues. In this study, we propose the ProtoAL method, where we integrate an interpretable DL model into the Deep Active Learning (DAL) framework. This approach aims to address both challenges by focusing on the medical imaging context and utilizing an inherently interpretable model based on prototypes. We evaluated ProtoAL on the Messidor dataset, achieving an area under the precision-recall curve of 0.79 while utilizing only 76.54% of the available labeled data. These capabilities can enhance the practical usability of a DL model in the medical field, providing a means of trust calibration for domain experts and a suitable solution for learning in the data scarcity context often found in this field.
Submitted 6 April, 2024;
originally announced April 2024.
-
Efficient Parameter Mining and Freezing for Continual Object Detection
Authors:
Angelo G. Menezes,
Augusto J. Peterlevitz,
Mateus A. Chinelatto,
André C. P. L. F. de Carvalho
Abstract:
Continual Object Detection is essential for enabling intelligent agents to interact proactively with humans in real-world settings. While parameter-isolation strategies have been extensively explored in the context of continual learning for classification, they have yet to be fully harnessed for incremental object detection scenarios. Drawing inspiration from prior research that focused on mining individual neuron responses and integrating insights from recent developments in neural pruning, we proposed efficient ways to identify which layers are the most important for a network to maintain the performance of a detector across sequential updates. The presented findings highlight the substantial advantages of layer-level parameter isolation in facilitating incremental learning within object detection models, offering promising avenues for future research and application in real-world scenarios.
Submitted 19 February, 2024;
originally announced February 2024.
-
Model interpretation using improved local regression with variable importance
Authors:
Gilson Y. Shimizu,
Rafael Izbicki,
Andre C. P. L. F. de Carvalho
Abstract:
A fundamental question on the use of ML models concerns the explanation of their predictions for increasing transparency in decision-making. Although several interpretability methods have emerged, some gaps regarding the reliability of their explanations have been identified. For instance, most methods are unstable (meaning that they give very different explanations with small changes in the data), and do not cope well with irrelevant features (that is, features not related to the label). This article introduces two new interpretability methods, namely VarImp and SupClus, that overcome these issues by using local regression fits with a weighted distance that takes into account variable importance. Whereas VarImp generates explanations for each instance and can be applied to datasets with more complex relationships, SupClus interprets clusters of instances with similar explanations and can be applied to simpler datasets where clusters can be found. We compare our methods with state-of-the-art approaches and show that they yield better explanations according to several metrics, particularly in high-dimensional problems with irrelevant features, as well as when the relationship between features and target is non-linear.
Submitted 12 September, 2022;
originally announced September 2022.
-
Continual Object Detection: A review of definitions, strategies, and challenges
Authors:
Angelo G. Menezes,
Gustavo de Moura,
Cézanne Alves,
André C. P. L. F. de Carvalho
Abstract:
The field of Continual Learning investigates the ability to learn consecutive tasks without losing performance on those previously learned. Its focus has been mainly on incremental classification tasks. We believe that research in continual object detection deserves even more attention due to its vast range of applications in robotics and autonomous vehicles. This scenario is more complex than conventional classification given the occurrence of instances of classes that are unknown at the time, but can appear in subsequent tasks as a new class to be learned, resulting in missing annotations and conflicts with the background label. In this review, we analyze the current strategies proposed to tackle the problem of class-incremental object detection. Our main contributions are: (1) a short and systematic review of the methods that propose solutions to traditional incremental object detection scenarios; (2) a comprehensive evaluation of the existing approaches using a new metric to quantify the stability and plasticity of each technique in a standard way; (3) an overview of the current trends within continual object detection and a discussion of possible future research directions.
Submitted 30 May, 2022;
originally announced May 2022.
-
Forecasting Financial Market Structure from Network Features using Machine Learning
Authors:
Douglas Castilho,
Tharsis T. P. Souza,
Soong Moon Kang,
João Gama,
André C. P. L. F. de Carvalho
Abstract:
We propose a model that forecasts market correlation structure from link- and node-based financial network features using machine learning. To this end, market structure is modeled as a dynamic asset network by quantifying time-dependent co-movement of asset price returns across company constituents of major global market indices. We provide empirical evidence using three different network filtering methods to estimate market structure, namely Dynamic Asset Graph (DAG), Dynamic Minimal Spanning Tree (DMST) and Dynamic Threshold Networks (DTN). Experimental results show that the proposed model can forecast market structure with high predictive performance, with up to 40% improvement over a time-invariant correlation-based benchmark. Non-pair-wise correlation features proved to be important compared to traditionally used pair-wise correlation measures for all markets studied, particularly in the long-term forecasting of stock market structure. Evidence is provided for stock constituents of the DAX30, EUROSTOXX50, FTSE100, HANGSENG50, NASDAQ100 and NIFTY50 market indices. Findings can be useful to improve portfolio selection and risk management methods, which commonly rely on a backward-looking covariance matrix to estimate portfolio risk.
Submitted 22 October, 2021;
originally announced October 2021.
-
An Extensive Experimental Evaluation of Automated Machine Learning Methods for Recommending Classification Algorithms (Extended Version)
Authors:
Márcio P. Basgalupp,
Rodrigo C. Barros,
Alex G. C. de Sá,
Gisele L. Pappa,
Rafael G. Mantovani,
André C. P. L. F. de Carvalho,
Alex A. Freitas
Abstract:
This paper presents an experimental comparison among four Automated Machine Learning (AutoML) methods for recommending the best classification algorithm for a given input dataset. Three of these methods are based on Evolutionary Algorithms (EAs), and the other is Auto-WEKA, a well-known AutoML method based on the Combined Algorithm Selection and Hyper-parameter optimisation (CASH) approach. The EA-based methods build classification algorithms from a single machine learning paradigm: either decision-tree induction, rule induction, or Bayesian network classification. Auto-WEKA combines algorithm selection and hyper-parameter optimisation to recommend classification algorithms from multiple paradigms. We performed controlled experiments where these four AutoML methods were given the same runtime limit for different values of this limit. In general, the difference in predictive accuracy of the three best AutoML methods was not statistically significant. However, the EA evolving decision-tree induction algorithms has the advantage of producing algorithms that generate interpretable classification models and that are more scalable to large datasets, by comparison with many algorithms from other learning paradigms that can be recommended by Auto-WEKA. We also observed that Auto-WEKA has shown meta-overfitting, a form of overfitting at the meta-learning level, rather than at the base-learning level.
Submitted 15 September, 2020;
originally announced September 2020.
-
MeLIME: Meaningful Local Explanation for Machine Learning Models
Authors:
Tiago Botari,
Frederik Hvilshøj,
Rafael Izbicki,
Andre C. P. L. F. de Carvalho
Abstract:
Most state-of-the-art machine learning algorithms induce black-box models, preventing their application in many sensitive domains. Hence, many methodologies for explaining machine learning models have been proposed to address this problem. In this work, we introduce strategies to improve local explanations taking into account the distribution of the data used to train the black-box models. We show that our approach, MeLIME, produces more meaningful explanations compared to other techniques over different ML models, operating on various types of data. MeLIME generalizes the LIME method, allowing more flexible perturbation sampling and the use of different local interpretable models. Additionally, we introduce modifications to standard training algorithms of local interpretable models, fostering more robust explanations and even allowing the production of counterfactual examples. To show the strengths of the proposed approach, we include experiments on tabular data, images, and text, all showing improved explanations. In particular, MeLIME generated more meaningful explanations on the MNIST dataset than methods such as GuidedBackprop, SmoothGrad, and Layer-wise Relevance Propagation. MeLIME is available at https://github.com/tiagobotari/melime.
Submitted 12 September, 2020;
originally announced September 2020.
-
Reconstructing commuters network using machine learning and urban indicators
Authors:
Gabriel Spadon,
Andre C. P. L. F. de Carvalho,
Jose F. Rodrigues-Jr,
Luiz G. A. Alves
Abstract:
Human mobility has a significant impact on several layers of society, from infrastructural planning and economics to the spread of diseases and crime. Representing the system as a complex network, in which nodes are assigned to regions (e.g., a city) and links indicate the flow of people between two of them, physics-inspired models have been proposed to quantify the number of people migrating from one city to the other. Despite the advances made by these models, our ability to predict the number of commuters and reconstruct mobility networks remains limited. Here, we propose an alternative approach using machine learning and 22 urban indicators to predict the flow of people and reconstruct the intercity commuters network. Our results reveal that predictions based on machine learning algorithms and urban indicators can reconstruct the commuters network with 90.4% accuracy and describe 77.6% of the variance observed in the flow of people between cities. We also identify essential features to recover the network structure and the urban indicators most related to commuting patterns. As previously reported, distance plays a significant role in commuting, but other indicators, such as Gross Domestic Product (GDP) and unemployment rate, are also driving forces for people to commute. We believe that our results shed new light on the modeling of migration and reinforce the role of urban indicators in commuting patterns. Also, because link prediction and network reconstruction are still open challenges in network science, our results have implications in other areas, like economics, social sciences, and biology, where node attributes can give us information about the existence of links connecting entities in the network.
Submitted 9 August, 2019;
originally announced August 2019.
-
Local Interpretation Methods to Machine Learning Using the Domain of the Feature Space
Authors:
Tiago Botari,
Rafael Izbicki,
Andre C. P. L. F. de Carvalho
Abstract:
As machine learning becomes an important part of many real-world applications affecting human lives, new requirements, besides high predictive accuracy, become important. One important requirement is transparency, which has been associated with model interpretability. Many machine learning algorithms induce models that are difficult to interpret, known as black boxes. Moreover, people have difficulty trusting models that cannot be explained. In machine learning in particular, many groups are investigating new methods able to explain black box models. These methods usually look inside the black box models to explain their inner workings. By doing so, they allow the interpretation of the decision-making process used by black box models. Among the recently proposed model interpretation methods, there is a group, named local estimators, which are designed to explain how the label of a particular instance is predicted. To this end, they induce interpretable models on the neighborhood of the instance to be explained. Local estimators have been successfully used to explain specific predictions. Although they provide some degree of model interpretability, it is still not clear what the best way to implement and apply them is. Open questions include: how to best define the neighborhood of an instance? How to control the trade-off between the accuracy of the interpretation method and its interpretability? How to make the obtained solution robust to small variations in the instance to be explained? To answer these questions, we propose and investigate two strategies: (i) using data instance properties to provide improved explanations, and (ii) making sure that the neighborhood of an instance is properly defined by taking the geometry of the domain of the feature space into account. We evaluate these strategies in a regression task and present experimental results showing that they can improve local explanations.
Submitted 31 July, 2019;
originally announced July 2019.
-
A Preliminary Study on Hyperparameter Configuration for Human Activity Recognition
Authors:
Kemilly Dearo Garcia,
Tiago Carvalho,
João Mendes-Moreira,
João M. P. Cardoso,
André C. P. L. F. de Carvalho
Abstract:
Human activity recognition (HAR) is a classification task that aims to classify human activities or predict human behavior by means of features extracted from sensor data. Typical HAR systems use wearable sensors and/or handheld and mobile devices with built-in sensing capabilities. Due to the widespread use of smartphones and to the inclusion of various sensors in all contemporary smartphones (e.g., accelerometers and gyroscopes), they are commonly used for extracting and collecting data from sensors and even for implementing HAR systems. When using mobile devices, e.g., smartphones, HAR systems need to deal with several constraints regarding battery, computation and memory. These constraints call for a system capable of managing its resources while maintaining acceptable levels of classification accuracy. Moreover, several factors can influence activity recognition, such as classification models, sensor availability and the size of the data window for feature extraction, making stable accuracy difficult to achieve. In this paper, we present a semi-supervised classifier and a study regarding the influence of hyperparameter configuration on classification accuracy, depending on the user and the activities performed by each user. This study focuses on sensing data provided by the PAMAP2 dataset. Experimental results show that it is possible to maintain classification accuracy by adjusting hyperparameters, like window size and window overlap factor, depending on the user and the activity performed. These experiments motivate the development of a system able to automatically adapt hyperparameter settings for the activity performed by each user.
Submitted 25 October, 2018;
originally announced October 2018.
-
cf2vec: Collaborative Filtering algorithm selection using graph distributed representations
Authors:
Tiago Cunha,
Carlos Soares,
André C. P. L. F. de Carvalho
Abstract:
Algorithm selection using Metalearning aims to find mappings between problem characteristics (i.e. metafeatures) and relative algorithm performance to predict the best algorithm(s) for new datasets. Therefore, it is of the utmost importance that the metafeatures used are informative. In Collaborative Filtering, recent research has created an extensive collection of such metafeatures. However, since these are created based on the practitioner's understanding of the problem, they may not capture the most relevant aspects necessary to properly characterize the problem. We propose to overcome this problem by taking advantage of Representation Learning, which is able to create alternative problem characterizations by having the data guide the design of the representation instead of the practitioner's opinion. Our hypothesis states that such alternative representations can be used to replace standard metafeatures, hence leading to a more robust approach to Metalearning. We propose a novel procedure specially designed for Collaborative Filtering algorithm selection. The procedure models Collaborative Filtering as graphs and extracts distributed representations using graph2vec. Experimental results show that the proposed procedure creates representations that are competitive with state-of-the-art metafeatures, while requiring significantly less data and virtually no human input.
Submitted 17 September, 2018;
originally announced September 2018.
-
Characterizing classification datasets: a study of meta-features for meta-learning
Authors:
Adriano Rivolli,
Luís P. F. Garcia,
Carlos Soares,
Joaquin Vanschoren,
André C. P. L. F. de Carvalho
Abstract:
Meta-learning is increasingly used to support the recommendation of machine learning algorithms and their configurations. Such recommendations are made based on meta-data, consisting of performance evaluations of algorithms on prior datasets, as well as characterizations of these datasets. These characterizations, also called meta-features, describe properties of the data which are predictive for the performance of machine learning algorithms trained on them. Unfortunately, despite being used in a large number of studies, meta-features are not uniformly described, organized and computed, making many empirical studies irreproducible and hard to compare. This paper aims to deal with this by systematizing and standardizing data characterization measures for classification datasets used in meta-learning. Moreover, it presents MFE, a new tool for extracting meta-features from datasets and identifying more subtle reproducibility issues in the literature, proposing guidelines for data characterization that strengthen reproducible empirical research in meta-learning.
Submitted 26 August, 2019; v1 submitted 30 August, 2018;
originally announced August 2018.
-
Algorithm Selection for Collaborative Filtering: the influence of graph metafeatures and multicriteria metatargets
Authors:
Tiago Cunha,
Carlos Soares,
André C. P. L. F. de Carvalho
Abstract:
To select the best algorithm for a new problem is an expensive and difficult task. However, there are automatic solutions to address this problem: using Metalearning, which takes advantage of problem characteristics (i.e. metafeatures), one is able to predict the relative performance of algorithms. In the Collaborative Filtering scope, recent works have proposed diverse metafeatures describing several dimensions of this problem. Despite interesting and effective findings, it is still unknown whether these are the most effective metafeatures. Hence, this work proposes a new set of graph metafeatures, which approach the Collaborative Filtering problem from a Graph Theory perspective. Furthermore, in order to understand whether metafeatures from multiple dimensions are a better fit, we investigate the effects of comprehensive metafeatures. These metafeatures are a selection of the best metafeatures from all existing Collaborative Filtering metafeatures. The impact of the most representative metafeatures is investigated in a controlled experimental setup. Another contribution we present is the use of a Pareto-Efficient ranking procedure to create multicriteria metatargets. These new rankings of algorithms, which take into account multiple evaluation measures, allow the algorithm selection problem to be explored in a fairer and more detailed way. According to the experimental results, the graph metafeatures are a good alternative to related work metafeatures. However, the results have shown that the feature selection procedure used to create the comprehensive metafeatures is not effective, since there is no gain in predictive performance. Finally, an extensive metaknowledge analysis was conducted to identify the most influential metafeatures.
Submitted 23 July, 2018;
originally announced July 2018.
-
CF4CF: Recommending Collaborative Filtering algorithms using Collaborative Filtering
Authors:
Tiago Cunha,
Carlos Soares,
André C. P. L. F. de Carvalho
Abstract:
Automatic solutions which enable the selection of the best algorithms for a new problem are commonly found in the literature. One research area which has recently received considerable efforts is Collaborative Filtering. Existing work includes several approaches using Metalearning, which relate the characteristics of datasets with the performance of the algorithms. This work explores an alternative approach to tackle this problem. Since, in essence, both are recommendation problems, this work uses Collaborative Filtering algorithms to select Collaborative Filtering algorithms. Our approach integrates subsampling landmarkers, which are a data characterization approach commonly used in Metalearning, with a standard Collaborative Filtering method. The experimental results show that CF4CF competes with standard Metalearning strategies in the problem of Collaborative Filtering algorithm selection.
Submitted 6 March, 2018;
originally announced March 2018.