-
Neural Random Subspace
Authors:
Yun-Hao Cao,
Jianxin Wu,
Hanchen Wang,
Joan Lasenby
Abstract:
The random subspace method, known as the pillar of random forests, is good at making precise and robust predictions. However, there is not a straightforward way yet to combine it with deep learning. In this paper, we therefore propose Neural Random Subspace (NRS), a novel deep learning based random subspace method. In contrast to previous forest methods, NRS enjoys the benefits of end-to-end, data-driven representation learning, as well as pervasive support from deep learning software and hardware platforms, hence achieving faster inference speed and higher accuracy. Furthermore, as a non-linear component to be encoded into Convolutional Neural Networks (CNNs), NRS learns non-linear feature representations in CNNs more efficiently than previous higher-order pooling methods, producing good results with negligible increase in parameters, floating point operations (FLOPs) and real running time. Compared with random subspaces, random forests and gradient boosting decision trees (GBDTs), NRS achieves superior performance on 35 machine learning datasets. Moreover, on both 2D image and 3D point cloud recognition tasks, integration of NRS with CNN architectures achieves consistent improvements with minor extra cost. Code is available at https://github.com/CupidJay/NRS_pytorch.
Submitted 14 September, 2020; v1 submitted 17 November, 2019;
originally announced November 2019.
-
The unreasonable effectiveness of the forget gate
Authors:
Jos van der Westhuizen,
Joan Lasenby
Abstract:
Given the success of the gated recurrent unit, a natural question is whether all the gates of the long short-term memory (LSTM) network are necessary. Previous research has shown that the forget gate is one of the most important gates in the LSTM. Here we show that a forget-gate-only version of the LSTM with chrono-initialized biases not only provides computational savings but also outperforms the standard LSTM on multiple benchmark datasets and competes with some of the best contemporary models. Our proposed network, the JANET, achieves accuracies of 99% and 92.5% on the MNIST and pMNIST datasets, outperforming the standard LSTM, which yields accuracies of 98.5% and 91%.
Submitted 13 September, 2018; v1 submitted 13 April, 2018;
originally announced April 2018.
-
Bayesian LSTMs in medicine
Authors:
Jos van der Westhuizen,
Joan Lasenby
Abstract:
The medical field stands to see significant benefits from the recent advances in deep learning. Knowing the uncertainty in the decision made by any machine learning algorithm is of utmost importance for medical practitioners. This study demonstrates the utility of using Bayesian LSTMs for classification of medical time series. Four medical time series datasets are used to show the accuracy improvement Bayesian LSTMs provide over standard LSTMs. Moreover, we show cherry-picked examples of confident and uncertain classifications of the medical time series. With simple modifications of the common practice for deep learning, significant improvements can be made for the medical practitioner and patient.
Submitted 5 June, 2017;
originally announced June 2017.
-
Techniques for visualizing LSTMs applied to electrocardiograms
Authors:
Jos van der Westhuizen,
Joan Lasenby
Abstract:
This paper explores four different visualization techniques for long short-term memory (LSTM) networks applied to continuous-valued time series. On the datasets analysed, we find that the best visualization technique is to learn an input deletion mask that optimally reduces the true class score. With a specific focus on single-lead electrocardiograms from the MIT-BIH arrhythmia dataset, we show that salient input features for the LSTM classifier align well with medical theory.
Submitted 15 June, 2018; v1 submitted 23 May, 2017;
originally announced May 2017.