Search | arXiv e-print repository

arXiv:1911.07845 [pdf, other]

Neural Random Subspace

Authors: Yun-Hao Cao, Jianxin Wu, Hanchen Wang, Joan Lasenby

Abstract: The random subspace method, known as the pillar of random forests, is good at making precise and robust predictions. However, there is not a straightforward way yet to combine it with deep learning. In this paper, we therefore propose Neural Random Subspace (NRS), a novel deep learning based random subspace method. In contrast to previous forest methods, NRS enjoys the benefits of end-to-end, data… ▽ More The random subspace method, known as the pillar of random forests, is good at making precise and robust predictions. However, there is not a straightforward way yet to combine it with deep learning. In this paper, we therefore propose Neural Random Subspace (NRS), a novel deep learning based random subspace method. In contrast to previous forest methods, NRS enjoys the benefits of end-to-end, data-driven representation learning, as well as pervasive support from deep learning software and hardware platforms, hence achieving faster inference speed and higher accuracy. Furthermore, as a non-linear component to be encoded into Convolutional Neural Networks (CNNs), NRS learns non-linear feature representations in CNNs more efficiently than previous higher-order pooling methods, producing good results with negligible increase in parameters, floating point operations (FLOPs) and real running time. Compared with random subspaces, random forests and gradient boosting decision trees (GBDTs), NRS achieves superior performance on 35 machine learning datasets. Moreover, on both 2D image and 3D point cloud recognition tasks, integration of NRS with CNN architectures achieves consistent improvements with minor extra cost. Code is available at https://github.com/CupidJay/NRS_pytorch. △ Less

Submitted 14 September, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

Comments: 38 pages

arXiv:1804.04849 [pdf, other]

The unreasonable effectiveness of the forget gate

Authors: Jos van der Westhuizen, Joan Lasenby

Abstract: Given the success of the gated recurrent unit, a natural question is whether all the gates of the long short-term memory (LSTM) network are necessary. Previous research has shown that the forget gate is one of the most important gates in the LSTM. Here we show that a forget-gate-only version of the LSTM with chrono-initialized biases, not only provides computational savings but outperforms the sta… ▽ More Given the success of the gated recurrent unit, a natural question is whether all the gates of the long short-term memory (LSTM) network are necessary. Previous research has shown that the forget gate is one of the most important gates in the LSTM. Here we show that a forget-gate-only version of the LSTM with chrono-initialized biases, not only provides computational savings but outperforms the standard LSTM on multiple benchmark datasets and competes with some of the best contemporary models. Our proposed network, the JANET, achieves accuracies of 99% and 92.5% on the MNIST and pMNIST datasets, outperforming the standard LSTM which yields accuracies of 98.5% and 91%. △ Less

Submitted 13 September, 2018; v1 submitted 13 April, 2018; originally announced April 2018.

Comments: Corrected LSTM gradient derivations. Added link to code

arXiv:1706.01242 [pdf, other]

Bayesian LSTMs in medicine

Authors: Jos van der Westhuizen, Joan Lasenby

Abstract: The medical field stands to see significant benefits from the recent advances in deep learning. Knowing the uncertainty in the decision made by any machine learning algorithm is of utmost importance for medical practitioners. This study demonstrates the utility of using Bayesian LSTMs for classification of medical time series. Four medical time series datasets are used to show the accuracy improve… ▽ More The medical field stands to see significant benefits from the recent advances in deep learning. Knowing the uncertainty in the decision made by any machine learning algorithm is of utmost importance for medical practitioners. This study demonstrates the utility of using Bayesian LSTMs for classification of medical time series. Four medical time series datasets are used to show the accuracy improvement Bayesian LSTMs provide over standard LSTMs. Moreover, we show cherry-picked examples of confident and uncertain classifications of the medical time series. With simple modifications of the common practice for deep learning, significant improvements can be made for the medical practitioner and patient. △ Less

Submitted 5 June, 2017; originally announced June 2017.

Comments: 11 pages, 8 figures

arXiv:1705.08153 [pdf, other]

Techniques for visualizing LSTMs applied to electrocardiograms

Authors: Jos van der Westhuizen, Joan Lasenby

Abstract: This paper explores four different visualization techniques for long short-term memory (LSTM) networks applied to continuous-valued time series. On the datasets analysed, we find that the best visualization technique is to learn an input deletion mask that optimally reduces the true class score. With a specific focus on single-lead electrocardiograms from the MIT-BIH arrhythmia dataset, we show th… ▽ More This paper explores four different visualization techniques for long short-term memory (LSTM) networks applied to continuous-valued time series. On the datasets analysed, we find that the best visualization technique is to learn an input deletion mask that optimally reduces the true class score. With a specific focus on single-lead electrocardiograms from the MIT-BIH arrhythmia dataset, we show that salient input features for the LSTM classifier align well with medical theory. △ Less

Submitted 15 June, 2018; v1 submitted 23 May, 2017; originally announced May 2017.

Comments: presented at 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018), Stockholm, Sweden

Showing 1–4 of 4 results for author: Lasenby, J