Search | arXiv e-print repository

The CAST package for training and assessment of spatial prediction models in R

Authors: Hanna Meyer, Marvin Ludwig, Carles Milà, Jan Linnenbrink, Fabian Schumacher

Abstract: One key task in environmental science is to map environmental variables continuously in space or even in space and time. Machine learning algorithms are frequently used to learn from local field observations to make spatial predictions by estimating the value of the variable of interest in places where it has not been measured. However, the application of machine learning strategies for spatial mapping involves additional challenges compared to "non-spatial" prediction tasks that often originate from spatial autocorrelation and from training data that are not independent and identically distributed. In the past few years, we developed a number of methods to support the application of machine learning for spatial data which involves the development of suitable cross-validation strategies for performance assessment and model selection, spatial feature selection, and methods to assess the area of applicability of the trained models. The intention of the CAST package is to support the application of machine learning strategies for predictive mapping by implementing such methods and making them available for easy integration into modelling workflows. Here we introduce the CAST package and its core functionalities. Using the case study of mapping plant species richness, we go through the different steps of the modelling workflow and show how CAST can be used to support more reliable spatial predictions.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 16 pages, 9 figures

arXiv:2104.07386 [pdf, other]

Reference and Probability-Matching Priors for the Parameters of a Univariate Student $t$-Distribution

Authors: A. J. van der Merwe, M. J. von Maltitz, J. H. Meyer

Abstract: In this paper reference and probability-matching priors are derived for the univariate Student $t$-distribution. These priors generally lead to procedures with properties frequentists can relate to while still retaining Bayes validity. The priors are tested by performing simulation studies. The focus is on the relative mean squared error from the posterior median ($MSE(ν)/ν$) and on the frequentist coverage of the 95% credibility intervals for a sample size of $n=30$. Average interval lengths of the credibility intervals as well as the modes of the interval lengths based on 2000 simulations are also considered. The performance of the priors is also tested on real data, namely daily logarithmic returns of IBM stocks.

Submitted 15 April, 2021; originally announced April 2021.

arXiv:2006.01659 [pdf, other]

Surprisal-Triggered Conditional Computation with Neural Networks

Authors: Loren Lugosch, Derek Nowrouzezahrai, Brett H. Meyer

Abstract: Autoregressive neural network models have been used successfully for sequence generation, feature extraction, and hypothesis scoring. This paper presents yet another use for these models: allocating more computation to more difficult inputs. In our model, an autoregressive model is used both to extract features and to predict observations in a stream of input observations. The surprisal of the input, measured as the negative log-likelihood of the current observation according to the autoregressive model, is used as a measure of input difficulty. This in turn determines whether a small, fast network, or a big, slow network, is used. Experiments on two speech recognition tasks show that our model can match the performance of a baseline in which the big network is always used with 15% fewer FLOPs.

Submitted 2 June, 2020; originally announced June 2020.
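
The routing rule described in the abstract above (surprisal decides between a small and a big network) can be sketched roughly as follows. This is only an illustration under stated assumptions: the Gaussian "previous value" predictor, the threshold of 2.0 nats, the choice to send the first observation to the big network, and the functions `small_net` and `big_net` are all hypothetical stand-ins, not the authors' model.

```python
import math


def gaussian_surprisal(x, mean, std=1.0):
    """Negative log-likelihood (in nats) of x under a Normal(mean, std^2)."""
    return 0.5 * math.log(2 * math.pi * std ** 2) + (x - mean) ** 2 / (2 * std ** 2)


def small_net(x):
    # Hypothetical cheap network: here it only tags the input.
    return ("small", x)


def big_net(x):
    # Hypothetical expensive network: here it only tags the input.
    return ("big", x)


def route_stream(stream, threshold):
    """Send each observation to the small or big network based on surprisal.

    Assumption: the autoregressive model predicts the next observation as a
    unit-variance Gaussian centred on the previous observation.
    """
    # No prediction exists for the first observation, so use the big network.
    outputs = [big_net(stream[0])]
    prev = stream[0]
    for x in stream[1:]:
        surprisal = gaussian_surprisal(x, mean=prev)
        net = big_net if surprisal > threshold else small_net
        outputs.append(net(x))
        prev = x
    return outputs


stream = [0.0, 0.1, 0.2, 5.0, 5.1]
routes = [name for name, _ in route_stream(stream, threshold=2.0)]
print(routes)
```

In this toy stream only the jump from 0.2 to 5.0 is surprising enough to trigger the big network; the savings reported in the abstract depend on how often real inputs cross the threshold.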

arXiv:2005.07939 [pdf, other]

doi 10.1111/2041-210X.13650

Predicting into unknown space? Estimating the area of applicability of spatial prediction models

Authors: Hanna Meyer, Edzer Pebesma

Abstract: Predictive modelling using machine learning has become very popular for spatial mapping of the environment. Models are often applied to make predictions far beyond sampling locations where new geographic locations might considerably differ from the training data in their environmental properties. However, areas in the predictor space without support of training data are problematic. Since the model has no knowledge about these environments, predictions have to be considered uncertain. Estimating the area to which a prediction model can be reliably applied is required. Here, we suggest a methodology that delineates the "area of applicability" (AOA) that we define as the area for which the cross-validation error of the model applies. We first propose a "dissimilarity index" (DI) that is based on the minimum distance to the training data in the predictor space, with predictors being weighted by their respective importance in the model. The AOA is then derived by applying a threshold based on the DI of the training data where the DI is calculated with respect to the cross-validation strategy used for model training. We test for the ideal threshold by using simulated data and compare the prediction error within the AOA with the cross-validation error of the model. We illustrate the approach using a simulated case study. Our simulation study suggests a threshold on DI to define the AOA at the .95 quantile of the DI in the training data. Using this threshold, the prediction error within the AOA is comparable to the cross-validation RMSE of the model, while the cross-validation error does not apply outside the AOA. This applies to models being trained with randomly distributed training data, as well as when training data are clustered in space and where spatial cross-validation is applied. We suggest reporting the AOA alongside predictions, complementary to validation measures.

Submitted 16 May, 2020; originally announced May 2020.

Comments: 16 pages, 10 figures, to be submitted to Methods in Ecology and Evolution
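
The dissimilarity index and the .95-quantile threshold described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the use of Euclidean distance, standardisation with training statistics, and normalisation by the mean pairwise training distance are assumptions made here, and the function name `area_of_applicability`, the synthetic data, the weights, and the fold layout are hypothetical.

```python
import numpy as np


def _pairwise(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def area_of_applicability(train, new, weights, folds, quantile=0.95):
    """Return (DI of new points, DI threshold, boolean inside-AOA mask)."""
    train = np.asarray(train, dtype=float)
    new = np.asarray(new, dtype=float)
    weights = np.asarray(weights, dtype=float)
    folds = np.asarray(folds)

    # Assumption: standardise with training statistics, then weight each
    # predictor by its importance in the model.
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0
    train_w = (train - mean) / std * weights
    new_w = (new - mean) / std * weights

    # Assumption: divide by the mean pairwise training distance so that
    # the index is unitless.
    d_train = _pairwise(train_w, train_w)
    mean_dist = d_train[np.triu_indices(len(train_w), k=1)].mean()

    # Training DI respects the cross-validation strategy: the nearest
    # neighbour must come from a different fold.
    other_fold = folds[:, None] != folds[None, :]
    di_train = np.where(other_fold, d_train, np.inf).min(axis=1) / mean_dist
    threshold = np.quantile(di_train, quantile)

    # DI of new points: minimum distance to any training point.
    di_new = _pairwise(new_w, train_w).min(axis=1) / mean_dist
    return di_new, threshold, di_new <= threshold


rng = np.random.default_rng(0)
train = rng.normal(size=(50, 3))          # synthetic predictors
folds = np.arange(50) % 5                 # hypothetical 5-fold assignment
weights = np.array([0.6, 0.3, 0.1])       # hypothetical variable importances
new = np.vstack([train[0], np.full(3, 10.0)])

di_new, threshold, inside = area_of_applicability(train, new, weights, folds)
print(di_new, threshold, inside)
```

The first new point copies a training sample and so has a DI of zero, while the second lies far from all training data in predictor space and falls outside the AOA.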

arXiv:1908.07805 [pdf, other]

doi 10.1016/j.ecolmodel.2019.108815

Importance of spatial predictor variable selection in machine learning applications -- Moving from data reproduction to spatial prediction

Authors: Hanna Meyer, Christoph Reudenbach, Stephan Wöllauer, Thomas Nauss

Abstract: Machine learning algorithms find frequent application in spatial prediction of biotic and abiotic environmental variables. However, the characteristics of spatial data, especially spatial autocorrelation, are widely ignored. We hypothesize that this is problematic and results in models that can reproduce training data but are unable to make spatial predictions beyond the locations of the training samples. We assume that not only spatial validation strategies but also spatial variable selection is essential for reliable spatial predictions. We introduce two case studies that use remote sensing to predict land cover and the leaf area index for the "Marburg Open Forest", an open research and education site of Marburg University, Germany. We use the machine learning algorithm Random Forests to train models using non-spatial and spatial cross-validation strategies to understand how spatial variable selection affects the predictions. Our findings confirm that spatial cross-validation is essential in preventing overoptimistic model performance. We further show that highly autocorrelated predictors (such as geolocation variables, e.g. latitude, longitude) can lead to considerable overfitting and result in models that can reproduce the training data but fail in making spatial predictions. The problem becomes apparent in the visual assessment of the spatial predictions that show clear artefacts that can be traced back to a misinterpretation of the spatially autocorrelated predictors by the algorithm. Spatial variable selection could automatically detect and remove such variables that lead to overfitting, resulting in reliable spatial prediction patterns and improved statistical spatial model performance. We conclude that in addition to spatial validation, a spatial variable selection must be considered in spatial predictions of ecological data to produce reliable predictions.

Submitted 21 August, 2019; originally announced August 2019.

Comments: under review in Ecological Modelling

Journal ref: Ecological Modelling, 411, 2019, 108815

arXiv:1809.11086 [pdf, other]

Learning Recurrent Binary/Ternary Weights

Authors: Arash Ardakani, Zhengyun Ji, Sean C. Smithson, Brett H. Meyer, Warren J. Gross

Abstract: Recurrent neural networks (RNNs) have shown excellent performance in processing sequence data. However, they are both complex and memory intensive due to their recursive nature. These limitations make RNNs difficult to embed on mobile devices requiring real-time processes with limited hardware resources. To address the above issues, we introduce a method that can learn binary and ternary weights during the training phase to facilitate hardware implementations of RNNs. As a result, using this approach replaces all multiply-accumulate operations by simple accumulations, bringing significant benefits to custom hardware in terms of silicon area and power consumption. On the software side, we evaluate the performance (in terms of accuracy) of our method using long short-term memories (LSTMs) on various sequential models including sequence classification and language modeling. We demonstrate that our method achieves competitive results on the aforementioned tasks while using binary/ternary weights during the runtime. On the hardware side, we present custom hardware for accelerating the recurrent computations of LSTMs with binary/ternary weights. Ultimately, we show that LSTMs with binary/ternary weights can achieve up to 12x memory saving and 10x inference speedup compared to the full-precision implementation on an ASIC platform.

Submitted 24 January, 2019; v1 submitted 28 September, 2018; originally announced September 2018.

Comments: Published as a conference paper at ICLR 2019

arXiv:1805.05090 [pdf, other]

doi 10.18637/jss.v089.i12

Hyperspectral Data Analysis in R: the hsdar Package

Authors: Lukas W. Lehnert, Hanna Meyer, Wolfgang A. Obermeier, Brenner Silva, Bianca Regeling, Jörg Bendix

Abstract: Hyperspectral remote sensing is a promising tool for a variety of applications including ecology, geology, analytical chemistry and medical research. This article presents the new hsdar package for R statistical software, which performs a variety of analysis steps taken during a typical hyperspectral remote sensing approach. The package introduces a new class for efficiently storing large hyperspectral datasets such as hyperspectral cubes within R. The package includes several important hyperspectral analysis tools such as continuum removal, normalized ratio indices and integrates two widely used radiation transfer models. In addition, the package provides methods to directly use the functionality of the caret package for machine learning tasks. Two case studies demonstrate the package's range of functionality: First, plant leaf chlorophyll content is estimated and second, cancer in the human larynx is detected from hyperspectral data.

Submitted 14 May, 2018; originally announced May 2018.

arXiv:1706.07355 [pdf, other]

doi 10.1093/bioinformatics/btx552

Three-dimensional Cardiovascular Imaging-Genetics: A Mass Univariate Framework

Authors: Carlo Biffi, Antonio de Marvao, Mark I. Attard, Timothy J. W. Dawes, Nicola Whiffin, Wenjia Bai, Wenzhe Shi, Catherine Francis, Hannah Meyer, Rachel Buchan, Stuart A. Cook, Daniel Rueckert, Declan P. O'Regan

Abstract: MOTIVATION: Left ventricular (LV) hypertrophy is a strong predictor of cardiovascular outcomes, but its genetic regulation remains largely unexplained. Conventional phenotyping relies on manual calculation of LV mass and wall thickness, but advanced cardiac image analysis presents an opportunity for high-throughput mapping of genotype-phenotype associations in three dimensions (3D). RESULTS: High-resolution cardiac magnetic resonance images were automatically segmented in 1,124 healthy volunteers to create a 3D shape model of the heart. Mass univariate regression was used to plot a 3D effect-size map for the association between wall thickness and a set of predictors at each vertex in the mesh. The vertices where a significant effect exists were determined by applying threshold-free cluster enhancement to boost areas of signal with spatial contiguity. Experiments on simulated phenotypic signals and SNP replication show that this approach offers a substantial gain in statistical power for cardiac genotype-phenotype associations while providing good control of the false discovery rate. This framework models the effects of genetic variation throughout the heart and can be automatically applied to large population cohorts. AVAILABILITY: The proposed approach has been coded in an R package freely available at https://doi.org/10.5281/zenodo.834610 together with the clinical data used in this work.

Submitted 13 September, 2017; v1 submitted 22 June, 2017; originally announced June 2017.

Comments: 14 pages, 11 figures. Version accepted by Bioinformatics (Sept 2017). Includes Supplementary Materials

Showing 1–8 of 8 results for author: Meyer, H