Showing 1–2 of 2 results for author: Little, D A

Search v0.5.6 released 2020-02-24

arXiv:1807.04098 [pdf, other]

cs.LG cs.CY cs.IR cs.NE stat.ML

doi 10.1007/978-3-030-10997-4_10

A Recurrent Neural Network Survival Model: Predicting Web User Return Time

Authors: Georg L. Grob, Ângelo Cardoso, C. H. Bryan Liu, Duncan A. Little, Benjamin Paul Chamberlain

Abstract: The size of a website's active user base directly affects its value. Thus, it is important to monitor and influence a user's likelihood to return to a site. Essential to this is predicting when a user will return. Current state of the art approaches to solve this problem come in two flavors: (1) Recurrent Neural Network (RNN) based solutions and (2) survival analysis methods. We observe that both… ▽ More The size of a website's active user base directly affects its value. Thus, it is important to monitor and influence a user's likelihood to return to a site. Essential to this is predicting when a user will return. Current state of the art approaches to solve this problem come in two flavors: (1) Recurrent Neural Network (RNN) based solutions and (2) survival analysis methods. We observe that both techniques are severely limited when applied to this problem. Survival models can only incorporate aggregate representations of users instead of automatically learning a representation directly from a raw time series of user actions. RNNs can automatically learn features, but can not be directly trained with examples of non-returning users who have no target value for their return time. We develop a novel RNN survival model that removes the limitations of the state of the art methods. We demonstrate that this model can successfully be applied to return time prediction on a large e-commerce dataset with a superior ability to discriminate between returning and non-returning users than either method applied in isolation. △ Less

Submitted 11 July, 2018; originally announced July 2018.

Comments: Accepted into ECML PKDD 2018; 8 figures and 1 table

Journal ref: Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2018. Lecture Notes in Computer Science, vol 11053. pp 152-168
arXiv:1706.09865 [pdf, other]

stat.ML cs.CY cs.LG

doi 10.1007/978-3-319-71273-4_9

Generalising Random Forest Parameter Optimisation to Include Stability and Cost

Authors: C. H. Bryan Liu, Benjamin Paul Chamberlain, Duncan A. Little, Angelo Cardoso

Abstract: Random forests are among the most popular classification and regression methods used in industrial applications. To be effective, the parameters of random forests must be carefully tuned. This is usually done by choosing values that minimize the prediction error on a held out dataset. We argue that error reduction is only one of several metrics that must be considered when optimizing random forest… ▽ More Random forests are among the most popular classification and regression methods used in industrial applications. To be effective, the parameters of random forests must be carefully tuned. This is usually done by choosing values that minimize the prediction error on a held out dataset. We argue that error reduction is only one of several metrics that must be considered when optimizing random forest parameters for commercial applications. We propose a novel metric that captures the stability of random forests predictions, which we argue is key for scenarios that require successive predictions. We motivate the need for multi-criteria optimization by showing that in practical applications, simply choosing the parameters that lead to the lowest error can introduce unnecessary costs and produce predictions that are not stable across independent runs. To optimize this multi-criteria trade-off, we present a new framework that efficiently finds a principled balance between these three considerations using Bayesian optimisation. The pitfalls of optimising forest parameters purely for error reduction are demonstrated using two publicly available real world datasets. We show that our framework leads to parameter settings that are markedly different from the values discovered by error reduction metrics. △ Less

Submitted 13 July, 2017; v1 submitted 29 June, 2017; originally announced June 2017.

Comments: To appear in ECML-PKDD 2017

Journal ref: Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2017. LNCS vol 10536, pp. 102-113 (2017)

Search v0.5.6 released 2020-02-24