Showing 1–8 of 8 results for author: Liu, C H B

Searching in archive cs.
  1. arXiv:2405.03579  [pdf, other]

    stat.AP cs.DB stat.ME

    Some Statistical and Data Challenges When Building Early-Stage Digital Experimentation and Measurement Capabilities

    Authors: C. H. Bryan Liu

    Abstract: Digital experimentation and measurement (DEM) capabilities -- the knowledge and tools necessary to run experiments with digital products, services, or experiences and measure their impact -- are fast becoming part of the standard toolkit of digital/data-driven organisations in guiding business decisions. Many large technology companies report having mature DEM capabilities, and several businesses…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: PhD thesis. Imperial College London. Official library version available on: https://spiral.imperial.ac.uk/handle/10044/1/110307

  2. arXiv:2111.10198  [pdf, other]

    stat.AP cs.DB stat.ME

    Datasets for Online Controlled Experiments

    Authors: C. H. Bryan Liu, Ângelo Cardoso, Paul Couturier, Emma J. McCoy

    Abstract: Online Controlled Experiments (OCE) are the gold standard to measure impact and guide decisions for digital products and services. Despite many methodological advances in this area, the scarcity of public datasets and the lack of a systematic review and categorization hinder its development. We present the first survey and taxonomy for OCE datasets, which highlight the lack of a public dataset to…

    Submitted 14 January, 2022; v1 submitted 19 November, 2021; originally announced November 2021.

    Comments: 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks. 17 pages, 2 figures, 2 tables. Dataset available on Open Science Framework: https://osf.io/64jsb/

  3. arXiv:1807.04098  [pdf, other]

    cs.LG cs.CY cs.IR cs.NE stat.ML

    A Recurrent Neural Network Survival Model: Predicting Web User Return Time

    Authors: Georg L. Grob, Ângelo Cardoso, C. H. Bryan Liu, Duncan A. Little, Benjamin Paul Chamberlain

    Abstract: The size of a website's active user base directly affects its value. Thus, it is important to monitor and influence a user's likelihood to return to a site. Essential to this is predicting when a user will return. Current state of the art approaches to solve this problem come in two flavors: (1) Recurrent Neural Network (RNN) based solutions and (2) survival analysis methods. We observe that both…

    Submitted 11 July, 2018; originally announced July 2018.

    Comments: Accepted into ECML PKDD 2018; 8 figures and 1 table

    Journal ref: Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2018. Lecture Notes in Computer Science, vol 11053. pp 152-168

  4. arXiv:1806.02588  [pdf, other]

    stat.ME cs.DM stat.AP

    Designing Experiments to Measure Incrementality on Facebook

    Authors: C. H. Bryan Liu, Elaine M. Bettaney, Benjamin Paul Chamberlain

    Abstract: The importance of Facebook advertising has risen dramatically in recent years, with the platform accounting for almost 20% of the global online ad spend in 2017. An important consideration in advertising is incrementality: how much of the change in an experimental metric is an advertising campaign responsible for. To measure incrementality, Facebook provide lift studies. As Facebook lift studies d…

    Submitted 11 July, 2018; v1 submitted 7 June, 2018; originally announced June 2018.

    Comments: Accepted into 2018 AdKDD & TargetAd Workshop in conjunction with KDD 2018; 6 pages, 4 figures, and 2 tables

  5. arXiv:1803.06258  [pdf, other]

    stat.ME cs.DM stat.AP

    Online Controlled Experiments for Personalised e-Commerce Strategies: Design, Challenges, and Pitfalls

    Authors: C. H. Bryan Liu, Benjamin Paul Chamberlain

    Abstract: Online controlled experiments are the primary tool for measuring the causal impact of product changes in digital businesses. It is increasingly common for digital products and services to interact with customers in a personalised way. Using online controlled experiments to optimise personalised interaction strategies is challenging because the usual assumption of statistically equivalent user grou…

    Submitted 1 July, 2021; v1 submitted 16 March, 2018; originally announced March 2018.

    Comments: Not peer-reviewed but retained for historic interest. Removed an erroneous statement on Welch's t-test assumptions in Section 3.2. 9 pages, 7 figures

  6. Speeding Up BigClam Implementation on SNAP

    Authors: C. H. Bryan Liu, Benjamin Paul Chamberlain

    Abstract: We perform a detailed analysis of the C++ implementation of the Cluster Affiliation Model for Big Networks (BigClam) on the Stanford Network Analysis Project (SNAP). BigClam is a popular graph mining algorithm that is capable of finding overlapping communities in networks containing millions of nodes. Our analysis shows a key stage of the algorithm - determining if a node belongs to a community -…

    Submitted 4 September, 2018; v1 submitted 4 December, 2017; originally announced December 2017.

    Comments: To appear in 2018 Imperial College Computing Student Workshop (ICCSW'18); 12 pages, 4 figures, and 3 tables

    Journal ref: 2018 Imperial College Computing Student Workshop (ICCSW 2018). OpenAccess Series in Informatics (OASIcs), vol. 66, pp. 1:1-1:13

  7. arXiv:1706.09865  [pdf, other]

    stat.ML cs.CY cs.LG

    Generalising Random Forest Parameter Optimisation to Include Stability and Cost

    Authors: C. H. Bryan Liu, Benjamin Paul Chamberlain, Duncan A. Little, Angelo Cardoso

    Abstract: Random forests are among the most popular classification and regression methods used in industrial applications. To be effective, the parameters of random forests must be carefully tuned. This is usually done by choosing values that minimize the prediction error on a held out dataset. We argue that error reduction is only one of several metrics that must be considered when optimizing random forest…

    Submitted 13 July, 2017; v1 submitted 29 June, 2017; originally announced June 2017.

    Comments: To appear in ECML-PKDD 2017

    Journal ref: Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2017. LNCS vol 10536, pp. 102-113 (2017)

  8. arXiv:1703.02596  [pdf, other]

    cs.LG cs.CY cs.IR cs.NE stat.ML

    Customer Lifetime Value Prediction Using Embeddings

    Authors: Benjamin Paul Chamberlain, Angelo Cardoso, C. H. Bryan Liu, Roberto Pagliari, Marc Peter Deisenroth

    Abstract: We describe the Customer LifeTime Value (CLTV) prediction system deployed at ASOS.com, a global online fashion retailer. CLTV prediction is an important problem in e-commerce where an accurate estimate of future value allows retailers to effectively allocate marketing spend, identify and nurture high value customers and mitigate exposure to losses. The system at ASOS provides daily estimates of th…

    Submitted 6 July, 2017; v1 submitted 7 March, 2017; originally announced March 2017.

    Comments: 10 pages, 11 figures

    Journal ref: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pages 1753-1762, 2017