Showing 1–14 of 14 results for author: Masoero, L

  1. arXiv:2506.12654

    stat.ME stat.AP

    Robust and efficient multiple-unit switchback experimentation

    Authors: Paul Missault, Lorenzo Masoero, Christian Delbé, Thomas Richardson, Guido Imbens

    Abstract: User-randomized A/B testing has emerged as the gold standard for online experimentation. However, when this kind of approach is not feasible due to legal, ethical or practical considerations, experimenters have to consider alternatives like item-randomization. Item-randomization is often met with skepticism due to its poor empirical performance. To fill this gap, in this paper we introduce a novel…

    Submitted 14 June, 2025; originally announced June 2025.

  2. arXiv:2505.19643

    stat.AP

    Online activity prediction via generalized Indian buffet process models

    Authors: Mario Beraha, Lorenzo Masoero, Stefano Favaro, Thomas S. Richardson

    Abstract: Online A/B experiments generate millions of user-activity records each day, yet experimenters need timely forecasts to guide roll-outs and safeguard user experience. Motivated by the problem of activity prediction for A/B tests at Amazon, we introduce a Bayesian nonparametric model for predicting both first-time and repeat triggers in web experiments. The model is based on the stable beta-scaled p…

    Submitted 26 May, 2025; originally announced May 2025.

    Comments: This paper supersedes two technical reports by the same authors, arXiv:2401.14722 and arXiv:2402.03231

  3. arXiv:2405.18621

    cs.LG stat.ME stat.ML

    Multi-Armed Bandits with Network Interference

    Authors: Abhineet Agarwal, Anish Agarwal, Lorenzo Masoero, Justin Whitehouse

    Abstract: Online experimentation with interference is a common challenge in modern applications such as e-commerce and adaptive clinical trials in medicine. For example, in online marketplaces, the revenue of a good depends on discounts applied to competing goods. Statistical inference with interference is widely studied in the offline setting, but far less is known about how to adaptively assign treatments…

    Submitted 28 May, 2024; originally announced May 2024.

  4. arXiv:2403.02154

    stat.ME q-bio.GN q-bio.QM

    Double trouble: Predicting new variant counts across two heterogeneous populations

    Authors: Yunyi Shen, Lorenzo Masoero, Joshua G. Schraiber, Tamara Broderick

    Abstract: Collecting genomics data across multiple heterogeneous populations (e.g., across different cancer types) has the potential to improve our understanding of disease. Despite sequencing advances, though, resources often remain a constraint when gathering data. So it would be useful for experimental design if experimenters with access to a pilot study could predict the number of new variants they migh…

    Submitted 4 March, 2024; originally announced March 2024.

  5. arXiv:2402.03231

    stat.ME cs.LG stat.AP

    Improved prediction of future user activity in online A/B testing

    Authors: Lorenzo Masoero, Mario Beraha, Thomas Richardson, Stefano Favaro

    Abstract: In online randomized experiments or A/B tests, accurate predictions of participant inclusion rates are of paramount importance. These predictions not only guide experimenters in optimizing the experiment's duration but also enhance the precision of treatment effect estimates. In this paper we present a novel, straightforward, and scalable Bayesian nonparametric approach for predicting the rate at…

    Submitted 5 February, 2024; originally announced February 2024.

  6. arXiv:2401.14722

    stat.ME cs.LG stat.AP stat.ML

    A Nonparametric Bayes Approach to Online Activity Prediction

    Authors: Mario Beraha, Lorenzo Masoero, Stefano Favaro, Thomas S. Richardson

    Abstract: Accurately predicting the onset of specific activities within defined timeframes holds significant importance in several applied contexts. In particular, accurate prediction of the number of future users that will be exposed to an intervention is an important piece of information for experimenters running online experiments (A/B tests). In this work, we propose a novel approach to predict the numb…

    Submitted 26 January, 2024; originally announced January 2024.

  7. arXiv:2401.01264

    stat.ME

    Multiple Randomization Designs: Estimation and Inference with Interference

    Authors: Lorenzo Masoero, Suhas Vijaykumar, Thomas Richardson, James McQueen, Ido Rosen, Brian Burdick, Pat Bajari, Guido Imbens

    Abstract: Classical designs of randomized experiments, going back to Fisher and Neyman in the 1930s, still dominate practice even in online experimentation. However, such designs are of limited value for answering standard questions in settings, common in marketplaces, where multiple populations of agents interact strategically, leading to complex patterns of spillover effects. In this paper, we discuss new…

    Submitted 2 January, 2024; originally announced January 2024.

  8. arXiv:2305.01109

    stat.ME stat.AP

    Leveraging covariate adjustments at scale in online A/B testing

    Authors: Lorenzo Masoero, Doug Hains, James McQueen

    Abstract: Companies offering web services routinely run randomized online experiments to estimate the causal impact associated with the adoption of new features and policies on key performance metrics of interest. These experiments are used to estimate a variety of effects: the increase in click rate due to the repositioning of a banner, the impact on subscription rate as a consequence of a discount or spec…

    Submitted 1 May, 2023; originally announced May 2023.

    Journal ref: 2023 ACM SIGKDD Workshop on Causal Discovery, Prediction and Decision

  9. arXiv:2202.01910

    stat.ME stat.AP

    Cross-Study Replicability in Cluster Analysis

    Authors: Lorenzo Masoero, Emma Thomas, Giovanni Parmigiani, Svitlana Tyekucheva, Lorenzo Trippa

    Abstract: In cancer research, clustering techniques are widely used for exploratory analyses and dimensionality reduction, playing a critical role in the identification of novel cancer subtypes, often with direct implications for patient management. As data collected by multiple research groups grows, it is increasingly feasible to investigate the replicability of clustering procedures, that is, their abili…

    Submitted 9 May, 2023; v1 submitted 3 February, 2022; originally announced February 2022.

    Comments: Accepted for publication in Statistical Science

  10. arXiv:2112.13495

    stat.ME cs.SI econ.EM math.ST

    Multiple Randomization Designs

    Authors: Patrick Bajari, Brian Burdick, Guido W. Imbens, Lorenzo Masoero, James McQueen, Thomas Richardson, Ido M. Rosen

    Abstract: In this study we introduce a new class of experimental designs. In a classical randomized controlled trial (RCT), or A/B test, a randomly selected subset of a population of units (e.g., individuals, plots of land, or experiences) is assigned to a treatment (treatment A), and the remainder of the population is assigned to the control treatment (treatment B). The difference in average outcome by tre…

    Submitted 26 December, 2021; originally announced December 2021.

    Comments: 57 pages, 7 figures

    MSC Class: 62B15 (Primary) 91B82; 91B26; 91C20; 91B80; 91C20 (Secondary)

    ACM Class: J.4; G.3; I.2.6

  11. arXiv:2112.02032

    stat.ME q-bio.GN q-bio.QM

    Bayesian nonparametric strategies for power maximization in rare variants association studies

    Authors: Lorenzo Masoero, Joshua Schraiber, Tamara Broderick

    Abstract: Rare variants are hypothesized to be largely responsible for heritability and susceptibility to disease in humans. So rare variants association studies hold promise for understanding disease. Conversely though, the rareness of the variants poses practical challenges; since these variants are present in few individuals, it can be difficult to develop data-collection and statistical methods that eff…

    Submitted 3 December, 2021; originally announced December 2021.

  12. arXiv:2106.15480

    stat.ME

    Scaled process priors for Bayesian nonparametric estimation of the unseen genetic variation

    Authors: Federico Camerlenghi, Stefano Favaro, Lorenzo Masoero, Tamara Broderick

    Abstract: There is a growing interest in the estimation of the number of unseen features, mostly driven by biological applications. A recent work brought out a peculiar property of the popular completely random measures (CRMs) as prior models in Bayesian nonparametric (BNP) inference for the unseen-features problem: for fixed prior's parameters, they all lead to a Poisson posterior distribution for the numb…

    Submitted 19 February, 2022; v1 submitted 29 June, 2021; originally announced June 2021.

  13. arXiv:2009.10780

    stat.ME math.ST stat.ML

    Independent finite approximations for Bayesian nonparametric inference

    Authors: Tin D. Nguyen, Jonathan Huggins, Lorenzo Masoero, Lester Mackey, Tamara Broderick

    Abstract: Completely random measures (CRMs) and their normalizations (NCRMs) offer flexible models in Bayesian nonparametrics. But their infinite dimensionality presents challenges for inference. Two popular finite approximations are truncated finite approximations (TFAs) and independent finite approximations (IFAs). While the former have been well-studied, IFAs lack similarly general bounds on approximatio…

    Submitted 5 November, 2023; v1 submitted 22 September, 2020; originally announced September 2020.

    Comments: The paper has been accepted for publication in Bayesian Analysis. Currently, it is posted on Bayesian Analysis Advance Publication

  14. More for less: Predicting and maximizing genetic variant discovery via Bayesian nonparametrics

    Authors: Lorenzo Masoero, Federico Camerlenghi, Stefano Favaro, Tamara Broderick

    Abstract: While the cost of sequencing genomes has decreased dramatically in recent years, this expense often remains non-trivial. Under a fixed budget, then, scientists face a natural trade-off between quantity and quality; they can spend resources to sequence a greater number of genomes (quantity) or spend resources to sequence genomes with increased accuracy (quality). Our goal is to find the optimal all…

    Submitted 12 February, 2021; v1 submitted 11 December, 2019; originally announced December 2019.