Search | arXiv e-print repository

arXiv:2502.20097

Qini curve estimation under clustered network interference

Authors: Rickard K. A. Karlsson, Bram van den Akker, Felipe Moraes, Hugo M. Proença, Jesse H. Krijthe

Abstract: Qini curves are a widely used tool for assessing treatment policies under allocation constraints as they visualize the incremental gain of a new treatment policy versus the cost of its implementation. Standard Qini curve estimation assumes no interference between units: that is, that treating one unit does not influence the outcome of any other unit. In many real-life applications such as public policy or marketing, however, the presence of interference is common. Ignoring interference in these scenarios can lead to systematically biased Qini curves that over- or under-estimate a treatment policy's cost-effectiveness. In this paper, we address the problem of Qini curve estimation under clustered network interference, where interfering units form independent clusters. We propose a formal description of the problem setting with an experimental study design under which we can account for clustered network interference. Within this framework, we introduce three different estimation strategies suited for different conditions. Moreover, we introduce a marketplace simulator that emulates clustered network interference in a typical e-commerce setting. From both theoretical and empirical insights, we provide recommendations in choosing the best estimation strategy by identifying an inherent bias-variance trade-off among the estimation strategies.

Submitted 27 February, 2025; originally announced February 2025.

arXiv:2406.02743

Democratizing Propensity Score Matching Using Web Application

Authors: Adam Gajtkowski, Felipe Moraes

Abstract: Traditionally, data scientists use exploratory data analysis techniques such as correlation analysis, summary statistics, and regression analysis for identifying the most product enhancements and roadmap planning. However, these conventional approaches often yield biased conclusions and suboptimal solutions, leading to a waste of valuable time and missed opportunities for higher-value outcomes. In contrast, there are alternative techniques that involve the use of causal inference methods. However, these methods suffer from issues of limited accessibility, as they are not easily understandable or effectively utilized by inexperienced practitioners. Additionally, their implementation necessitates a substantial investment of time and effort. To this end, this paper tackles these challenges by democratizing one of the causal inference methods called Propensity Score Matching (PSM) and enhancing its accessibility for less technically inclined users through the automation of the entire workflow using a web application. Our approach not only fills this accessibility gap but also contributes to the existing literature by introducing a more rigorous model selection process and an enhanced sensitivity analysis. By overcoming the limitations of traditional exploratory data analysis methods, our web application has empowered data scientists at Booking.com to make better use of PSM, thereby improving the overall efficacy of their analyses.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.02183

Metalearners for Ranking Treatment Effects

Authors: Toon Vanderschueren, Wouter Verbeke, Felipe Moraes, Hugo Manuel Proença

Abstract: Efficiently allocating treatments with a budget constraint constitutes an important challenge across various domains. In marketing, for example, the use of promotions to target potential customers and boost conversions is limited by the available budget. While much research focuses on estimating causal effects, there is relatively limited work on learning to allocate treatments while considering the operational context. Existing methods for uplift modeling or causal inference primarily estimate treatment effects, without considering how this relates to a profit maximizing allocation policy that respects budget constraints. The potential downside of using these methods is that the resulting predictive model is not aligned with the operational context. Therefore, prediction errors are propagated to the optimization of the budget allocation problem, subsequently leading to a suboptimal allocation policy. We propose an alternative approach based on learning to rank. Our proposed methodology directly learns an allocation policy by prioritizing instances in terms of their incremental profit. We propose an efficient sampling procedure for the optimization of the ranking model to scale our methodology to large-scale data sets. Theoretically, we show how learning to rank can maximize the area under a policy's incremental profit curve. Empirically, we validate our methodology and show its effectiveness in practice through a series of experiments on both synthetic and real-world data.

Submitted 3 May, 2024; originally announced May 2024.

Showing 1–3 of 3 results for author: Moraes, F