-
Online Model Evaluation in a Large-Scale Computational Advertising Platform
Authors:
Shahriar Shariat,
Burkay Orten,
Ali Dasdan
Abstract:
Online media provides opportunities for marketers through which they can deliver effective brand messages to a wide range of audiences. Advertising technology platforms enable advertisers to reach their target audience by delivering ad impressions to online users in real time. In order to identify the best marketing message for a user and to purchase impressions at the right price, we rely heavily…
▽ More
Online media provides opportunities for marketers through which they can deliver effective brand messages to a wide range of audiences. Advertising technology platforms enable advertisers to reach their target audience by delivering ad impressions to online users in real time. In order to identify the best marketing message for a user and to purchase impressions at the right price, we rely heavily on bid prediction and optimization models. Even though the bid prediction models are well studied in the literature, the equally important subject of model evaluation is usually overlooked. Effective and reliable evaluation of an online bidding model is crucial for making faster model improvements as well as for utilizing the marketing budgets more efficiently. In this paper, we present an experimentation framework for bid prediction models where our focus is on the practical aspects of model evaluation. Specifically, we outline the unique challenges we encounter in our platform due to a variety of factors such as heterogeneous goal definitions, varying budget requirements across different campaigns, high seasonality and the auction-based environment for inventory purchasing. Then, we introduce return on investment (ROI) as a unified model performance (i.e., success) metric and explain its merits over more traditional metrics such as click-through rate (CTR) or conversion rate (CVR). Most importantly, we discuss commonly used evaluation and metric summarization approaches in detail and propose a more accurate method for online evaluation of new experimental models against the baseline. Our meta-analysis-based approach addresses various shortcomings of other methods and yields statistically robust conclusions that allow us to conclude experiments more quickly in a reliable manner. We demonstrate the effectiveness of our evaluation strategy on real campaign data through some experiments.
△ Less
Submitted 31 August, 2015;
originally announced August 2015.
-
Non-Parametric Methods Applied to the N-Sample Series Comparison
Authors:
Paolo D'Alberto,
Chris Drome,
Ali Dasdan
Abstract:
Anomaly and similarity detection in multidimensional series have a long history and have found practical usage in many different fields such as medicine, networks, and finance. Anomaly detection is of great appeal for many different disciplines; for example, mathematicians searching for a unified mathematical formulation based on probability, statisticians searching for error bound estimates, and…
▽ More
Anomaly and similarity detection in multidimensional series have a long history and have found practical usage in many different fields such as medicine, networks, and finance. Anomaly detection is of great appeal for many different disciplines; for example, mathematicians searching for a unified mathematical formulation based on probability, statisticians searching for error bound estimates, and computer scientists who are trying to design fast algorithms, to name just a few. In summary, we have two contributions: First, we present a self-contained survey of the most promising methods being used in the fields of machine learning, statistics, and bio-informatics today. Included we present discussions about conformal prediction, kernels in the Hilbert space, Kolmogorov's information measure, and non-parametric cumulative distribution function comparison methods (NCDF). Second, building upon this foundation, we provide a powerful NCDF method for series with small dimensionality. Through a combination of data organization and statistical tests, we describe extensions that scale well with increased dimensionality.
△ Less
Submitted 9 May, 2012;
originally announced May 2012.
-
On the Weakenesses of Correlation Measures used for Search Engines' Results (Unsupervised Comparison of Search Engine Rankings)
Authors:
Paolo D'Alberto,
Ali Dasdan
Abstract:
The correlation of the result lists provided by search engines is fundamental and it has deep and multidisciplinary ramifications. Here, we present automatic and unsupervised methods to assess whether or not search engines provide results that are comparable or correlated. We have two main contributions: First, we provide evidence that for more than 80% of the input queries - independently of thei…
▽ More
The correlation of the result lists provided by search engines is fundamental and it has deep and multidisciplinary ramifications. Here, we present automatic and unsupervised methods to assess whether or not search engines provide results that are comparable or correlated. We have two main contributions: First, we provide evidence that for more than 80% of the input queries - independently of their frequency - the two major search engines share only three or fewer URLs in their search results, leading to an increasing divergence. In this scenario (divergence), we show that even the most robust measures based on comparing lists is useless to apply; that is, the small contribution by too few common items will infer no confidence. Second, to overcome this problem, we propose the fist content-based measures - i.e., direct comparison of the contents from search results; these measures are based on the Jaccard ratio and distribution similarity measures (CDF measures). We show that they are orthogonal to each other (i.e., Jaccard and distribution) and extend the discriminative power w.r.t. list based measures. Our approach stems from the real need of comparing search-engine results, it is automatic from the query selection to the final evaluation and it apply to any geographical markets, thus designed to scale and to use as first filtering of query selection (necessary) for supervised methods.
△ Less
Submitted 13 July, 2011;
originally announced July 2011.