-
Equilibria and Learning in Modular Marketplaces
Authors:
Kshipra Bhawalkar,
Jeff Dean,
Christopher Liaw,
Aranyak Mehta,
Neel Patel
Abstract:
We envision a marketplace where diverse entities offer specialized "modules" through APIs, allowing users to compose the outputs of these modules for complex tasks within a given budget. This paper studies the market design problem in such an ecosystem, where module owners strategically set prices for their APIs (to maximize their profit) and a central platform orchestrates the aggregation of module outputs at query-time. One can also think about this as a first-price procurement auction with budgets. The first observation is that if the platform's algorithm is to find the optimal set of modules then this could result in a poor outcome, in the sense that there are price equilibria which provide arbitrarily low value for the user. We show that under a suitable version of the "bang-per-buck" algorithm for the knapsack problem, an $\varepsilon$-approximate equilibrium always exists, for any arbitrary $\varepsilon > 0$. Further, our first main result shows that with this algorithm any such equilibrium provides a constant approximation to the optimal value that the buyer could get under various constraints including (i) a budget constraint and (ii) a budget and a matroid constraint. Finally, we demonstrate that these efficient equilibria can be learned through decentralized price adjustments by module owners using no-regret learning algorithms.
Submitted 27 February, 2025;
originally announced February 2025.
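The "bang-per-buck" rule named in the abstract above is the classic greedy heuristic for knapsack: rank items by value per unit price and take them while the budget allows. The following is a minimal sketch of that textbook rule only. The abstract refers to "a suitable version" of the algorithm, which may differ, and the module names, values, and prices below are hypothetical.

```python
from typing import List, Tuple


def bang_per_buck(modules: List[Tuple[str, float, float]], budget: float) -> List[str]:
    """Greedy knapsack selection in decreasing value/price order.

    Each module is (name, value, price) with price > 0. Modules that do not
    fit in the remaining budget are skipped (an assumption of this sketch).
    """
    ranked = sorted(modules, key=lambda m: m[1] / m[2], reverse=True)
    chosen: List[str] = []
    spent = 0.0
    for name, _value, price in ranked:
        if spent + price <= budget:
            chosen.append(name)
            spent += price
    return chosen


# Hypothetical modules: (name, value to the user, posted price).
example_modules = [("ocr", 6.0, 2.0), ("translate", 5.0, 5.0), ("summarize", 4.0, 1.0)]
selected = bang_per_buck(example_modules, budget=4.0)
```

Here `summarize` (ratio 4) and `ocr` (ratio 3) are taken, and `translate` (ratio 1) no longer fits in the budget of 4.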
-
Auto-bidding and Auctions in Online Advertising: A Survey
Authors:
Gagan Aggarwal,
Ashwinkumar Badanidiyuru,
Santiago R. Balseiro,
Kshipra Bhawalkar,
Yuan Deng,
Zhe Feng,
Gagan Goel,
Christopher Liaw,
Haihao Lu,
Mohammad Mahdian,
Jieming Mao,
Aranyak Mehta,
Vahab Mirrokni,
Renato Paes Leme,
Andres Perlroth,
Georgios Piliouras,
Jon Schneider,
Ariel Schvartzman,
Balasubramanian Sivan,
Kelly Spendlove,
Yifeng Teng,
Di Wang,
Hanrui Zhang,
Mingfei Zhao,
Wennan Zhu, et al. (1 additional author not shown)
Abstract:
In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen as markets embrace this autobidding and cover a range of topics in this area, including bidding algorithms, equilibrium analysis and efficiency of common auction formats, and optimal auction design.
Submitted 14 August, 2024;
originally announced August 2024.
-
Agnostic Private Density Estimation for GMMs via List Global Stability
Authors:
Mohammad Afzali,
Hassan Ashtiani,
Christopher Liaw
Abstract:
We consider the problem of private density estimation for mixtures of unrestricted high dimensional Gaussians in the agnostic setting. We prove the first upper bound on the sample complexity of this problem. Previously, private learnability of high dimensional GMMs was only known in the realizable setting [Afzali et al., 2024].
To prove our result, we exploit the notion of $\textit{list global stability}$ [Ghazi et al., 2021b,a] that was originally introduced in the context of private supervised learning. We define an agnostic variant of this definition, showing that its existence is sufficient for agnostic private density estimation. We then construct an agnostic list globally stable learner for GMMs.
Submitted 6 October, 2024; v1 submitted 5 July, 2024;
originally announced July 2024.
-
Improved Online Learning Algorithms for CTR Prediction in Ad Auctions
Authors:
Zhe Feng,
Christopher Liaw,
Zixin Zhou
Abstract:
In this work, we investigate the online learning problem of revenue maximization in ad auctions, where the seller needs to learn the click-through rates (CTRs) of each ad candidate and charge the price of the winner in a pay-per-click manner. We focus on two models of the advertisers' strategic behaviors. First, we assume that the advertiser is completely myopic; i.e., in each round, they aim to maximize their utility only for the current round. In this setting, we develop an online mechanism based on upper-confidence bounds that achieves a tight $O(\sqrt{T})$ regret in the worst-case and negative regret when the values are static across all the auctions and there is a gap between the highest expected value (i.e., value multiplied by their CTR) and second highest expected value ad. Next, we assume that the advertiser is non-myopic and cares about their long term utility. This setting is much more complex since an advertiser is incentivized to influence the mechanism by bidding strategically in earlier rounds. In this setting, we provide an algorithm to achieve negative regret for the static valuation setting (with a positive gap), which is in sharp contrast with the prior work that shows $O(T^{2/3})$ regret when the valuation is generated by an adversary.
Submitted 29 February, 2024;
originally announced March 2024.
-
Efficiency of Non-Truthful Auctions in Auto-bidding with Budget Constraints
Authors:
Christopher Liaw,
Aranyak Mehta,
Wennan Zhu
Abstract:
We study the efficiency of non-truthful auctions for auto-bidders with both return on spend (ROS) and budget constraints. The efficiency of a mechanism is measured by the price of anarchy (PoA), which is the worst case ratio between the liquid welfare of any equilibrium and the optimal (possibly randomized) allocation. Our first main result is that the first-price auction (FPA) is optimal, among deterministic mechanisms, in this setting. Without any assumptions, the PoA of FPA is $n$ which we prove is tight for any deterministic mechanism. However, under a mild assumption that a bidder's value for any query does not exceed their total budget, we show that the PoA is at most $2$. This bound is also tight as it matches the optimal PoA without a budget constraint. We next analyze two randomized mechanisms: randomized FPA (rFPA) and "quasi-proportional" FPA. We prove two results that highlight the efficacy of randomization in this setting. First, we show that the PoA of rFPA for two bidders is at most $1.8$ without requiring any assumptions. This extends prior work which focused only on an ROS constraint. Second, we show that quasi-proportional FPA has a PoA of $2$ for any number of bidders, without any assumptions. Both of these bypass lower bounds in the deterministic setting. Finally, we study the setting where bidders are assumed to bid uniformly. We show that uniform bidding can be detrimental for efficiency in deterministic mechanisms while being beneficial for randomized mechanisms, which is in stark contrast with the settings without budget constraints.
Submitted 18 April, 2024; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples
Authors:
Mohammad Afzali,
Hassan Ashtiani,
Christopher Liaw
Abstract:
We study the problem of estimating mixtures of Gaussians under the constraint of differential privacy (DP). Our main result is that $\text{poly}(k,d,1/α,1/\varepsilon,\log(1/δ))$ samples are sufficient to estimate a mixture of $k$ Gaussians in $\mathbb{R}^d$ up to total variation distance $α$ while satisfying $(\varepsilon, δ)$-DP. This is the first finite sample complexity upper bound for the problem that does not make any structural assumptions on the GMMs.
To solve the problem, we devise a new framework which may be useful for other tasks. On a high level, we show that if a class of distributions (such as Gaussians) is (1) list decodable and (2) admits a "locally small" cover (Bun et al., 2021) with respect to total variation distance, then the class of its mixtures is privately learnable. The proof circumvents a known barrier indicating that, unlike Gaussians, GMMs do not admit a locally small cover (Aden-Ali et al., 2021b).
Submitted 23 April, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
-
The Power of Two-sided Recruitment in Two-sided Markets
Authors:
Yang Cai,
Christopher Liaw,
Aranyak Mehta,
Mingfei Zhao
Abstract:
We consider the problem of maximizing the gains from trade (GFT) in two-sided markets. The seminal impossibility result by Myerson and Satterthwaite shows that even for bilateral trade, there is no individually rational (IR), Bayesian incentive compatible (BIC) and budget balanced (BB) mechanism that can achieve the full GFT. Moreover, the optimal BIC, IR and BB mechanism that maximizes the GFT is known to be complex and heavily depends on the prior. In this paper, we pursue a Bulow-Klemperer-style question, i.e., does augmentation allow for prior-independent mechanisms to compete against the optimal mechanism? Our first main result shows that in the double auction setting with $m$ i.i.d. buyers and $n$ i.i.d. sellers, by adding $O(1)$ buyers and sellers to the market, the GFT of a simple, dominant strategy incentive compatible (DSIC), and prior-independent mechanism in the augmented market is at least the optimal in the original market, when the buyers' distribution first-order stochastically dominates the sellers' distribution. Next, we go beyond the i.i.d. setting and study the power of two-sided recruitment in more general markets. Our second main result is that for any $ε > 0$ and any set of $O(1/ε)$ buyers and sellers where the buyers' value exceeds the sellers' value with constant probability, if we add these additional agents into any market with arbitrary correlations, the Trade Reduction mechanism obtains a $(1-ε)$-approximation of the GFT of the augmented market. Importantly, the newly recruited agents are agnostic to the original market.
Submitted 28 March, 2024; v1 submitted 7 July, 2023;
originally announced July 2023.
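The Trade Reduction mechanism mentioned in the abstract above is commonly described as follows: sort buyers by decreasing value and sellers by increasing cost, find the number $k$ of efficient trades, drop the least efficient one, and use the dropped pair's reports as prices. The sketch below follows that textbook description. The paper's exact variant and tie-breaking may differ, and the function name, return format, and sample numbers are assumptions made for illustration.

```python
from typing import Dict, List


def trade_reduction(buyer_values: List[float], seller_costs: List[float]) -> Dict[str, float]:
    """Textbook Trade Reduction for a double auction with unit supply and demand."""
    buyers = sorted(buyer_values, reverse=True)  # highest value first
    sellers = sorted(seller_costs)               # lowest cost first

    # k = number of efficient trades, i.e. pairs with buyer value >= seller cost.
    k = 0
    while k < min(len(buyers), len(sellers)) and buyers[k] >= sellers[k]:
        k += 1

    if k <= 1:
        # Dropping the least efficient trade leaves nothing to execute.
        return {"trades": 0, "buyer_price": 0.0, "seller_price": 0.0, "gft": 0.0}

    trades = k - 1
    return {
        "trades": trades,
        "buyer_price": buyers[k - 1],    # each trading buyer pays the k-th highest value
        "seller_price": sellers[k - 1],  # each trading seller receives the k-th lowest cost
        "gft": sum(buyers[:trades]) - sum(sellers[:trades]),
    }


# Hypothetical market: two efficient trades exist, one is executed.
outcome = trade_reduction([10.0, 8.0, 5.0, 2.0], [1.0, 3.0, 6.0, 9.0])
```

In this example the efficient allocation has GFT $(10-1)+(8-3)=14$, while the mechanism executes one trade with GFT $9$; the buyer pays $8$ and the seller receives $3$, so no subsidy is required.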
-
Polynomial Time and Private Learning of Unbounded Gaussian Mixture Models
Authors:
Jamil Arbas,
Hassan Ashtiani,
Christopher Liaw
Abstract:
We study the problem of privately estimating the parameters of $d$-dimensional Gaussian Mixture Models (GMMs) with $k$ components. For this, we develop a technique to reduce the problem to its non-private counterpart. This allows us to privatize existing non-private algorithms in a blackbox manner, while incurring only a small overhead in the sample complexity and running time. As the main application of our framework, we develop an $(\varepsilon, δ)$-differentially private algorithm to learn GMMs using the non-private algorithm of Moitra and Valiant [MV10] as a blackbox. Consequently, this gives the first sample complexity upper bound and first polynomial time algorithm for privately learning GMMs without any boundedness assumptions on the parameters. As part of our analysis, we prove a tight (up to a constant factor) lower bound on the total variation distance of high-dimensional Gaussians which can be of independent interest.
Submitted 7 June, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
User Response in Ad Auctions: An MDP Formulation of Long-Term Revenue Optimization
Authors:
Yang Cai,
Zhe Feng,
Christopher Liaw,
Aranyak Mehta,
Grigoris Velegkas
Abstract:
We propose a new Markov Decision Process (MDP) model for ad auctions to capture the user response to the quality of ads, with the objective of maximizing the long-term discounted revenue. By incorporating user response, our model takes into consideration all three parties involved in the auction (advertiser, auctioneer, and user). The state of the user is modeled as a user-specific click-through rate (CTR) with the CTR changing in the next round according to the set of ads shown to the user in the current round. We characterize the optimal mechanism for this MDP as a Myerson's auction with a notion of modified virtual value, which relies on the value distribution of the advertiser, the current user state, and the future impact of showing the ad to the user. Leveraging this characterization, we design a sample-efficient and computationally-efficient algorithm which outputs an approximately optimal policy that requires only sample access to the true MDP and the value distributions of the bidders. Finally, we propose a simple mechanism built upon second price auctions with personalized reserve prices and show it can achieve a constant-factor approximation to the optimal long term discounted revenue.
Submitted 5 May, 2024; v1 submitted 16 February, 2023;
originally announced February 2023.
-
Efficiency of non-truthful auctions under auto-bidding
Authors:
Christopher Liaw,
Aranyak Mehta,
Andres Perlroth
Abstract:
Auto-bidding is now widely adopted as an interface between advertisers and internet advertising as it allows advertisers to specify high-level goals, such as maximizing value subject to a value-per-spend constraint. Prior research has mostly focused on auctions which are truthful (such as SPA) since uniform bidding is optimal in such auctions, which makes it manageable to reason about equilibria. A tantalizing question is whether one can obtain more efficient outcomes by leaving the realm of truthful auctions.
This is the first paper to study non-truthful auctions in the prior-free auto-bidding setting. Our first result is that non-truthfulness provides no benefit when one considers deterministic auctions. Any deterministic mechanism has a price of anarchy (PoA) of at least $2$, even for $2$ bidders; this matches what can be achieved by deterministic truthful mechanisms. In particular, we prove that the first price auction has PoA of exactly $2$. For our second result, we construct a randomized non-truthful auction that achieves a PoA of $1.8$ for $2$ bidders. This is the best-known PoA for this problem. The previously best-known PoA for this problem was $1.9$ and was achieved with a truthful mechanism. Moreover, we demonstrate the benefit of non-truthfulness in this setting by showing that the truthful version of this randomized auction also has a PoA of $1.9$. Finally, we show that no auction (even randomized, non-truthful) can improve upon a PoA bound of $2$ as the number of advertisers grows to infinity.
Submitted 7 July, 2022;
originally announced July 2022.
-
Continuous Prediction with Experts' Advice
Authors:
Victor Sanches Portella,
Christopher Liaw,
Nicholas J. A. Harvey
Abstract:
Prediction with experts' advice is one of the most fundamental problems in online learning and captures many of its technical challenges. A recent line of work has looked at online learning through the lens of differential equations and continuous-time analysis. This viewpoint has yielded optimal results for several problems in online learning.
In this paper, we employ continuous-time stochastic calculus in order to study the discrete-time experts' problem. We use these tools to design a continuous-time, parameter-free algorithm with improved guarantees for the quantile regret. We then develop an analogous discrete-time algorithm with a very similar analysis and identical quantile regret bounds. Finally, we design an anytime continuous-time algorithm with regret matching the optimal fixed-time rate when the gains are independent Brownian Motions; in many settings, this is the most difficult case. This gives some evidence that, even with adversarial gains, the optimal anytime and fixed-time regrets may coincide.
Submitted 30 September, 2022; v1 submitted 1 June, 2022;
originally announced June 2022.
-
Private and polynomial time algorithms for learning Gaussians and beyond
Authors:
Hassan Ashtiani,
Christopher Liaw
Abstract:
We present a fairly general framework for reducing $(\varepsilon, δ)$ differentially private (DP) statistical estimation to its non-private counterpart. As the main application of this framework, we give a polynomial time and $(\varepsilon,δ)$-DP algorithm for learning (unrestricted) Gaussian distributions in $\mathbb{R}^d$. The sample complexity of our approach for learning the Gaussian up to total variation distance $α$ is $\widetilde{O}(d^2/α^2 + d^2\sqrt{\ln(1/δ)}/α\varepsilon + d\ln(1/δ) / α\varepsilon)$ matching (up to logarithmic factors) the best known information-theoretic (non-efficient) sample complexity upper bound due to Aden-Ali, Ashtiani, and Kamath (ALT'21). In an independent work, Kamath, Mouzakis, Singhal, Steinke, and Ullman (arXiv:2111.04609) proved a similar result using a different approach and with $O(d^{5/2})$ sample complexity dependence on $d$. As another application of our framework, we provide the first polynomial time $(\varepsilon, δ)$-DP algorithm for robust learning of (unrestricted) Gaussians with sample complexity $\widetilde{O}(d^{3.5})$. In another independent work, Kothari, Manurangsi, and Velingker (arXiv:2112.03548) also provided a polynomial time $(\varepsilon, δ)$-DP algorithm for robust learning of Gaussians with sample complexity $\widetilde{O}(d^8)$.
Submitted 22 June, 2022; v1 submitted 22 November, 2021;
originally announced November 2021.
-
Privately Learning Mixtures of Axis-Aligned Gaussians
Authors:
Ishaq Aden-Ali,
Hassan Ashtiani,
Christopher Liaw
Abstract:
We consider the problem of learning mixtures of Gaussians under the constraint of approximate differential privacy. We prove that $\widetilde{O}(k^2 d \log^{3/2}(1/δ) / α^2 \varepsilon)$ samples are sufficient to learn a mixture of $k$ axis-aligned Gaussians in $\mathbb{R}^d$ to within total variation distance $α$ while satisfying $(\varepsilon, δ)$-differential privacy. This is the first result for privately learning mixtures of unbounded axis-aligned (or even unbounded univariate) Gaussians. If the covariance matrix of each of the Gaussians is the identity matrix, we show that $\widetilde{O}(kd/α^2 + kd \log(1/δ) / α\varepsilon)$ samples are sufficient.
Recently, the "local covering" technique of Bun, Kamath, Steinke, and Wu has been successfully used for privately learning high-dimensional Gaussians with a known covariance matrix and extended to privately learning general high-dimensional Gaussians by Aden-Ali, Ashtiani, and Kamath. Given these positive results, this approach has been proposed as a promising direction for privately learning mixtures of Gaussians. Unfortunately, we show that this is not possible.
We design a new technique for privately learning mixture distributions. A class of distributions $\mathcal{F}$ is said to be list-decodable if there is an algorithm that, given "heavily corrupted" samples from $f\in \mathcal{F}$, outputs a list of distributions, $\widehat{\mathcal{F}}$, such that one of the distributions in $\widehat{\mathcal{F}}$ approximates $f$. We show that if $\mathcal{F}$ is privately list-decodable, then we can privately learn mixtures of distributions in $\mathcal{F}$. Finally, we show axis-aligned Gaussian distributions are privately list-decodable, thereby proving mixtures of such distributions are privately learnable.
Submitted 3 June, 2021;
originally announced June 2021.
-
Convergence Analysis of No-Regret Bidding Algorithms in Repeated Auctions
Authors:
Zhe Feng,
Guru Guruganesh,
Christopher Liaw,
Aranyak Mehta,
Abhishek Sethi
Abstract:
The connection between games and no-regret algorithms has been widely studied in the literature. A fundamental result is that when all players play no-regret strategies, this produces a sequence of actions whose time-average is a coarse-correlated equilibrium of the game. However, much less is known about equilibrium selection in the case that multiple equilibria exist.
In this work, we study the convergence of no-regret bidding algorithms in auctions. Besides being of theoretical interest, bidding dynamics in auctions is an important question from a practical viewpoint as well. We study a repeated game between bidders in which a single item is sold at each time step and the bidder's value is drawn from an unknown distribution. We show that if the bidders use any mean-based learning rule then the bidders converge with high probability to the truthful pure Nash Equilibrium in a second price auction, in a VCG auction in the multi-slot setting, and to the Bayesian Nash equilibrium in a first price auction. We note mean-based algorithms cover a wide variety of known no-regret algorithms such as Exp3, UCB, $ε$-Greedy etc. Also, we analyze the convergence of the individual iterates produced by such learning algorithms, as opposed to the time-average of the sequence. Our experiments corroborate our theoretical findings and also find a similar convergence when we use other strategies such as Deep Q-Learning.
Submitted 13 September, 2020;
originally announced September 2020.
-
Optimal anytime regret with two experts
Authors:
Nicholas J. A. Harvey,
Christopher Liaw,
Edwin Perkins,
Sikander Randhawa
Abstract:
We consider the classical problem of prediction with expert advice. In the fixed-time setting, where the time horizon is known in advance, algorithms that achieve the optimal regret are known when there are two, three, or four experts or when the number of experts is large. Much less is known about the problem in the anytime setting, where the time horizon is not known in advance. No minimax optimal algorithm was previously known in the anytime setting, regardless of the number of experts. Even for the case of two experts, Luo and Schapire have left open the problem of determining the optimal algorithm.
We design the first minimax optimal algorithm for minimizing regret in the anytime setting. We consider the case of two experts, and prove that the optimal regret is $γ\sqrt{t} / 2$ at all time steps $t$, where $γ$ is a natural constant that arose 35 years ago in studying fundamental properties of Brownian motion. The algorithm is designed by considering a continuous analogue of the regret problem, which is solved using ideas from stochastic calculus.
Submitted 26 August, 2021; v1 submitted 20 February, 2020;
originally announced February 2020.
-
Simple and optimal high-probability bounds for strongly-convex stochastic gradient descent
Authors:
Nicholas J. A. Harvey,
Christopher Liaw,
Sikander Randhawa
Abstract:
We consider stochastic gradient descent algorithms for minimizing a non-smooth, strongly-convex function. Several forms of this algorithm, including suffix averaging, are known to achieve the optimal $O(1/T)$ convergence rate in expectation. We consider a simple, non-uniform averaging strategy of Lacoste-Julien et al. (2011) and prove that it achieves the optimal $O(1/T)$ convergence rate with high probability. Our proof uses a recently developed generalization of Freedman's inequality. Finally, we compare several of these algorithms experimentally and show that this non-uniform averaging strategy outperforms many standard techniques, and with smaller variance.
△ Less
Submitted 2 September, 2019;
originally announced September 2019.
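The non-uniform averaging strategy referred to in the abstract above weights later iterates more heavily than earlier ones. A minimal one-dimensional sketch follows, assuming weights proportional to the iteration index $t$ and step size $2/(μ(t+1))$; these specifics, the test function $f(x) = |x| + x^2/2$, and the Gaussian noise model are assumptions of this sketch and are not taken from the abstract.

```python
import random
from typing import Callable


def index_weighted_sgd(
    subgrad: Callable[[float], float],
    x0: float,
    mu: float,
    steps: int,
    noise_std: float = 1.0,
    seed: int = 0,
) -> float:
    """SGD on a mu-strongly-convex 1-D function, returning the average of the
    iterates x_1..x_T with weights proportional to t (assumed scheme)."""
    rng = random.Random(seed)
    x = x0
    avg = 0.0
    for t in range(1, steps + 1):
        # Fold x_t into the running average; this keeps
        # avg == sum_{s<=t} s * x_s / (t * (t + 1) / 2).
        avg += (2.0 / (t + 1)) * (x - avg)
        # Noisy subgradient step with step size 2 / (mu * (t + 1)).
        g = subgrad(x) + rng.gauss(0.0, noise_std)
        x -= (2.0 / (mu * (t + 1))) * g
    return avg


def sign(x: float) -> int:
    return (x > 0) - (x < 0)


# f(x) = |x| + x^2 / 2 is non-smooth and 1-strongly convex, minimized at x = 0.
x_bar = index_weighted_sgd(lambda x: x + sign(x), x0=5.0, mu=1.0, steps=20000)
```

The running-average update avoids storing all iterates, which is one practical reason to prefer this scheme over averaging a stored suffix.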
-
The Vickrey Auction with a Single Duplicate Bidder Approximates the Optimal Revenue
Authors:
Hu Fu,
Christopher Liaw,
Sikander Randhawa
Abstract:
Bulow and Klemperer's well-known result states that, in a single-item auction where the $n$ bidders' values are independently and identically drawn from a regular distribution, the Vickrey auction with one additional bidder (a duplicate) extracts at least as much revenue as the optimal auction without the duplicate. Hartline and Roughgarden, in their influential 2009 paper, removed the requirement that the distributions be identical, at the cost of allowing the Vickrey auction to recruit $n$ duplicates, one from each distribution, and relaxing its revenue advantage to a $2$-approximation.
In this work we restore Bulow and Klemperer's number of duplicates in Hartline and Roughgarden's more general setting with a worse approximation ratio. We show that recruiting a duplicate from one of the distributions suffices for the Vickrey auction to $10$-approximate the optimal revenue. We also show that in a $k$-items unit demand auction, recruiting $k$ duplicates suffices for the VCG auction to $O(1)$-approximate the optimal revenue.
As another result, we tighten the analysis for Hartline and Roughgarden's Vickrey auction with $n$ duplicates for the case with two bidders in the auction. We show that in this case the Vickrey auction with two duplicates obtains at least $3/4$ of the optimal revenue. This is tight by meeting a lower bound by Hartline and Roughgarden. En route, we obtain a transparent analysis of their $2$-approximation for $n$ bidders, via a natural connection to Ronen's lookahead auction.
Submitted 9 May, 2019;
originally announced May 2019.
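The Bulow and Klemperer statement quoted in the abstract above can be checked numerically in its simplest instance. The sketch below assumes a single bidder with value uniform on $[0,1]$ (a regular distribution), for which the optimal auction is a posted price of $1/2$ with expected revenue $1/4$, and compares it with a Vickrey auction that adds one i.i.d. duplicate, whose expected revenue is $E[\min(v_1, v_2)] = 1/3$. The distribution, sample size, and function names are illustrative choices, not taken from the paper.

```python
import random
from typing import List, Tuple


def vickrey_revenue(bids: List[float]) -> float:
    """Second-price revenue under truthful bidding: the second-highest bid."""
    if len(bids) < 2:
        return 0.0
    return sorted(bids, reverse=True)[1]


def estimate_revenues(samples: int = 20000, seed: int = 0) -> Tuple[float, float]:
    """Monte Carlo estimates of (Vickrey with one duplicate, optimal posted price)."""
    rng = random.Random(seed)
    reserve = 0.5  # optimal posted price for a single U[0,1] bidder
    vickrey_total = 0.0
    posted_total = 0.0
    for _ in range(samples):
        value = rng.random()
        duplicate = rng.random()  # i.i.d. copy recruited into the auction
        vickrey_total += vickrey_revenue([value, duplicate])
        posted_total += reserve if value >= reserve else 0.0
    return vickrey_total / samples, posted_total / samples


vickrey_est, posted_est = estimate_revenues()
```

With these assumptions the two estimates should land near $1/3$ and $1/4$ respectively, consistent with the duplicate bidder being worth more than the optimal reserve.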
-
Tight Analyses for Non-Smooth Stochastic Gradient Descent
Authors:
Nicholas J. A. Harvey,
Christopher Liaw,
Yaniv Plan,
Sikander Randhawa
Abstract:
Consider the problem of minimizing functions that are Lipschitz and strongly convex, but not necessarily differentiable. We prove that after $T$ steps of stochastic gradient descent, the error of the final iterate is $O(\log(T)/T)$ with high probability. We also construct a function from this class for which the error of the final iterate of deterministic gradient descent is $\Omega(\log(T)/T)$. This shows that the upper bound is tight and that, in this setting, the last iterate of stochastic gradient descent has the same general error rate (with high probability) as deterministic gradient descent. This resolves both open questions posed by Shamir (2012).
An intermediate step of our analysis proves that the suffix averaging method achieves error $O(1/T)$ with high probability, which is optimal (for any first-order optimization method). This improves results of Rakhlin (2012) and Hazan and Kale (2014), both of which achieved error $O(1/T)$, but only in expectation, and achieved a high probability error bound of $O(\log \log(T)/T)$, which is suboptimal.
We prove analogous results for functions that are Lipschitz and convex, but not necessarily strongly convex or differentiable. After $T$ steps of stochastic gradient descent, the error of the final iterate is $O(\log(T)/\sqrt{T})$ with high probability, and there exists a function for which the error of the final iterate of deterministic gradient descent is $\Omega(\log(T)/\sqrt{T})$.
Submitted 12 December, 2018;
originally announced December 2018.
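The two quantities compared in the abstract above, the error of the final SGD iterate and the error of the suffix average, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the test function $f(x) = |x| + x^2/2$ on $[-1, 1]$ (which is Lipschitz, 1-strongly convex and non-differentiable at its minimizer $0$), the step size $1/t$, Gaussian subgradient noise, and averaging over the last half of the iterates.

```python
import random


def sgd_last_vs_suffix(T, seed=0, noise_std=1.0):
    """Run projected SGD on f(x) = |x| + x^2/2 over [-1, 1].

    Returns (error of final iterate, error of suffix average), where
    error means f(x) - f(0) = f(x). All modelling choices are
    illustrative assumptions, not the construction used in the paper.
    """
    rng = random.Random(seed)

    def f(x):
        return abs(x) + 0.5 * x * x

    x = 1.0
    iterates = []
    for t in range(1, T + 1):
        sign = (x > 0) - (x < 0)               # a subgradient of |x|
        g = sign + x + rng.gauss(0.0, noise_std)  # noisy subgradient of f
        x = x - g / t                          # step size 1/t (strong convexity 1)
        x = max(-1.0, min(1.0, x))             # project back onto [-1, 1]
        iterates.append(x)

    suffix = iterates[T // 2:]                 # last half of the iterates
    x_suffix = sum(suffix) / len(suffix)
    return f(iterates[-1]), f(x_suffix)
```

Calling `sgd_last_vs_suffix(20000)` returns both errors for a single run. A single run on one benign function cannot exhibit the worst-case $\log(T)$ gap that the abstract proves; it only shows how the two estimators are computed.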
-
Greedy and Local Ratio Algorithms in the MapReduce Model
Authors:
Nicholas J. A. Harvey,
Christopher Liaw,
Paul Liu
Abstract:
MapReduce has become the de facto standard model for designing distributed algorithms to process big data on a cluster. There has been considerable research on designing efficient MapReduce algorithms for clustering, graph optimization, and submodular optimization problems. We develop new techniques for designing greedy and local ratio algorithms in this setting. Our randomized local ratio technique gives $2$-approximations for weighted vertex cover and weighted matching, and an $f$-approximation for weighted set cover, all in a constant number of MapReduce rounds. Our randomized greedy technique gives algorithms for maximal independent set, maximal clique, and a $(1+\varepsilon)\ln \Delta$-approximation for weighted set cover. We also give greedy algorithms for vertex colouring with $(1+o(1))\Delta$ colours and edge colouring with $(1+o(1))\Delta$ colours.
Submitted 17 June, 2018;
originally announced June 2018.
-
Near-optimal Sample Complexity Bounds for Robust Learning of Gaussians Mixtures via Compression Schemes
Authors:
Hassan Ashtiani,
Shai Ben-David,
Nick Harvey,
Christopher Liaw,
Abbas Mehrabian,
Yaniv Plan
Abstract:
We prove that $\tilde{\Theta}(k d^2 / \varepsilon^2)$ samples are necessary and sufficient for learning a mixture of $k$ Gaussians in $\mathbb{R}^d$, up to error $\varepsilon$ in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that $\tilde{O}(k d / \varepsilon^2)$ samples suffice, matching a known lower bound. Moreover, these results hold in the agnostic-learning/robust-estimation setting as well, where the target distribution is only approximately a mixture of Gaussians.
The upper bound is shown using a novel technique for distribution learning based on a notion of "compression." Any class of distributions that allows such a compression scheme can also be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. The core of our main result is showing that the class of Gaussians in $\mathbb{R}^d$ admits a small-sized compression scheme.
Submitted 21 July, 2020; v1 submitted 14 October, 2017;
originally announced October 2017.
-
The Value of Information Concealment
Authors:
Hu Fu,
Chris Liaw,
Pinyan Lu,
Zhihao Gavin Tang
Abstract:
We consider a revenue optimizing seller selling a single item to a buyer, on whose private value the seller has a noisy signal. We show that, when the signal is kept private, arbitrarily more revenue could potentially be extracted than if the signal is leaked or revealed. We then show that, if the seller is not allowed to make payments to the buyer, the gap between the two is bounded by a multiplicative factor of 3, if the value distribution conditioning on each signal is regular. We give examples showing that both conditions are necessary for a constant bound to hold.
We connect this scenario to multi-bidder single-item auctions where bidders' values are correlated. Similarly to the setting above, we show that the revenue of a Bayesian incentive compatible, ex post individually rational auction can be arbitrarily larger than that of a dominant strategy incentive compatible auction, whereas the two are no more than a factor of 5 apart if the auctioneer never pays the bidders and if each bidder's value conditioning on the others' is drawn according to a regular distribution. The upper bounds in both settings degrade gracefully when the distribution is a mixture of a small number of regular distributions.
Submitted 18 July, 2017;
originally announced July 2017.
-
Tight Load Balancing via Randomized Local Search
Authors:
Petra Berenbrink,
Peter Kling,
Christopher Liaw,
Abbas Mehrabian
Abstract:
We consider the following balls-into-bins process with $n$ bins and $m$ balls: each ball is equipped with a mutually independent exponential clock of rate 1. Whenever a ball's clock rings, the ball samples a random bin and moves there if the number of balls in the sampled bin is smaller than in its current bin. This simple process models a typical load balancing problem where users (balls) seek a selfish improvement of their assignment to resources (bins). From a game-theoretic perspective, this is a randomized approach to the well-known Koutsoupias-Papadimitriou model, while it is known as randomized local search (RLS) in the load balancing literature. Up to now, the best bound on the expected time to reach perfect balance was $O\left({(\ln n)}^2+\ln(n)\cdot n^2/m\right)$ due to Ganesh, Lilienthal, Manjunath, Proutiere, and Simatos (Load balancing via random local search in closed and open systems, Queueing Systems, 2012). We improve this to an asymptotically tight $O\left(\ln(n)+n^2/m\right)$. Our analysis is based on the crucial observation that performing "destructive moves" (reversals of RLS moves) cannot decrease the balancing time. This allows us to simplify problem instances and to ignore "inconvenient moves" in the analysis.
Submitted 29 June, 2017;
originally announced June 2017.
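The randomized local search process described in the abstract above is simple enough to simulate directly. The sketch below is illustrative only and is not the authors' code. It assumes that all balls start in bin 0, that the sampled bin is chosen uniformly at random, and that "perfect balance" means the maximum and minimum loads differ by at most one. It also uses the standard fact that $m$ independent rate-1 exponential clocks are equivalent to a single rate-$m$ clock whose ring is assigned to a uniformly random ball.

```python
import random


def simulate_rls(n_bins, n_balls, seed=0):
    """Simulate randomized local search until the loads are balanced.

    Returns (elapsed continuous time, final list of bin loads).
    The starting configuration and the balance criterion are
    assumptions made for this sketch.
    """
    rng = random.Random(seed)
    position = [0] * n_balls      # bin currently holding each ball
    load = [0] * n_bins
    load[0] = n_balls             # assumed start: every ball in bin 0
    elapsed = 0.0

    while max(load) - min(load) > 1:
        # The next ring among n_balls rate-1 clocks arrives after an
        # Exp(n_balls) waiting time and belongs to a uniform random ball.
        elapsed += rng.expovariate(n_balls)
        ball = rng.randrange(n_balls)
        target = rng.randrange(n_bins)
        current = position[ball]
        # Move only if the sampled bin holds strictly fewer balls.
        if load[target] < load[current]:
            load[current] -= 1
            load[target] += 1
            position[ball] = target

    return elapsed, load
```

For example, `simulate_rls(10, 100, seed=1)` returns the simulated balancing time for one run together with the final loads. Averaging the time over many seeds gives an empirical estimate of the expected balancing time that can be compared against the growth rate stated in the abstract.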
-
Approximation Schemes for Covering and Packing in the Streaming Model
Authors:
Christopher Liaw,
Paul Liu,
Robert Reiss
Abstract:
The shifting strategy, introduced by Hochbaum and Maass, and independently by Baker, is a unified framework for devising polynomial approximation schemes for NP-hard problems. This strategy has been used to great success within the computational geometry community in a plethora of different applications, most notably covering, packing, and clustering problems. In this paper, we revisit the shifting strategy in the context of the streaming model and develop a streaming-friendly shifting strategy. When combined with the shifting coreset method introduced by Fonseca et al., we obtain streaming algorithms for various graph properties of unit disc graphs. As a further application, we present novel approximation algorithms and lower bounds for the unit disc cover (UDC) problem in the streaming model, for which currently no algorithms are known.
Submitted 28 June, 2017;
originally announced June 2017.
-
Nearly-tight VC-dimension and pseudodimension bounds for piecewise linear neural networks
Authors:
Peter L. Bartlett,
Nick Harvey,
Chris Liaw,
Abbas Mehrabian
Abstract:
We prove new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting $W$ be the number of weights and $L$ be the number of layers, we prove that the VC-dimension is $O(W L \log(W))$, and provide examples with VC-dimension $\Omega( W L \log(W/L) )$. This improves both the previously known upper bounds and lower bounds. In terms of the number $U$ of non-linear units, we prove a tight bound $\Theta(W U)$ on the VC-dimension. All of these bounds generalize to arbitrary piecewise linear activation functions, and also hold for the pseudodimensions of these function classes.
Combined with previous results, this gives an intriguing range of dependencies of the VC-dimension on depth for networks with different non-linearities: there is no dependence for piecewise-constant, linear dependence for piecewise-linear, and no more than quadratic dependence for general piecewise-polynomial.
Submitted 15 October, 2017; v1 submitted 8 March, 2017;
originally announced March 2017.
-
A simple tool for bounding the deviation of random matrices on geometric sets
Authors:
Christopher Liaw,
Abbas Mehrabian,
Yaniv Plan,
Roman Vershynin
Abstract:
Let $A$ be an isotropic, sub-gaussian $m \times n$ matrix. We prove that the process $Z_x := \|Ax\|_2 - \sqrt m \|x\|_2$ has sub-gaussian increments. Using this, we show that for any bounded set $T \subseteq \mathbb{R}^n$, the deviation of $\|Ax\|_2$ around its mean is uniformly bounded by the Gaussian complexity of $T$. We also prove a local version of this theorem, which allows for unbounded sets. These theorems have various applications, some of which are reviewed in this paper. In particular, we give a new result regarding model selection in the constrained linear model.
Submitted 7 June, 2016; v1 submitted 2 March, 2016;
originally announced March 2016.