Search | arXiv e-print repository

Load Balancing with Network Latencies via Distributed Gradient Descent

Authors: Santiago R. Balseiro, Vahab S. Mirrokni, Bartek Wydrowski

Abstract: Motivated by the growing demand for serving large language model inference requests, we study distributed load balancing for global serving systems with network latencies. We consider a fluid model in which continuous flows of requests arrive at different frontends and need to be routed to distant backends for processing whose processing rates are workload dependent. Network latencies can lead to… ▽ More Motivated by the growing demand for serving large language model inference requests, we study distributed load balancing for global serving systems with network latencies. We consider a fluid model in which continuous flows of requests arrive at different frontends and need to be routed to distant backends for processing whose processing rates are workload dependent. Network latencies can lead to long travel times for requests and delayed feedback from backends. The objective is to minimize the average latency of requests, composed of the network latency and the serving latency at the backends. We introduce Distributed Gradient Descent Load Balancing (DGD-LB), a probabilistic routing algorithm in which each frontend adjusts the routing probabilities dynamically using gradient descent. Our algorithm is distributed: there is no coordination between frontends, except by observing the delayed impact other frontends have on shared backends. The algorithm uses an approximate gradient that measures the marginal impact of an additional request evaluated at a delayed system state. Equilibrium points of our algorithm minimize the centralized optimal average latencies, and we provide a novel local stability analysis showing that our algorithm converges to an optimal solution when started sufficiently close to that point. Moreover, we present sufficient conditions on the step-size of gradient descent that guarantee convergence in the presence of network latencies. Numerical experiments show that our algorithm is globally stable and optimal, confirm our stability conditions are nearly tight, and demonstrate that DGD-LB can lead to substantial gains relative to other load balancers studied in the literature when network latencies are large. △ Less

Submitted 14 April, 2025; originally announced April 2025.

arXiv:2411.17103 [pdf, other]

Optimal and Stable Distributed Bipartite Load Balancing

Authors: Wenxin Zhang, Santiago R. Balseiro, Robert Kleinberg, Vahab Mirrokni, Balasubramanian Sivan, Bartek Wydrowski

Abstract: We study distributed load balancing in bipartite queueing systems. Specifically, a set of frontends route jobs to a set of heterogeneous backends with workload-dependent service rates, with an arbitrary bipartite graph representing the connectivity between the frontends and backends. Each frontend operates independently without any communication with the other frontends, and the goal is to minimiz… ▽ More We study distributed load balancing in bipartite queueing systems. Specifically, a set of frontends route jobs to a set of heterogeneous backends with workload-dependent service rates, with an arbitrary bipartite graph representing the connectivity between the frontends and backends. Each frontend operates independently without any communication with the other frontends, and the goal is to minimize the expectation of the sum of the latencies of all jobs. Routing based on expected latency can lead to arbitrarily poor performance compared to the centrally coordinated optimal routing. To address this, we propose a natural alternative approach that routes jobs based on marginal service rates, which does not need to know the arrival rates. Despite the distributed nature of this algorithm, it achieves effective coordination among the frontends. In a model with independent Poisson arrivals of discrete jobs at each frontend, we show that the behavior of our routing policy converges (almost surely) to the behavior of a fluid model, in the limit as job sizes tend to zero and Poisson arrival rates are scaled at each frontend so that the expected total volume of jobs arriving per unit time remains fixed. Then, in the fluid model, where job arrivals are represented by infinitely divisible continuous flows and service times are deterministic, we demonstrate that the system converges globally and strongly asymptotically to the centrally coordinated optimal routing. Moreover, we prove the following guarantee on the convergence rate: if initial workloads are $δ$-suboptimal, it takes ${O}( δ+ \log{1/ε})$ time to obtain an $ε$-suboptimal solution. △ Less

Submitted 25 November, 2024; originally announced November 2024.

arXiv:2408.07685 [pdf, ps, other]

Auto-bidding and Auctions in Online Advertising: A Survey

Authors: Gagan Aggarwal, Ashwinkumar Badanidiyuru, Santiago R. Balseiro, Kshipra Bhawalkar, Yuan Deng, Zhe Feng, Gagan Goel, Christopher Liaw, Haihao Lu, Mohammad Mahdian, Jieming Mao, Aranyak Mehta, Vahab Mirrokni, Renato Paes Leme, Andres Perlroth, Georgios Piliouras, Jon Schneider, Ariel Schvartzman, Balasubramanian Sivan, Kelly Spendlove, Yifeng Teng, Di Wang, Hanrui Zhang, Mingfei Zhao, Wennan Zhu , et al. (1 additional authors not shown)

Abstract: In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen as markets embrace this autobidding and cover a range of topics in this area, including bidding algorithms, equilibrium analysis and efficiency of common auction formats, and optimal auction d… ▽ More In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen as markets embrace this autobidding and cover a range of topics in this area, including bidding algorithms, equilibrium analysis and efficiency of common auction formats, and optimal auction design. △ Less

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2406.18685 [pdf, other]

Battery Operations in Electricity Markets: Strategic Behavior and Distortions

Authors: Jerry Anunrojwong, Santiago R. Balseiro, Omar Besbes, Bolun Xu

Abstract: Electric power systems are undergoing a major transformation as they integrate intermittent renewable energy sources, and batteries to smooth out variations in renewable energy production. As privately-owned batteries grow from their role as marginal "price-takers" to significant players in the market, a natural question arises: How do batteries operate in electricity markets, and how does the str… ▽ More Electric power systems are undergoing a major transformation as they integrate intermittent renewable energy sources, and batteries to smooth out variations in renewable energy production. As privately-owned batteries grow from their role as marginal "price-takers" to significant players in the market, a natural question arises: How do batteries operate in electricity markets, and how does the strategic behavior of decentralized batteries distort decisions compared to centralized batteries? We propose an analytically tractable model that captures salient features of the highly complex electricity market. We derive in closed form the resulting battery behavior and generation cost in three operating regimes: (i) no battery, (ii) centralized battery, and (ii) decentralized profit-maximizing battery. We establish that a decentralized battery distorts its discharge decisions in three ways. First, there is quantity withholding, i.e., discharging less than centrally optimal. Second, there is a shift in participation from day-ahead to real-time, i.e., postponing some of its discharge from day-ahead to real-time. Third, there is reduction in real-time responsiveness, or discharging less in response to smoothing real-time demand than centrally optimal. We also quantify the impact of the battery market power on total system cost via the Price of Anarchy metric, and prove that the it is always between $9/8$ and $4/3$. That is, incentive misalignment always exists, but it is bounded even in the worst case. We calibrate our model to real data from Los Angeles and Houston. Lastly, we show that competition is very effective at reducing distortions, but many market power mitigation mechanisms backfire, and lead to higher total cost. △ Less

Submitted 31 March, 2025; v1 submitted 26 June, 2024; originally announced June 2024.

arXiv:2403.12260 [pdf, other]

The Best of Many Robustness Criteria in Decision Making: Formulation and Application to Robust Pricing

Authors: Jerry Anunrojwong, Santiago R. Balseiro, Omar Besbes

Abstract: In robust decision-making under non-Bayesian uncertainty, different robust optimization criteria, such as maximin performance, minimax regret, and maximin ratio, have been proposed. In many problems, all three criteria are well-motivated and well-grounded from a decision-theoretic perspective, yet different criteria give different prescriptions. This paper initiates a systematic study of overfitti… ▽ More In robust decision-making under non-Bayesian uncertainty, different robust optimization criteria, such as maximin performance, minimax regret, and maximin ratio, have been proposed. In many problems, all three criteria are well-motivated and well-grounded from a decision-theoretic perspective, yet different criteria give different prescriptions. This paper initiates a systematic study of overfitting to robustness criteria. How good is a prescription derived from one criterion when evaluated against another criterion? Does there exist a prescription that performs well against all criteria of interest? We formalize and study these questions through the prototypical problem of robust pricing under various information structures, including support, moments, and percentiles of the distribution of values. We provide a unified analysis of three focal robust criteria across various information structures and evaluate the relative performance of mechanisms optimized for each criterion against the others. We find that mechanisms optimized for one criterion often perform poorly against other criteria, highlighting the risk of overfitting to a particular robustness criterion. Remarkably, we show it is possible to design mechanisms that achieve good performance across all three criteria simultaneously, suggesting that decision-makers need not compromise among criteria. △ Less

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2308.13822 [pdf, ps, other]

Dynamic Pricing for Reusable Resources: The Power of Two Prices

Authors: Santiago R. Balseiro, Will Ma, Wenxin Zhang

Abstract: Motivated by real-world applications such as rental and cloud computing services, we investigate pricing for reusable resources. We consider a system where a single resource with a fixed number of identical copies serves customers with heterogeneous willingness-to-pay (WTP), and the usage duration distribution is general. Optimal dynamic policies are computationally intractable when usage duration… ▽ More Motivated by real-world applications such as rental and cloud computing services, we investigate pricing for reusable resources. We consider a system where a single resource with a fixed number of identical copies serves customers with heterogeneous willingness-to-pay (WTP), and the usage duration distribution is general. Optimal dynamic policies are computationally intractable when usage durations are not memoryless, so existing literature has focused on static pricing, which incurs a steady-state performance loss of ${O}(\sqrt{c})$ compared to optimality when supply and demand scale with $c$. We propose a class of dynamic "stock-dependent" policies that 1) are computationally tractable and 2) can attain a steady-state performance loss of $o(\sqrt{c})$. We give parametric bounds based on the local shape of the reward function at the optimal fluid admission probability and show that the performance loss of stock-dependent policies can be as low as ${O}((\log{c})^2)$. We characterize the tight performance loss for stock-dependent policies and show that they can in fact be achieved by a simple two-price policy that sets a higher price when the stock is below some threshold and a lower price otherwise. We extend our results to settings with multiple resources and multiple customer classes. Finally, we demonstrate this "minimally dynamic" class of two-price policies performs well numerically, even in non-asymptotic settings, suggesting that a little dynamicity can go a long way. △ Less

Submitted 8 November, 2024; v1 submitted 26 August, 2023; originally announced August 2023.

arXiv:2305.09065 [pdf, other]

Robust Auction Design with Support Information

Authors: Jerry Anunrojwong, Santiago R. Balseiro, Omar Besbes

Abstract: A seller wants to sell an item to $n$ buyers. Buyer valuations are drawn i.i.d. from a distribution unknown to the seller; the seller only knows that the support is included in $[a, b]$. To be robust, the seller chooses a DSIC mechanism that optimizes the worst-case performance relative to the ideal expected revenue the seller could have collected with knowledge of buyers' valuations. Our analysis… ▽ More A seller wants to sell an item to $n$ buyers. Buyer valuations are drawn i.i.d. from a distribution unknown to the seller; the seller only knows that the support is included in $[a, b]$. To be robust, the seller chooses a DSIC mechanism that optimizes the worst-case performance relative to the ideal expected revenue the seller could have collected with knowledge of buyers' valuations. Our analysis unifies the regret and the ratio objectives. For these objectives, we derive an optimal mechanism and the corresponding performance in quasi-closed form, as a function of the support information $[a, b]$ and the number of buyers $n$. Our analysis reveals three regimes of support information and a new class of robust mechanisms. i.) When $a/b$ is below a threshold, the optimal mechanism is a second-price auction (SPA) with random reserve, a focal class in earlier literature. ii.) When $a/b$ is above another threshold, SPAs are strictly suboptimal, and an optimal mechanism belongs to a class of mechanisms we introduce, which we call pooling auctions (POOL); whenever the highest value is above a threshold, the mechanism still allocates to the highest bidder, but otherwise the mechanism allocates to a uniformly random buyer, i.e., pools low types. iii.) When $a/b$ is between two thresholds, a randomization between SPA and POOL is optimal. We also characterize optimal mechanisms within nested central subclasses of mechanisms: standard mechanisms that only allocate to the highest bidder, SPA with random reserve, and SPA with no reserve. We show strict separations in terms of performance across classes, implying that deviating from standard mechanisms is necessary for robustness. △ Less

Submitted 1 January, 2025; v1 submitted 15 May, 2023; originally announced May 2023.

Comments: An abstract of this work appeared in Proceedings of the 24th ACM Conference on Economics and Computation (EC'23)

arXiv:2302.08530 [pdf, other]

A Field Guide for Pacing Budget and ROS Constraints

Authors: Santiago R. Balseiro, Kshipra Bhawalkar, Zhe Feng, Haihao Lu, Vahab Mirrokni, Balasubramanian Sivan, Di Wang

Abstract: Budget pacing is a popular service that has been offered by major internet advertising platforms since their inception. Budget pacing systems seek to optimize advertiser returns subject to budget constraints by smoothly spending advertiser budgets. In the past few years, autobidding products that provide real-time bidding as a service to advertisers have seen a prominent rise in adoption. A popula… ▽ More Budget pacing is a popular service that has been offered by major internet advertising platforms since their inception. Budget pacing systems seek to optimize advertiser returns subject to budget constraints by smoothly spending advertiser budgets. In the past few years, autobidding products that provide real-time bidding as a service to advertisers have seen a prominent rise in adoption. A popular autobidding strategy is value maximization subject to return-on-spend (ROS) constraints. For historical/business reasons, the systems that govern these two services, namely budget pacing and ROS pacing, are not always a unified and coordinated entity that optimizes a global objective subject to both constraints. The purpose of this work is to theoretically and empirically compare algorithms with different degrees of coordination between these two pacing systems. In particular, we compare (a) a fully-decoupled sequential algorithm that first constructs the advertiser's ROS-pacing bid and then lowers that bid for budget pacing; (b) a minimally-coupled min-pacing algorithm that runs these two services independently, obtains the bid multipliers from both of them and applies the minimum of the two multipliers as the effective multiplier; and (c) a fully-coupled dual-based algorithm that optimally combines the dual variables from both the systems. Our main contribution is to theoretically analyze the min-pacing algorithm and show that it attains similar guarantees to the fully-coupled canonical dual-based algorithm. On the other hand, we show that the sequential algorithm, even though appealing by virtue of being fully decoupled, could badly violate the constraints. We validate our theoretical findings empirically by showing that the min-pacing algorithm performs almost as well as the canonical dual-based algorithm on a semi-synthetic dataset based on a large online advertising platform's data. △ Less

Submitted 15 December, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

arXiv:2205.12447 [pdf, ps, other]

Uniformly Bounded Regret in Dynamic Fair Allocation

Authors: Santiago R. Balseiro, Shangzhou Xia

Abstract: We study a dynamic allocation problem in which $T$ sequentially arriving divisible resources are to be allocated to a number of agents with linear utilities. The marginal utilities of each resource to the agents are drawn stochastically from a known joint distribution, independently and identically across time, and the central planner makes immediate and irrevocable allocation decisions. Most work… ▽ More We study a dynamic allocation problem in which $T$ sequentially arriving divisible resources are to be allocated to a number of agents with linear utilities. The marginal utilities of each resource to the agents are drawn stochastically from a known joint distribution, independently and identically across time, and the central planner makes immediate and irrevocable allocation decisions. Most works on dynamic resource allocation aim to maximize the utilitarian welfare, i.e., the efficiency of the allocation, which may result in unfair concentration of resources on certain high-utility agents while leaving others' demands under-fulfilled. In this paper, aiming at balancing efficiency and fairness, we instead consider a broad collection of welfare metrics, the Hölder means, which includes the Nash social welfare and the egalitarian welfare. To this end, we first study a fluid-based policy derived from a deterministic surrogate to the underlying problem and show that for all smooth Hölder mean welfare metrics it attains an $O(1)$ regret over the time horizon length $T$ against the hindsight optimum, i.e., the optimal welfare if all utilities were known in advance of deciding on allocations. However, when evaluated under the non-smooth egalitarian welfare, the fluid-based policy attains a regret of order $Θ(\sqrt{T})$. We then propose a new policy built thereupon, called Backward Infrequent Re-solving with Thresholding ($\mathsf{BIRT}$), which consists of re-solving the deterministic surrogate problem at most $O(\log\log T)$ times. We prove the $\mathsf{BIRT}$ policy attains an $O(1)$ regret against the hindsight optimal egalitarian welfare, independently of the time horizon length $T$. We conclude by presenting numerical experiments to corroborate our theoretical claims and to illustrate the significant performance improvement against several benchmark policies. △ Less

Submitted 23 June, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

arXiv:2204.10478 [pdf, other]

On the Robustness of Second-Price Auctions in Prior-Independent Mechanism Design

Authors: Jerry Anunrojwong, Santiago R. Balseiro, Omar Besbes

Abstract: Classical Bayesian mechanism design relies on the common prior assumption, but such prior is often not available in practice. We study the design of prior-independent mechanisms that relax this assumption: the seller is selling an indivisible item to $n$ buyers such that the buyers' valuations are drawn from a joint distribution that is unknown to both the buyers and the seller; buyers do not need… ▽ More Classical Bayesian mechanism design relies on the common prior assumption, but such prior is often not available in practice. We study the design of prior-independent mechanisms that relax this assumption: the seller is selling an indivisible item to $n$ buyers such that the buyers' valuations are drawn from a joint distribution that is unknown to both the buyers and the seller; buyers do not need to form beliefs about competitors, and the seller assumes the distribution is adversarially chosen from a specified class. We measure performance through the worst-case regret, or the difference between the expected revenue achievable with perfect knowledge of buyers' valuations and the actual mechanism revenue. We study a broad set of classes of valuation distributions that capture a wide spectrum of possible dependencies: independent and identically distributed (i.i.d.) distributions, mixtures of i.i.d. distributions, affiliated and exchangeable distributions, exchangeable distributions, and all joint distributions. We derive in quasi closed form the minimax values and the associated optimal mechanism. In particular, we show that the first three classes admit the same minimax regret value, which is decreasing with the number of competitors, while the last two have the same minimax regret equal to that of the single buyer case. Furthermore, we show that the minimax optimal mechanisms have a simple form across all settings: a second-price auction with random reserve prices, which shows its robustness in prior-independent mechanism design. En route to our results, we also develop a principled methodology to determine the form of the optimal mechanism and worst-case distribution via first-order conditions that should be of independent interest in other minimax problems. △ Less

Submitted 11 December, 2024; v1 submitted 21 April, 2022; originally announced April 2022.

Comments: An extended abstract of this work appeared in Proceedings of the 23rd ACM Conference on Economics and Computation (EC'22)

arXiv:2202.06152 [pdf, other]

Analysis of Dual-Based PID Controllers through Convolutional Mirror Descent

Authors: Santiago R. Balseiro, Haihao Lu, Vahab Mirrokni, Balasubramanian Sivan

Abstract: Dual-based proportional-integral-derivative (PID) controllers are often employed in practice to solve online allocation problems with global constraints, such as budget pacing in online advertising. However, controllers are used in a heuristic fashion and come with no provable guarantees on their performance. This paper provides the first regret bounds on the performance of dual-based PID controll… ▽ More Dual-based proportional-integral-derivative (PID) controllers are often employed in practice to solve online allocation problems with global constraints, such as budget pacing in online advertising. However, controllers are used in a heuristic fashion and come with no provable guarantees on their performance. This paper provides the first regret bounds on the performance of dual-based PID controllers for online allocation problems. We do so by first establishing a fundamental connection between dual-based PID controllers and a new first-order algorithm for online convex optimization called \emph{Convolutional Mirror Descent} (CMD), which updates iterates based on a weighted moving average of past gradients. CMD recovers, in a special case, online mirror descent with momentum and optimistic mirror descent. We establish sufficient conditions under which CMD attains low regret for general online convex optimization problems with adversarial inputs. We leverage this new result to give the first regret bound for dual-based PID controllers for online allocation problems. As a byproduct of our proofs, we provide the first regret bound for CMD for non-smooth convex optimization, which might be of independent interest. △ Less

Submitted 19 December, 2023; v1 submitted 12 February, 2022; originally announced February 2022.

Showing 1–11 of 11 results for author: Balseiro, S R