Showing 1–2 of 2 results for author: Wickeham, A
-
A General "Power-of-d" Dispatching Framework for Heterogeneous Systems
Authors:
Jazeem Abdul Jaleel,
Sherwin Doroudi,
Kristen Gardner,
Alexander Wickeham
Abstract:
Intelligent dispatching is crucial to obtaining low response times in large-scale systems. One common scalable dispatching paradigm is the ``power-of-$d$,'' in which the dispatcher queries $d$ servers at random and assigns the job to a server based only on the state of the queried servers. The bulk of power-of-$d$ policies studied in the literature assume that the system is homogeneous, meaning that all servers have the same speed; meanwhile real-world systems often exhibit server speed heterogeneity.
This paper introduces a general framework for describing and analyzing heterogeneity-aware power-of-$d$ policies. The key idea behind our framework is that dispatching policies can make use of server speed information at two decision points: when choosing which $d$ servers to query, and when assigning a job to one of those servers. Our framework explicitly separates the dispatching policy into a querying rule and an assignment rule; we consider general families of both rule types.
While the strongest assignment rules incorporate both detailed queue-length information and server speed information, these rules are typically difficult to analyze. We overcome this difficulty by focusing on heterogeneity-aware assignment rules that ignore queue-length information beyond idleness status. In this setting, we analyze mean response time and formulate novel optimization problems for the joint optimization of querying and assignment. We build upon our optimized policies to develop heuristic queue-length-aware dispatching policies. Our heuristic policies perform well in simulation, relative to policies that have appeared in the literature.
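The split into a querying rule and an assignment rule can be illustrated with a minimal sketch. This is not the authors' implementation: the `Server` class, the speed-weighted querying rule, and the "fastest idle server, else uniform" assignment rule are all hypothetical choices made here only to show how the two decision points can be kept separate. Sampling is done with replacement for simplicity.

```python
import random
from dataclasses import dataclass


@dataclass
class Server:
    speed: float        # service rate, assumed known to the dispatcher
    queue_len: int = 0  # number of jobs currently at the server


def query_by_speed(servers, d, rng):
    """Querying rule (hypothetical): draw d servers, with replacement,
    with probability proportional to speed."""
    weights = [s.speed for s in servers]
    return rng.choices(servers, weights=weights, k=d)


def assign_idle_fastest(queried, rng):
    """Assignment rule (hypothetical): use only idleness and speed.
    Pick the fastest idle queried server; if none is idle, pick a
    queried server uniformly at random."""
    idle = [s for s in queried if s.queue_len == 0]
    if idle:
        return max(idle, key=lambda s: s.speed)
    return rng.choice(queried)


def dispatch(servers, d, rng, query=query_by_speed, assign=assign_idle_fastest):
    """Power-of-d dispatch: query d servers, then assign the job to one of them."""
    chosen = assign(query(servers, d, rng), rng)
    chosen.queue_len += 1
    return chosen
```

Because `dispatch` takes the two rules as arguments, either one can be swapped (for example, uniform querying with a queue-length-aware assignment rule) without touching the other.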
Submitted 17 December, 2021; v1 submitted 10 December, 2021; originally announced December 2021.
-
Scalable Load Balancing in the Presence of Heterogeneous Servers
Authors:
Kristen Gardner,
Jazeem Abdul Jaleel,
Alexander Wickeham,
Sherwin Doroudi
Abstract:
Heterogeneity is becoming increasingly ubiquitous in modern large-scale computer systems. Developing good load balancing policies for systems whose resources have varying speeds is crucial in achieving low response times. Indeed, how best to dispatch jobs to servers is a classical and well-studied problem in the queueing literature. Yet the bulk of existing work on large-scale systems assumes homogeneous servers; unfortunately, policies that perform well in the homogeneous setting can cause unacceptably poor performance---or even instability---in heterogeneous systems.
We adapt the "power-of-d" versions of both the Join-the-Idle-Queue and Join-the-Shortest-Queue policies to design two corresponding families of heterogeneity-aware dispatching policies, each of which is parameterized by a pair of routing probabilities. Unlike their heterogeneity-unaware counterparts, our policies use server speed information both when choosing which servers to query and when probabilistically deciding where (among the queried servers) to dispatch jobs. Both of our policy families are analytically tractable: our mean response time and queue length distribution analyses are exact as the number of servers approaches infinity, under standard assumptions. Furthermore, our policy families achieve maximal stability and outperform well-known dispatching rules---including heterogeneity-aware policies such as Shortest-Expected-Delay---with respect to mean response time.
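The instability risk mentioned in the abstract can be seen in a small worked example. The sketch below does not model the policies from the paper; it assumes the simplest heterogeneity-unaware rule (each job goes to a server chosen uniformly at random), and the speeds and arrival rate are hypothetical numbers. Under that rule, every server receives the same arrival rate, so a slow server can be overloaded even when the system as a whole has spare capacity.

```python
def per_server_load(arrival_rate_per_server, speeds):
    """Utilization of each server when jobs are split uniformly at random:
    every server sees the same arrival rate, divided by its own speed."""
    return [arrival_rate_per_server / mu for mu in speeds]


def is_stable(loads):
    """Each server's queue is stable only if its utilization is below 1."""
    return all(rho < 1.0 for rho in loads)


speeds = [1.5, 0.5]  # hypothetical: one fast and one slow server
lam = 0.8            # hypothetical arrival rate per server

loads = per_server_load(lam, speeds)
# Total arrival rate (1.6) is below total capacity (2.0) ...
total_capacity_ok = lam * len(speeds) < sum(speeds)
# ... yet the slow server receives 0.8 jobs per unit time and can serve only 0.5.
uniform_split_stable = is_stable(loads)
```

Here `total_capacity_ok` is true while `uniform_split_stable` is false, which is the kind of gap that speed-aware querying and assignment are meant to close.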
Submitted 24 June, 2020; originally announced June 2020.