-
A Vectorised Packing Algorithm for Efficient Generation of Custom Traffic Matrices
Authors:
Christopher W. F. Parsonson,
Joshua L. Benjamin,
Georgios Zervas
Abstract:
We propose a new algorithm for generating custom network traffic matrices which achieves 13x, 38x, and 70x faster generation times than prior work on networks with 64, 256, and 1024 nodes respectively.
Submitted 20 February, 2023;
originally announced February 2023.
-
Partitioning Distributed Compute Jobs with Reinforcement Learning and Graph Neural Networks
Authors:
Christopher W. F. Parsonson,
Zacharaya Shabka,
Alessandro Ottino,
Georgios Zervas
Abstract:
From natural language processing to genome sequencing, large-scale machine learning models are bringing advances to a broad range of fields. Many of these models are too large to be trained on a single machine, and instead must be distributed across multiple devices. This has motivated the research of new compute and network systems capable of handling such tasks. In particular, recent work has focused on developing management schemes which decide how to allocate distributed resources such that some overall objective, such as minimising the job completion time (JCT), is optimised. However, such studies omit explicit consideration of how much a job should be distributed, usually assuming that maximum distribution is desirable. In this work, we show that maximum parallelisation is sub-optimal in relation to user-critical metrics such as throughput and blocking rate. To address this, we propose PAC-ML (partitioning for asynchronous computing with machine learning). PAC-ML leverages a graph neural network and reinforcement learning to learn how much to partition computation graphs such that the number of jobs which meet arbitrary user-defined JCT requirements is maximised. In experiments with five real deep learning computation graphs on a recently proposed optical architecture across four user-defined JCT requirement distributions, we demonstrate PAC-ML achieving up to 56.2% lower blocking rates in dynamic job arrival settings than the canonical maximum parallelisation strategy used by most prior works.
Submitted 31 January, 2023;
originally announced January 2023.
-
RAMP: A Flat Nanosecond Optical Network and MPI Operations for Distributed Deep Learning Systems
Authors:
Alessandro Ottino,
Joshua Benjamin,
Georgios Zervas
Abstract:
Distributed deep learning (DDL) systems strongly depend on network performance. Current electronic packet switched (EPS) network architectures and technologies suffer from variable diameter topologies, low-bisection bandwidth and over-subscription affecting completion time of communication and collective operations.
We introduce a near-exascale, full-bisection bandwidth, all-to-all, single-hop, all-optical network architecture with nanosecond reconfiguration called RAMP, which supports large-scale distributed and parallel computing systems (12.8 Tbps per node for up to 65,536 nodes).
For the first time, a custom RAMP-x MPI strategy and a network transcoder are proposed to run MPI collective operations across the optical circuit switched (OCS) network in a schedule-less and contention-less manner. RAMP achieves 7.6-171$\times$ speed-up in completion time across all MPI operations compared to realistic EPS and OCS counterparts. It can also deliver a 1.3-16$\times$ and 7.8-58$\times$ reduction in Megatron and DLRM training time respectively, while offering 42-53$\times$ and 3.3-12.4$\times$ improvement in energy consumption and cost respectively.
Submitted 24 February, 2023; v1 submitted 28 November, 2022;
originally announced November 2022.
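As a rough check on the scale quoted in the RAMP abstract above, the per-node rate can be multiplied by the maximum node count. This assumes aggregate node bandwidth is simply the product of the two quoted figures, which is an assumption of this note rather than a number stated by the authors:

```latex
12.8\ \text{Tbps} \times 65{,}536 = 838{,}860.8\ \text{Tbps} \approx 838.9\ \text{Pbps} \approx 0.84\ \text{Ebps}
```

Under that reading, the total sits just below one exabit per second, which is consistent with the "near-exascale" description.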
-
Network Aware Compute and Memory Allocation in Optically Composable Data Centres with Deep Reinforcement Learning and Graph Neural Networks
Authors:
Zacharaya Shabka,
Georgios Zervas
Abstract:
Resource-disaggregated data centre architectures promise a means of pooling resources remotely within data centres, allowing for both more flexibility and resource efficiency underlying the increasingly important infrastructure-as-a-service business. This can be accomplished by means of using an optically circuit switched backbone in the data centre network (DCN), providing the required bandwidth and latency guarantees to ensure reliable performance when applications are run across non-local resource pools. However, resource allocation in this scenario requires both server-level and network-level resources to be co-allocated to requests. The online nature and underlying combinatorial complexity of this problem, alongside the typical scale of DCN topologies, make exact solutions impossible and heuristic-based solutions sub-optimal or non-intuitive to design. We demonstrate that deep reinforcement learning, where the policy is modelled by a graph neural network, can be used to learn effective network-aware and topologically-scalable allocation policies end-to-end. Compared to state-of-the-art heuristics for network-aware resource allocation, the method achieves up to $20\%$ higher acceptance ratio, can achieve the same acceptance ratio as the best performing heuristic with $3\times$ fewer networking resources available, and can maintain all-around performance when directly applied (with no further training) to DCN topologies with $10^2\times$ more servers than the topologies seen during training.
Submitted 26 October, 2022;
originally announced November 2022.
-
One-shot, Offline and Production-Scalable PID Optimisation with Deep Reinforcement Learning
Authors:
Zacharaya Shabka,
Michael Enrico,
Nick Parsons,
Georgios Zervas
Abstract:
Proportional-integral-derivative (PID) control underlies more than $97\%$ of automated industrial processes. Controlling these processes effectively with respect to some specified set of performance goals requires finding an optimal set of PID parameters to moderate the PID loop. Tuning these parameters is a long and exhaustive process. A method (patent pending) based on deep reinforcement learning is presented that learns a relationship between generic system properties (e.g. resonance frequency), a multi-objective performance goal and optimal PID parameter values. Performance is demonstrated in the context of a real optical switching product of the foremost manufacturer of such devices globally. Switching is handled by piezoelectric actuators where switching time and optical loss are derived from the speed and stability of actuator-control processes respectively. The method achieves a $5\times$ improvement in the number of actuators that fall within the most challenging target switching speed, $\geq 20\%$ improvement in mean switching speed at the same optical loss and $\geq 75\%$ reduction in performance inconsistency when temperature varies between 5 and 73 degrees Celsius. Furthermore, once trained (which takes $\mathcal{O}(hours)$), the model generates actuator-unique PID parameters in a one-shot inference process that takes $\mathcal{O}(ms)$ in comparison to up to $\mathcal{O}(week)$ required for conventional tuning methods, therefore accomplishing these performance improvements whilst achieving up to a $10^6\times$ speed-up. After training, the method can be applied entirely offline, incurring effectively zero optimisation-overhead in production.
Submitted 25 October, 2022;
originally announced October 2022.
-
Traffic Generation for Benchmarking Data Centre Networks
Authors:
Christopher W. F. Parsonson,
Joshua L. Benjamin,
Georgios Zervas
Abstract:
Benchmarking is commonly used in research fields, such as computer architecture design and machine learning, as a powerful paradigm for rigorously assessing, comparing, and developing novel technologies. However, the data centre networking community lacks a standard open-access benchmark. This is curtailing the community's understanding of existing systems and hindering the ability with which novel technologies can be developed, compared, and tested.
We present TrafPy, an open-access framework for generating both realistic and custom data centre network traffic traces. TrafPy is compatible with any simulation, emulation, or experimentation environment, and can be used for standardised benchmarking and for investigating the properties and limitations of network systems such as schedulers, switches, routers, and resource managers. To demonstrate the efficacy of TrafPy, we use it to conduct a thorough investigation into the sensitivity of 4 canonical scheduling algorithms (shortest remaining processing time, fair share, first fit, and random) to varying traffic trace characteristics. We show how the fundamental scheduler performance insights revealed by these tests translate to 4 realistic data centre network types: University, Private Enterprise, Commercial Cloud, and Social Media Cloud. We then draw conclusions as to which types of scheduling policies are most suited to which types of network load conditions and traffic characteristics, leading to the possibility of application-informed decision making at the design stage and new dynamically adaptable scheduling policies. TrafPy is open-sourced via GitHub, and all data associated with this manuscript are available via RDR.
Submitted 25 August, 2022; v1 submitted 3 July, 2021;
originally announced July 2021.
-
Resource Allocation in Disaggregated Data Centre Systems with Reinforcement Learning
Authors:
Zacharaya Shabka,
Georgios Zervas
Abstract:
Resource-disaggregated data centres (RDDC) propose a resource-centric and high-utilisation architecture for data centres (DC), avoiding resource fragmentation and enabling arbitrarily sized resource pools to be allocated to tasks, rather than server-sized ones. RDDCs typically impose greater demand on the network, requiring more infrastructure and increasing cost and power, so new resource allocation algorithms that co-manage both server and network resources are essential to ensure that allocation is not bottlenecked by the network, and that requests can be served successfully with minimal networking resources. We apply reinforcement learning (RL) to this problem for the first time and show that an RL policy based on graph neural networks can learn resource allocation policies end-to-end that outperform previous hand-engineered heuristics by up to 22.0%, 42.6% and 22.6% for acceptance ratio, CPU and memory utilisation respectively, maintain performance when scaled up to RDDC topologies with $10^2\times$ more nodes than those seen during training, and can achieve comparable performance to the best baselines while using $5.3\times$ fewer network resources.
Submitted 11 November, 2021; v1 submitted 4 June, 2021;
originally announced June 2021.
-
On the Relationship Between Network Topology and Throughput in Mesh Optical Networks
Authors:
Daniel Semrau,
Shahzaib Durrani,
Georgios Zervas,
Robert I. Killey,
Polina Bayvel
Abstract:
The relationship between topology and network throughput of arbitrarily-connected mesh networks is studied. Taking into account nonlinear channel properties, it is shown that throughput decreases logarithmically with physical network size with minor dependence on network ellipticity.
Submitted 15 August, 2020;
originally announced August 2020.
-
SWIFT: Scalable Ultra-Wideband Sub-Nanosecond Wavelength Switching for Data Centre Networks
Authors:
Thomas Gerard,
Christopher Parsonson,
Zacharaya Shabka,
Polina Bayvel,
Domaniç Lavery,
Georgios Zervas
Abstract:
We propose a time-multiplexed DS-DBR/SOA-gated system to deliver low-power fast tuning across S-/C-/L-bands. Sub-ns switching is demonstrated, supporting 122$\times$50 GHz channels over 6.05 THz using AI techniques.
Submitted 11 March, 2020;
originally announced March 2020.
-
PULSE: Optical circuit switched Data Center architecture operating at nanosecond timescales
Authors:
Joshua L. Benjamin,
Thomas Gerard,
Domaniç Lavery,
Polina Bayvel,
Georgios Zervas
Abstract:
We introduce PULSE, a sub-microsecond optical circuit-switched data centre network architecture controlled by distributed hardware schedulers. PULSE is a flat architecture that uses parallel passive coupler-based broadcast and select networks. We employ a novel transceiver architecture, for dynamic wavelength-timeslot selection, to achieve a reconfiguration time down to O(100ps), establishing timeslots of O(10ns). A novel scheduling algorithm that has a clock period of 2.3ns performs multiple iterations to maximize throughput, wavelength usage and reduce latency, enhancing the overall performance. In order to scale, the single-hop PULSE architecture uses sub-networks that are disjoint by using multiple transceivers for each node in 64 node racks. At the reconfiguration circuit duration (epoch = 120 ns), the scheduling algorithm is shown to achieve up to 93% throughput and 100% wavelength usage of 64 wavelengths, incurring an average latency that ranges from 0.7-1.2 microseconds with best-case 0.4 microsecond median and 5 microsecond tail latency, limited by the timeslot (20 ns) and epoch size (120 ns). We show how the 4096-node PULSE architecture allows up to 260k optical channels to be re-used across sub-networks achieving a capacity of 25.6 Pbps with an energy consumption of 85 pJ/bit.
Submitted 25 May, 2020; v1 submitted 10 February, 2020;
originally announced February 2020.
-
An Economic Analysis of User-Privacy Options in Ad-Supported Services
Authors:
Joan Feigenbaum,
Michael Mitzenmacher,
Georgios Zervas
Abstract:
We analyze the value to e-commerce website operators of offering privacy options to users, e.g., of allowing users to opt out of ad targeting. In particular, we assume that site operators have some control over the cost that a privacy option imposes on users and ask when it is to their advantage to make such costs low. We consider both the case of a single site and the case of multiple sites that compete both for users who value privacy highly and for users who value it less. One of our main results in the case of a single site is that, under normally distributed utilities, if a privacy-sensitive user is worth at least $\sqrt{2} - 1$ times as much to advertisers as a privacy-insensitive user, the site operator should strive to make the cost of a privacy option as low as possible. In the case of multiple sites, we show how a Prisoner's-Dilemma situation can arise: In the equilibrium in which both sites are obliged to offer a privacy option at minimal cost, both sites obtain lower revenue than they would if they colluded and neither offered a privacy option.
Submitted 1 August, 2012;
originally announced August 2012.
-
The Groupon Effect on Yelp Ratings: A Root Cause Analysis
Authors:
John W. Byers,
Michael Mitzenmacher,
Georgios Zervas
Abstract:
Daily deals sites such as Groupon offer deeply discounted goods and services to tens of millions of customers through geographically targeted daily e-mail marketing campaigns. In our prior work we observed that a negative side effect for merchants using Groupons is that, on average, their Yelp ratings decline significantly. However, this previous work was essentially observational, rather than explanatory. In this work, we rigorously consider and evaluate various hypotheses about underlying consumer and merchant behavior in order to understand this phenomenon, which we dub the Groupon effect. We use statistical analysis and mathematical modeling, leveraging a dataset we collected spanning tens of thousands of daily deals and over 7 million Yelp reviews. In particular, we investigate hypotheses such as whether Groupon subscribers are more critical than their peers, or whether some fraction of Groupon merchants provide significantly worse service to customers using Groupons. We suggest an additional novel hypothesis: reviews from Groupon subscribers are lower on average because such reviews correspond to real, unbiased customers, while the body of reviews on Yelp contains some fraction of reviews from biased or even potentially fake sources. Although we focus on a specific question, our work provides broad insights into both consumer and merchant behavior within the daily deals marketplace.
Submitted 10 February, 2012;
originally announced February 2012.
-
Daily Deals: Prediction, Social Diffusion, and Reputational Ramifications
Authors:
John W. Byers,
Michael Mitzenmacher,
Georgios Zervas
Abstract:
Daily deal sites have become the latest Internet sensation, providing discounted offers to customers for restaurants, ticketed events, services, and other items. We begin by undertaking a study of the economics of daily deals on the web, based on a dataset we compiled by monitoring Groupon and LivingSocial sales in 20 large cities over several months. We use this dataset to characterize deal purchases; glean insights about operational strategies of these firms; and evaluate customers' sensitivity to factors such as price, deal scheduling, and limited inventory. We then marry our daily deals dataset with additional datasets we compiled from Facebook and Yelp users to study the interplay between social networks and daily deal sites. First, by studying user activity on Facebook while a deal is running, we provide evidence that daily deal sites benefit from significant word-of-mouth effects during sales events, consistent with results predicted by cascade models. Second, we consider the effects of daily deals on the longer-term reputation of merchants, based on their Yelp reviews before and after they run a daily deal. Our analysis shows that while the number of reviews increases significantly due to daily deals, average rating scores from reviewers who mention daily deals are 10% lower than scores of their peers on average.
Submitted 7 September, 2011;
originally announced September 2011.
-
A Month in the Life of Groupon
Authors:
John W. Byers,
Michael Mitzenmacher,
Michalis Potamias,
Georgios Zervas
Abstract:
Groupon has become the latest Internet sensation, providing daily deals to customers in the form of discount offers for restaurants, ticketed events, appliances, services, and other items. We undertake a study of the economics of daily deals on the web, based on a dataset we compiled by monitoring Groupon over several weeks. We use our dataset to characterize Groupon deal purchases, and to glean insights about Groupon's operational strategy. Our focus is on purchase incentives. For the primary purchase incentive, price, our regression model indicates that demand for coupons is relatively inelastic, allowing room for price-based revenue optimization. More interestingly, mining our dataset, we find evidence that Groupon customers are sensitive to other, "soft", incentives, e.g., deal scheduling and duration, deal featuring, and limited inventory. Our analysis points to the importance of considering incentives other than price in optimizing deal sites and similar systems.
Submitted 4 May, 2011;
originally announced May 2011.
-
Heapable Sequences and Subsequences
Authors:
John Byers,
Brent Heeringa,
Michael Mitzenmacher,
Georgios Zervas
Abstract:
Let us call a sequence of numbers heapable if they can be sequentially inserted to form a binary tree with the heap property, where each insertion subsequent to the first occurs at a leaf of the tree, i.e. below a previously placed number. In this paper we consider a variety of problems related to heapable sequences and subsequences that do not appear to have been studied previously. Our motivation for introducing these concepts is two-fold. First, such problems correspond to natural extensions of the well-known secretary problem for hiring an organization with a hierarchical structure. Second, from a purely combinatorial perspective, our problems are interesting variations on similar longest increasing subsequence problems, a problem paradigm that has led to many deep mathematical connections.
We provide several basic results. We obtain an efficient algorithm for determining the heapability of a sequence, and also prove that the question of whether a sequence can be arranged in a complete binary heap is NP-hard. Regarding subsequences we show that, with high probability, the longest heapable subsequence of a random permutation of n numbers has length (1 - o(1)) n, and a subsequence of length (1 - o(1)) n can in fact be found online with high probability. We similarly show that for a random permutation a subsequence that yields a complete heap of size αn for a constant α can be found with high probability. Our work highlights the interesting structure underlying this class of subsequence problems, and we leave many further interesting variations open for future work.
Submitted 14 July, 2010;
originally announced July 2010.
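The heapability definition in the abstract above can be illustrated with a small greedy check. This is a minimal sketch, not code from the paper: it assumes a min-heap ordering (a parent is no larger than its children), and the function name `is_heapable` and the "attach below the largest eligible value" rule are choices made for this illustration. The list-based bookkeeping is quadratic in the worst case and is kept only for readability.

```python
from bisect import bisect_right, insort


def is_heapable(seq):
    """Greedy heapability check under an assumed min-heap ordering.

    Each placed value offers two child slots. A new value x is attached
    below the largest already-placed value that is <= x and still has a
    free slot. If no such value exists, the sequence is not heapable.
    """
    it = iter(seq)
    try:
        root = next(it)
    except StopIteration:
        return True  # an empty sequence is trivially heapable

    # Sorted list with one entry per free child slot, keyed by parent value.
    free_slots = [root, root]
    for x in it:
        j = bisect_right(free_slots, x)  # number of slots with value <= x
        if j == 0:
            return False  # every free slot belongs to a value larger than x
        free_slots.pop(j - 1)  # consume the slot with the largest value <= x
        insort(free_slots, x)  # the new node contributes two slots of its own
        insort(free_slots, x)
    return True


if __name__ == "__main__":
    print(is_heapable([1, 5, 5, 2]))
    print(is_heapable([2, 6, 4, 3]))
```

In `[1, 5, 5, 2]` the second 5 is placed below the first 5 rather than below the root, which keeps a root slot free for the later 2. In `[2, 6, 4, 3]` both root slots are already used when 3 arrives, so no valid parent remains.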
-
Information Asymmetries in Pay-Per-Bid Auctions: How Swoopo Makes Bank
Authors:
John W. Byers,
Michael Mitzenmacher,
Georgios Zervas
Abstract:
Innovative auction methods can be exploited to increase profits, with Shubik's famous "dollar auction" perhaps being the most widely known example. Recently, some mainstream e-commerce web sites have apparently achieved the same end on a much broader scale, by using "pay-per-bid" auctions to sell items, from video games to bars of gold. In these auctions, bidders incur a cost for placing each bid in addition to (or sometimes in lieu of) the winner's final purchase cost. Thus even when a winner's purchase cost is a small fraction of the item's intrinsic value, the auctioneer can still profit handsomely from the bid fees. Our work provides novel analyses for these auctions, based on both modeling and datasets derived from auctions at Swoopo.com, the leading pay-per-bid auction site. While previous modeling work predicts profit-free equilibria, we analyze the impact of information asymmetry broadly, as well as Swoopo features such as bidpacks and the Swoop It Now option specifically, to quantify the effects of imperfect information in these auctions. We find that even small asymmetries across players (cheaper bids, better estimates of other players' intent, different valuations of items, committed players willing to play "chicken") can increase the auction duration well beyond that predicted by previous work and thus skew the auctioneer's profit disproportionately. Finally, we discuss our findings in the context of a dataset of thousands of live auctions we observed on Swoopo, which enables us also to examine behavioral factors, such as the power of aggressive bidding. Ultimately, our findings show that even with fully rational players, if players overlook or are unaware of any of these factors, the result is outsized profits for pay-per-bid auctioneers.
Submitted 30 March, 2010; v1 submitted 5 January, 2010;
originally announced January 2010.