Search | arXiv e-print repository

Learning Best Paths in Quantum Networks

Authors: Xuchuang Wang, Maoli Liu, Xutong Liu, Zhuohua Li, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

Abstract: Quantum networks (QNs) transmit delicate quantum information across noisy quantum channels. Crucial applications, like quantum key distribution (QKD) and distributed quantum computation (DQC), rely on efficient quantum information transmission. Learning the best path between a pair of end nodes in a QN is key to enhancing such applications. This paper addresses learning the best path in a QN in the online learning setting. We explore two types of feedback: "link-level" and "path-level". Link-level feedback pertains to QNs with advanced quantum switches that enable link-level benchmarking. Path-level feedback, on the other hand, is associated with basic quantum switches that permit only path-level benchmarking. We introduce two online learning algorithms, BeQuP-Link and BeQuP-Path, to identify the best path using link-level and path-level feedback, respectively. To learn the best path, BeQuP-Link benchmarks the critical links dynamically, while BeQuP-Path relies on a subroutine, transferring path-level observations to estimate link-level parameters in a batch manner. We analyze the quantum resource complexity of these algorithms and demonstrate that both can efficiently and, with high probability, determine the best path. Finally, we perform NetSquid-based simulations and validate that both algorithms accurately and efficiently identify the best path.

Submitted 14 June, 2025; originally announced June 2025.

Comments: Accepted at INFOCOM 2025

arXiv:2505.19043 [pdf, ps, other]

Offline Clustering of Linear Bandits: Unlocking the Power of Clusters in Data-Limited Environments

Authors: Jingyuan Liu, Zeyu Zhang, Xuchuang Wang, Xutong Liu, John C. S. Lui, Mohammad Hajiesmaili, Carlee Joe-Wong

Abstract: Contextual linear multi-armed bandits are a learning framework for making a sequence of decisions, e.g., advertising recommendations for a sequence of arriving users. Recent works have shown that clustering these users based on the similarity of their learned preferences can significantly accelerate the learning. However, prior work has primarily focused on the online setting, which requires continually collecting user data, ignoring the offline data widely available in many applications. To tackle these limitations, we study the offline clustering of bandits (Off-ClusBand) problem, which studies how to use the offline dataset to learn cluster properties and improve decision-making across multiple users. The key challenge in Off-ClusBand arises from data insufficiency for users: unlike the online case, in the offline case, we have a fixed, limited dataset to work from and thus must determine whether we have enough data to confidently cluster users together. To address this challenge, we propose two algorithms: Off-C$^2$LUB, which we analytically show performs well for arbitrary amounts of user data, and Off-CLUB, which is prone to bias when data is limited but, given sufficient data, matches a theoretical lower bound that we derive for the offline clustered MAB problem. We experimentally validate these results on both real and synthetic datasets.

Submitted 25 May, 2025; originally announced May 2025.

arXiv:2504.21549 [pdf, other]

Online Experimental Design for Network Tomography

Authors: Xuchuang Wang, Yu-Zhen Janice Chen, Matheus Guedes de Andrade, Mohammad Hajiesmaili, John C. S. Lui, Ting He, Don Towsley

Abstract: How to efficiently perform network tomography is a fundamental problem in network management and monitoring. A network tomography task usually consists of applying multiple probing experiments, e.g., across different paths or via different casts (including unicast and multicast). We study how to optimize the network tomography process through online sequential decision-making. From the methodology perspective, we introduce an online probe allocation algorithm that dynamically performs network tomography based on the principles of optimal experimental design and the maximum likelihood estimation. We rigorously analyze the regret of the algorithm under the conditions that i) the optimal allocation is Lipschitz continuous in the parameters being estimated and ii) the parameter estimators satisfy a concentration property. From the application perspective, we present two case studies: a) the classical lossy packet-switched network and b) the quantum bit-flip network. We show that both cases fulfill the two theoretical conditions and provide their corresponding regrets when deploying our proposed online probe allocation algorithm. Besides these two case studies with theoretical guarantees, we also conduct simulations to compare our proposed algorithm with existing methods and demonstrate our algorithm's effectiveness in a broader range of scenarios.

Submitted 30 April, 2025; originally announced April 2025.

arXiv:2504.15812 [pdf, other]

Fusing Reward and Dueling Feedback in Stochastic Bandits

Authors: Xuchuang Wang, Qirun Zeng, Jinhang Zuo, Xutong Liu, Mohammad Hajiesmaili, John C. S. Lui, Adam Wierman

Abstract: This paper investigates the fusion of absolute (reward) and relative (dueling) feedback in stochastic bandits, where both feedback types are gathered in each decision round. We derive a regret lower bound, demonstrating that an efficient algorithm may incur only the smaller among the reward and dueling-based regret for each individual arm. We propose two fusion approaches: (1) a simple elimination fusion algorithm that leverages both feedback types to explore all arms and unifies collected information by sharing a common candidate arm set, and (2) a decomposition fusion algorithm that selects the more effective feedback to explore the corresponding arms and randomly assigns one feedback type for exploration and the other for exploitation in each round. The elimination fusion experiences a suboptimal multiplicative term of the number of arms in regret due to the intrinsic suboptimality of dueling elimination. In contrast, the decomposition fusion achieves regret matching the lower bound up to a constant under a common assumption. Extensive experiments confirm the efficacy of our algorithms and theoretical results.

Submitted 22 April, 2025; originally announced April 2025.

arXiv:2502.16128 [pdf, other]

Heterogeneous Multi-Agent Bandits with Parsimonious Hints

Authors: Amirmahdi Mirfakhar, Xuchuang Wang, Jinhang Zuo, Yair Zick, Mohammad Hajiesmaili

Abstract: We study a hinted heterogeneous multi-agent multi-armed bandits problem (HMA2B), where agents can query low-cost observations (hints) in addition to pulling arms. In this framework, each of the $M$ agents has a unique reward distribution over $K$ arms, and in $T$ rounds, they can observe the reward of the arm they pull only if no other agent pulls that arm. The goal is to maximize the total utility by querying the minimal necessary hints without pulling arms, achieving time-independent regret. We study HMA2B in both centralized and decentralized setups. Our main centralized algorithm, GP-HCLA, which is an extension of HCLA, uses a central decision-maker for arm-pulling and hint queries, achieving $O(M^4K)$ regret with $O(MK\log T)$ adaptive hints. In decentralized setups, we propose two algorithms, HD-ETC and EBHD-ETC, that allow agents to choose actions independently through collision-based communication and query hints uniformly until stopping, yielding $O(M^3K^2)$ regret with $O(M^3K\log T)$ hints, where the former requires knowledge of the minimum gap and the latter does not. Finally, we establish lower bounds to prove the optimality of our results and verify them through numerical simulations.

Submitted 22 February, 2025; originally announced February 2025.

Comments: Accepted at AAAI-2025
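
Illustrative note on the entry above: the abstract states that an agent observes the reward of the arm it pulls only if no other agent pulls that arm. The following is a minimal Python sketch of that observation rule alone, not of GP-HCLA, HD-ETC, or EBHD-ETC. The function name `collision_feedback`, the list-based data layout, and the use of `None` for a blocked observation are assumptions made for illustration.

```python
from collections import Counter

def collision_feedback(pulls, rewards):
    """Return what each agent observes in one round (hypothetical helper).

    pulls[i]      -- index of the arm chosen by agent i
    rewards[i][k] -- reward agent i would draw from arm k this round
    An agent sees its reward only when it is the sole puller of its arm;
    otherwise the entry is None (assumed encoding of a collision).
    """
    counts = Counter(pulls)
    return [
        rewards[agent][arm] if counts[arm] == 1 else None
        for agent, arm in enumerate(pulls)
    ]

# Three agents: agents 0 and 1 collide on arm 0, agent 2 is alone on arm 2.
observed = collision_feedback(
    [0, 0, 2],
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]],
)
print(observed)
```

In this sketch only the third agent receives feedback, which is the situation the abstract's hints are meant to compensate for.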

arXiv:2502.09717 [pdf, other]

Carbon- and Precedence-Aware Scheduling for Data Processing Clusters

Authors: Adam Lechowicz, Rohan Shenoy, Noman Bashir, Mohammad Hajiesmaili, Adam Wierman, Christina Delimitrou

Abstract: As large-scale data processing workloads continue to grow, their carbon footprint raises concerns. Prior research on carbon-aware schedulers has focused on shifting computation to align with availability of low-carbon energy, but these approaches assume that each task can be executed independently. In contrast, data processing jobs have precedence constraints (i.e., outputs of one task are inputs for another) that complicate decisions, since delaying an upstream "bottleneck" task to a low-carbon period will also block downstream tasks, impacting the entire job's completion time. In this paper, we show that carbon-aware scheduling for data processing benefits from knowledge of both time-varying carbon and precedence constraints. Our main contribution is $\texttt{PCAPS}$, a carbon-aware scheduler that interfaces with modern ML scheduling policies to explicitly consider the precedence-driven importance of each task in addition to carbon. To illustrate the gains due to fine-grained task information, we also study $\texttt{CAP}$, a wrapper for any carbon-agnostic scheduler that adapts the key provisioning ideas of $\texttt{PCAPS}$. Our schedulers enable a configurable priority between carbon reduction and job completion time, and we give analytical results characterizing the trade-off between the two. Furthermore, our Spark prototype on a 100-node Kubernetes cluster shows that a moderate configuration of $\texttt{PCAPS}$ reduces carbon footprint by up to 32.9% without significantly impacting the cluster's total efficiency.

Submitted 13 February, 2025; originally announced February 2025.

Comments: 27 pages, 20 figures

arXiv:2502.08877 [pdf, other]

Dynamic Incentive Allocation for City-scale Deep Decarbonization

Authors: Anupama Sitaraman, Adam Lechowicz, Noman Bashir, Xutong Liu, Mohammad Hajiesmaili, Prashant Shenoy

Abstract: Greenhouse gas emissions from the residential sector represent a significant fraction of global emissions. Governments and utilities have designed incentives to stimulate the adoption of decarbonization technologies such as rooftop PV and heat pumps. However, studies have shown that many of these incentives are inefficient since a substantial fraction of spending does not actually promote adoption, and incentives are not equitably distributed across socioeconomic groups. We present a novel data-driven approach that adopts a holistic, emissions-based and city-scale perspective on decarbonization. We propose an optimization model that dynamically allocates a total incentive budget to households to directly maximize city-wide carbon reduction. We leverage techniques for the multi-armed bandits problem to estimate human factors, such as a household's willingness to adopt new technologies given a certain incentive. We apply our proposed framework to a city in the Northeast U.S., using real household energy data, grid carbon intensity data, and future price scenarios. We show that our learning-based technique significantly outperforms an example status quo incentive scheme, achieving up to 32.23% higher carbon reductions. We show that our framework can accommodate equity-aware constraints to equitably allocate incentives across socioeconomic groups, achieving 78.84% of the carbon reductions of the optimal solution on average.

Submitted 12 February, 2025; originally announced February 2025.

arXiv:2502.08003 [pdf, other]

Heterogeneous Multi-agent Multi-armed Bandits on Stochastic Block Models

Authors: Mengfan Xu, Liren Shan, Fatemeh Ghaffari, Xuchuang Wang, Xutong Liu, Mohammad Hajiesmaili

Abstract: We study a novel heterogeneous multi-agent multi-armed bandit problem with a cluster structure induced by stochastic block models, influencing not only graph topology, but also reward heterogeneity. Specifically, agents are distributed on random graphs based on stochastic block models - a generalized Erdős-Rényi model with heterogeneous edge probabilities: agents are grouped into clusters (known or unknown); edge probabilities for agents within the same cluster differ from those across clusters. In addition, the cluster structure in the stochastic block model also determines our heterogeneous rewards. Reward distributions of the same arm vary across agents in different clusters but remain consistent within a cluster, unifying homogeneous and heterogeneous settings and varying degrees of heterogeneity, and rewards are independent samples from these distributions. The objective is to minimize system-wide regret across all agents. To address this, we propose a novel algorithm applicable to both known and unknown cluster settings. The algorithm combines an averaging-based consensus approach with a newly introduced information aggregation and weighting technique, resulting in a UCB-type strategy. It accounts for graph randomness, leverages both intra-cluster (homogeneous) and inter-cluster (heterogeneous) information from rewards and graphs, and incorporates cluster detection for unknown cluster settings. We derive optimal instance-dependent regret upper bounds of order $\log{T}$ under sub-Gaussian rewards.
Importantly, our regret bounds capture the degree of heterogeneity in the system (an additional layer of complexity), exhibit smaller constants, scale better for large systems, and impose significantly relaxed assumptions on edge probabilities. In contrast, prior works have not accounted for this refined problem complexity, rely on more stringent assumptions, and exhibit limited scalability.

Submitted 11 February, 2025; originally announced February 2025.

Comments: 55 pages

arXiv:2501.13868 [pdf, other]

Lost in Siting: The Hidden Carbon Cost of Inequitable Residential Solar Installations

Authors: Cooper Sigrist, Adam Lechowicz, Jovan Champ, Noman Bashir, Mohammad Hajiesmaili

Abstract: The declining cost of solar photovoltaics (PV) combined with strong federal and state-level incentives has resulted in a high number of residential solar PV installations in the US. However, these installations are concentrated in particular regions, such as California, and demographics, such as high-income Asian neighborhoods. This inequitable distribution creates an illusion that further increasing residential solar installations will become increasingly challenging. Furthermore, while the inequity in solar installations has received attention, no prior comprehensive work has been done on understanding whether our current trajectory of residential solar adoption is energy- and carbon-efficient. In this paper, we reveal the hidden energy and carbon cost of the inequitable distribution of existing installations. Using US-based data on carbon offset potential, the amount of avoided carbon emissions from using rooftop PV instead of electric grid energy, and the number of existing solar installations, we surprisingly observe that locations and demographics with a higher carbon offset potential have fewer existing installations. For instance, neighborhoods with relatively higher black population have 7.4% higher carbon offset potential than average but 36.7% fewer installations; lower-income neighborhoods have 14.7% higher potential and 47% fewer installations. We propose several equity- and carbon-aware solar siting strategies.
In evaluating these strategies, we develop Sunsight, a toolkit that combines simulation/visualization tools and our relevant datasets, which we are releasing publicly. Our projections show that a multi-objective siting strategy can address two problems at once; namely, it can improve societal outcomes in terms of distributional equity and simultaneously improve the carbon-efficiency (i.e., climate impact) of current installation trends by up to 39.8%.

Submitted 23 January, 2025; originally announced January 2025.

Comments: 11 pages, 9 figures, E-Energy 2024

arXiv:2412.16539 [pdf, ps, other]

Towards Environmentally Equitable AI

Authors: Mohammad Hajiesmaili, Shaolei Ren, Ramesh K. Sitaraman, Adam Wierman

Abstract: The skyrocketing demand for artificial intelligence (AI) has created an enormous appetite for globally deployed power-hungry servers. As a result, the environmental footprint of AI systems has come under increasing scrutiny. More crucially, the current way that we exploit AI workloads' flexibility and manage AI systems can lead to wildly different environmental impacts across locations, increasingly raising environmental inequity concerns and creating unintended sociotechnical consequences. In this paper, we advocate environmental equity as a priority for the management of future AI systems, advancing the boundaries of existing resource management for sustainable AI and also adding a unique dimension to AI fairness. Concretely, we uncover the potential of equity-aware geographical load balancing to fairly re-distribute the environmental cost across different regions, followed by algorithmic challenges. We conclude by discussing a few future directions to exploit the full potential of system management approaches to mitigate AI's environmental inequity.

Submitted 21 December, 2024; originally announced December 2024.

Comments: Accepted by Communications of the ACM. All the authors contributed equally and are listed in alphabetical order of last name

arXiv:2411.08167 [pdf, ps, other]

Multi-Agent Stochastic Bandits Robust to Adversarial Corruptions

Authors: Fatemeh Ghaffari, Xuchuang Wang, Jinhang Zuo, Mohammad Hajiesmaili

Abstract: We study the problem of multi-agent multi-armed bandits with adversarial corruption in a heterogeneous setting, where each agent accesses a subset of arms. The adversary can corrupt the reward observations for all agents. Agents share these corrupted rewards with each other, and the objective is to maximize the cumulative total reward of all agents (and not be misled by the adversary). We propose a multi-agent cooperative learning algorithm that is robust to adversarial corruptions. For this newly devised algorithm, we demonstrate that an adversary with an unknown corruption budget $C$ only incurs an additive $O((L / L_{\min}) C)$ term to the standard regret of the model in non-corruption settings, where $L$ is the total number of agents, and $L_{\min}$ is the minimum number of agents with mutual access to an arm. As a side-product, our algorithm also improves the state-of-the-art regret bounds when reducing to both the single-agent and homogeneous multi-agent scenarios, tightening multiplicative $K$ (the number of arms) and $L$ (the number of agents) factors, respectively.

Submitted 12 November, 2024; originally announced November 2024.

arXiv:2411.01412 [pdf, other]

Near-Optimal Emission-Aware Online Ride Assignment Algorithm for Peak Demand Hours

Authors: Ali Zeynali, Mahsa Sahebdel, Noman Bashir, Ramesh K. Sitaraman, Mohammad Hajiesmaili

Abstract: Ridesharing has experienced significant global growth over the past decade and is becoming integral to future transportation networks. These services offer alternative mobility options in many urban areas, promoting car-light or car-free lifestyles, with their market share rapidly expanding due to the convenience they offer. However, alongside these benefits, concerns have arisen about the environmental impact of ridesharing, particularly its contribution to carbon emissions. A major source of these emissions is deadhead miles that are driven without passengers between trips. This issue is especially pronounced during high-demand periods when the number of ride requests exceeds platform capacity, leading to longer deadhead miles and higher emissions. While reducing these unproductive miles can lower emissions, it may also result in longer wait times for passengers as they wait for a nearby driver, potentially diminishing the overall user experience. In this paper, we propose LARA, an online algorithm for rider-to-driver assignment that dynamically adjusts the maximum allowed deadhead miles for drivers and assigns ride requests accordingly. While LARA can be applied under any conditions, it is particularly effective during high-demand hours, aiming to reduce both carbon emissions and rider wait times. We prove that LARA achieves near-optimal performance in online settings compared to the optimal offline algorithm.
Furthermore, we evaluate LARA using both synthetic and real-world datasets, demonstrating up to 34.2% reduction in emissions and up to 42.9% reduction in rider wait times compared to state-of-the-art algorithms. While recent studies have introduced the problem of emission-aware ride assignment, LARA is the first algorithm to provide both theoretical and empirical guarantees on performance.

Submitted 2 November, 2024; originally announced November 2024.

Comments: 20 pages

arXiv:2410.17075 [pdf, other]

Combinatorial Logistic Bandits

Authors: Xutong Liu, Xiangxiang Dai, Xuchuang Wang, Mohammad Hajiesmaili, John C. S. Lui

Abstract: We introduce a novel framework called combinatorial logistic bandits (CLogB), where in each round, a subset of base arms (called the super arm) is selected, with the outcome of each base arm being binary and its expectation following a logistic parametric model. The feedback is governed by a general arm triggering process. Our study covers CLogB with reward functions satisfying two smoothness conditions, capturing application scenarios such as online content delivery, online learning to rank, and dynamic channel allocation. We first propose a simple yet efficient algorithm, CLogUCB, utilizing a variance-agnostic exploration bonus. Under the 1-norm triggering probability modulated (TPM) smoothness condition, CLogUCB achieves a regret bound of $\tilde{O}(d\sqrt{κKT})$, where $\tilde{O}$ ignores logarithmic factors, $d$ is the dimension of the feature vector, $κ$ represents the nonlinearity of the logistic model, and $K$ is the maximum number of base arms a super arm can trigger. This result improves on prior work by a factor of $\tilde{O}(\sqrt{κ})$. We then enhance CLogUCB with a variance-adaptive version, VA-CLogUCB, which attains a regret bound of $\tilde{O}(d\sqrt{KT})$ under the same 1-norm TPM condition, improving another $\tilde{O}(\sqrt{κ})$ factor. VA-CLogUCB shows even greater promise under the stronger triggering probability and variance modulated (TPVM) condition, achieving a leading $\tilde{O}(d\sqrt{T})$ regret, thus removing the additional dependency on the action-size $K$.
Furthermore, we enhance the computational efficiency of VA-CLogUCB by eliminating the nonconvex optimization process when the context feature map is time-invariant while maintaining the tight $\tilde{O}(d\sqrt{T})$ regret. Finally, experiments on synthetic and real-world datasets demonstrate the superior performance of our algorithms compared to benchmark algorithms.

Submitted 13 May, 2025; v1 submitted 22 October, 2024; originally announced October 2024.

Comments: Accepted in ACM SIGMETRICS 2025

arXiv:2408.10201 [pdf, other]

LEAD: Towards Learning-Based Equity-Aware Decarbonization in Ridesharing Platforms

Authors: Mahsa Sahebdel, Ali Zeynali, Noman Bashir, Prashant Shenoy, Mohammad Hajiesmaili

Abstract: Ridesharing platforms such as Uber, Lyft, and DiDi have grown in popularity due to their on-demand availability, ease of use, and commute cost reductions, among other benefits. However, not all ridesharing promises have panned out. Recent studies demonstrate that the expected drop in traffic congestion and reduction in greenhouse gas (GHG) emissions have not materialized. This is primarily due to the substantial distances traveled by the ridesharing vehicles without passengers between rides, known as deadhead miles. Recent work has focused on reducing the impact of deadhead miles while considering additional metrics such as rider waiting time, GHG emissions from deadhead miles, or driver earnings. However, most prior studies consider these environmental and equity-based metrics individually despite them being interrelated. In this paper, we propose a Learning-based Equity-Aware Decarbonization approach, LEAD, for ridesharing platforms. LEAD targets minimizing emissions while ensuring that the driver's utility, defined as the difference between the trip distance and the deadhead miles, is fairly distributed. LEAD uses reinforcement learning to match riders with drivers based on the expected future utility of drivers and the expected carbon emissions of the platform without increasing the rider waiting times.
Extensive experiments based on a real-world ridesharing dataset show that LEAD improves the defined notion of fairness by 150% when compared to emission-aware ride-assignment and reduces emissions by 14.6% while ensuring fairness within 28-52% of the fairness-focused baseline. It also reduces the rider wait time, by at least 32.1%, compared to a fairness-focused baseline.

Submitted 12 April, 2025; v1 submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.08859 [pdf, other]

Stochastic Bandits Robust to Adversarial Attacks

Authors: Xuchuang Wang, Jinhang Zuo, Xutong Liu, John C. S. Lui, Mohammad Hajiesmaili

Abstract: This paper investigates stochastic multi-armed bandit algorithms that are robust to adversarial attacks, where an attacker can first observe the learner's action and then alter their reward observation. We study two cases of this model, with or without the knowledge of an attack budget $C$, defined as an upper bound of the summation of the difference between the actual and altered rewards. For both cases, we devise two types of algorithms with regret bounds having additive or multiplicative $C$ dependence terms. For the known attack budget case, we prove our algorithms achieve the regret bound of ${O}((K/Δ)\log T + KC)$ and $\tilde{O}(\sqrt{KTC})$ for the additive and multiplicative $C$ terms, respectively, where $K$ is the number of arms, $T$ is the time horizon, $Δ$ is the gap between the expected rewards of the optimal arm and the second-best arm, and $\tilde{O}$ hides the logarithmic factors. For the unknown case, we prove our algorithms achieve the regret bound of $\tilde{O}(\sqrt{KT} + KC^2)$ and $\tilde{O}(KC\sqrt{T})$ for the additive and multiplicative $C$ terms, respectively. In addition to these upper bound results, we provide several lower bounds showing the tightness of our bounds and the optimality of our algorithms. These results delineate an intrinsic separation between the bandits with attacks and corruption models [Lykouris et al., 2018].

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.07831 [pdf, other]

doi 10.1145/3711701

Learning-Augmented Competitive Algorithms for Spatiotemporal Online Allocation with Deadline Constraints

Authors: Adam Lechowicz, Nicolas Christianson, Bo Sun, Noman Bashir, Mohammad Hajiesmaili, Adam Wierman, Prashant Shenoy

Abstract: We introduce and study spatiotemporal online allocation with deadline constraints ($\mathsf{SOAD}$), a new online problem motivated by emerging challenges in sustainability and energy. In $\mathsf{SOAD}$, an online player completes a workload by allocating and scheduling it on the points of a metric space $(X, d)$ while subject to a deadline $T$. At each time step, a service cost function is revealed that represents the cost of servicing the workload at each point, and the player must irrevocably decide the current allocation of work to points. Whenever the player moves this allocation, they incur a movement cost defined by the distance metric $d(\cdot, \ \cdot)$ that captures, e.g., an overhead cost. $\mathsf{SOAD}$ formalizes the open problem of combining general metrics and deadline constraints in the online algorithms literature, unifying problems such as metrical task systems and online search. We propose a competitive algorithm for $\mathsf{SOAD}$ along with a matching lower bound establishing its optimality. Our main algorithm, \textsc{ST-CLIP}, is a learning-augmented algorithm that takes advantage of predictions (e.g., forecasts of relevant costs) and achieves an optimal consistency-robustness trade-off. We evaluate our proposed algorithms in a simulated case study of carbon-aware spatiotemporal workload management, an application in sustainable computing that schedules a delay-tolerant batch compute job on a distributed network of data centers. In these experiments, we show that \textsc{ST-CLIP} substantially improves on heuristic baseline methods.

Submitted 12 March, 2025; v1 submitted 14 August, 2024; originally announced August 2024.

Comments: Accepted to SIGMETRICS 2025. 49 pages, 21 figures

Journal ref: Proc. ACM Meas. Anal. Comput. Syst. Volume 9, Issue 1, Article 8 (March 2025), 49 pages

arXiv:2406.18752 [pdf, other]

Competitive Algorithms for Online Knapsack with Succinct Predictions

Authors: Mohammadreza Daneshvaramoli, Helia Karisani, Adam Lechowicz, Bo Sun, Cameron Musco, Mohammad Hajiesmaili

Abstract: In the online knapsack problem, the goal is to pack items arriving online with different values and weights into a capacity-limited knapsack to maximize the total value of the accepted items. We study \textit{learning-augmented} algorithms for this problem, which aim to use machine-learned predictions to move beyond pessimistic worst-case guarantees. Existing learning-augmented algorithms for online knapsack consider relatively complicated prediction models that give an algorithm substantial information about the input, such as the total weight of items at each value. In practice, such predictions can be error-sensitive and difficult to learn. Motivated by this limitation, we introduce a family of learning-augmented algorithms for online knapsack that use \emph{succinct predictions}. In particular, the machine-learned prediction given to the algorithm is just a single value or interval that estimates the minimum value of any item accepted by an offline optimal solution. By leveraging a relaxation to online \emph{fractional} knapsack, we design algorithms that can leverage such succinct predictions in both the trusted setting (i.e., with perfect prediction) and the untrusted setting, where we prove that a simple meta-algorithm achieves a nearly optimal consistency-robustness trade-off. Empirically, we show that our algorithms significantly outperform baselines that do not use predictions and often outperform algorithms based on more complex prediction models.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 29 pages, 10 figures, Submitted to NeurIPS 2024

MSC Class: 68Q25; 68T05 ACM Class: F.2.2; I.2.6

arXiv:2406.01386 [pdf, ps, other]

Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond

Authors: Xutong Liu, Siwei Wang, Jinhang Zuo, Han Zhong, Xuchuang Wang, Zhiyong Wang, Shuai Li, Mohammad Hajiesmaili, John C. S. Lui, Wei Chen

Abstract: We introduce a novel framework of combinatorial multi-armed bandits (CMAB) with multivariant and probabilistically triggering arms (CMAB-MT), where the outcome of each arm is a $d$-dimensional multivariant random variable and the feedback follows a general arm triggering process. Compared with existing CMAB works, CMAB-MT not only enhances the modeling power but also allows improved results by leveraging distinct statistical properties for multivariant random variables. For CMAB-MT, we propose a general 1-norm multivariant and triggering probability-modulated smoothness condition, and an optimistic CUCB-MT algorithm built upon this condition. Our framework can include many important problems as applications, such as episodic reinforcement learning (RL) and probabilistic maximum coverage for goods distribution, all of which meet the above smoothness condition and achieve matching or improved regret bounds compared to existing works. Through our new framework, we build the first connection between the episodic RL and CMAB literature, by offering a new angle to solve the episodic RL through the lens of CMAB, which may encourage more interactions between these two important directions.

Submitted 22 April, 2025; v1 submitted 3 June, 2024; originally announced June 2024.

arXiv:2404.15211 [pdf, other]

doi 10.1145/3632775.3661942

LACS: Learning-Augmented Algorithms for Carbon-Aware Resource Scaling with Uncertain Demand

Authors: Roozbeh Bostandoost, Adam Lechowicz, Walid A. Hanafy, Noman Bashir, Prashant Shenoy, Mohammad Hajiesmaili

Abstract: Motivated by an imperative to reduce the carbon emissions of cloud data centers, this paper studies the online carbon-aware resource scaling problem with unknown job lengths (OCSU) and applies it to carbon-aware resource scaling for executing computing workloads. The task is to dynamically scale resources (e.g., the number of servers) assigned to a job of unknown length such that it is completed before a deadline, with the objective of reducing the carbon emissions of executing the workload. The total carbon emissions of executing a job originate from the emissions of running the job and excess carbon emitted while switching between different scales (e.g., due to checkpoint and resume). Prior work on carbon-aware resource scaling has assumed accurate job length information, while other approaches have ignored switching losses and require carbon intensity forecasts. These assumptions prohibit the practical deployment of prior work for online carbon-aware execution of scalable computing workload. We propose LACS, a theoretically robust learning-augmented algorithm that solves OCSU. To achieve improved practical average-case performance, LACS integrates machine-learned predictions of job length. To achieve solid theoretical performance, LACS extends the recent theoretical advances on online conversion with switching costs to handle a scenario where the job length is unknown. Our experimental evaluations demonstrate that, on average, the carbon footprint of LACS lies within 1.2% of the online baseline that assumes perfect job length information and within 16% of the offline baseline that, in addition to the job length, also requires accurate carbon intensity forecasts. Furthermore, LACS achieves a 32% reduction in carbon footprint compared to the deadline-aware carbon-agnostic execution of the job.

Submitted 4 June, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

arXiv:2402.14012 [pdf, other]

Chasing Convex Functions with Long-term Constraints

Authors: Adam Lechowicz, Nicolas Christianson, Bo Sun, Noman Bashir, Mohammad Hajiesmaili, Adam Wierman, Prashant Shenoy

Abstract: We introduce and study a family of online metric problems with long-term constraints. In these problems, an online player makes decisions $\mathbf{x}_t$ in a metric space $(X,d)$ to simultaneously minimize their hitting cost $f_t(\mathbf{x}_t)$ and switching cost as determined by the metric. Over the time horizon $T$, the player must satisfy a long-term demand constraint $\sum_{t} c(\mathbf{x}_t) \geq 1$, where $c(\mathbf{x}_t)$ denotes the fraction of demand satisfied at time $t$. Such problems can find a wide array of applications to online resource allocation in sustainable energy/computing systems. We devise optimal competitive and learning-augmented algorithms for the case of bounded hitting cost gradients and weighted $\ell_1$ metrics, and further show that our proposed algorithms perform well in numerical experiments.

Submitted 12 July, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: Accepted to ICML 2024. 31 pages, 12 figures

arXiv:2402.09687 [pdf, other]

Robust Learning-Augmented Dictionaries

Authors: Ali Zeynali, Shahin Kamali, Mohammad Hajiesmaili

Abstract: We present the first learning-augmented data structure for implementing dictionaries with optimal consistency and robustness. Our data structure, named RobustSL, is a skip list augmented by predictions of access frequencies of elements in a data sequence. With proper predictions, RobustSL has optimal consistency (achieves static optimality). At the same time, it maintains a logarithmic running time for each operation, ensuring optimal robustness, even if predictions are generated adversarially. Therefore, RobustSL has all the advantages of the recent learning-augmented data structures of Lin, Luo, and Woodruff (ICML 2022) and Cao et al. (arXiv 2023), while providing robustness guarantees that are absent in the previous work. Numerical experiments show that RobustSL outperforms alternative data structures using both synthetic and real datasets.

Submitted 14 February, 2024; originally announced February 2024.

Comments: 11 pages plus 4 pages appendix

arXiv:2402.01644 [pdf, other]

A Holistic Approach for Equity-aware Carbon Reduction of Ridesharing Platforms

Authors: Mahsa Sahebdel, Ali Zeynali, Noman Bashir, Prashant Shenoy, Mohammad Hajiesmaili

Abstract: Ridesharing services have revolutionized personal mobility, offering convenient on-demand transportation anytime. While early proponents of ridesharing suggested that these services would reduce the overall carbon emissions of the transportation sector, recent studies reported a type of rebound effect showing substantial carbon emissions of ridesharing platforms, mainly due to their deadhead miles traveled between two consecutive rides. However, reducing deadhead miles' emissions can incur longer waiting times for riders and starvation of ride assignments for some drivers. Therefore, any efforts towards reducing the carbon emissions from ridesharing platforms must consider the impact on the quality of service, e.g., waiting time, and on the equitable distribution of rides across drivers. This paper proposes a holistic approach to reduce the carbon emissions of ridesharing platforms while minimizing the degradation in user waiting times and equitable ride assignments across drivers. Towards this end, we decompose the global carbon reduction problem into two sub-problems: carbon- and equity-aware ride assignment and fuel-efficient routing. For the ride assignment problem, we consider the trade-off between the amount of carbon reduction and the rider's waiting time and propose simple yet efficient algorithms to handle the conflicting trade-offs. For the routing problem, we analyze the impact of fuel-efficient routing in reducing the carbon footprint, trip duration, and driver efficiency of ridesharing platforms using route data from Google Maps. Our comprehensive trace-driven experimental results show significant emissions reduction with a minor increase in riders' waiting times. Finally, we release E$^2$-RideKit, a toolkit that enables researchers to augment ridesharing datasets with emissions and equity information for further research on emission analysis and platform improvement.

Submitted 16 February, 2024; v1 submitted 2 January, 2024; originally announced February 2024.

arXiv:2311.01698 [pdf, other]

Adversarial Attacks on Cooperative Multi-agent Bandits

Authors: Jinhang Zuo, Zhiyao Zhang, Xuchuang Wang, Cheng Chen, Shuai Li, John C. S. Lui, Mohammad Hajiesmaili, Adam Wierman

Abstract: Cooperative multi-agent multi-armed bandits (CMA2B) consider the collaborative efforts of multiple agents in a shared multi-armed bandit game. We study latent vulnerabilities exposed by this collaboration and consider adversarial attacks on a few agents with the goal of influencing the decisions of the rest. More specifically, we study adversarial attacks on CMA2B in both homogeneous settings, where agents operate with the same arm set, and heterogeneous settings, where agents have distinct arm sets. In the homogeneous setting, we propose attack strategies that, by targeting just one agent, convince all agents to select a particular target arm $T-o(T)$ times while incurring $o(T)$ attack costs in $T$ rounds. In the heterogeneous setting, we prove that a target arm attack requires linear attack costs and propose attack strategies that can force a maximum number of agents to suffer linear regrets while incurring sublinear costs and only manipulating the observations of a few target agents. Numerical experiments validate the effectiveness of our proposed attack strategies.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2310.20598 [pdf, other]

doi 10.1145/3673660.3655074

Online Conversion with Switching Costs: Robust and Learning-Augmented Algorithms

Authors: Adam Lechowicz, Nicolas Christianson, Bo Sun, Noman Bashir, Mohammad Hajiesmaili, Adam Wierman, Prashant Shenoy

Abstract: We introduce and study online conversion with switching costs, a family of online problems that capture emerging problems at the intersection of energy and sustainability. In this problem, an online player attempts to purchase (alternatively, sell) fractional shares of an asset during a fixed time horizon with length $T$. At each time step, a cost function (alternatively, price function) is revealed, and the player must irrevocably decide an amount of asset to convert. The player also incurs a switching cost whenever their decision changes in consecutive time steps, i.e., when they increase or decrease their purchasing amount. We introduce competitive (robust) threshold-based algorithms for both the minimization and maximization variants of this problem, and show they are optimal among deterministic online algorithms. We then propose learning-augmented algorithms that take advantage of untrusted black-box advice (such as predictions from a machine learning model) to achieve significantly better average-case performance without sacrificing worst-case competitive guarantees. Finally, we empirically evaluate our proposed algorithms using a carbon-aware EV charging case study, showing that our algorithms substantially improve on baseline methods for this problem.

Submitted 8 November, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

Comments: Appeared as a conference paper at SIGMETRICS / Performance '24. 47 pages, 9 figures

arXiv:2310.11558 [pdf, other]

Online Algorithms with Uncertainty-Quantified Predictions

Authors: Bo Sun, Jerry Huang, Nicolas Christianson, Mohammad Hajiesmaili, Adam Wierman, Raouf Boutaba

Abstract: The burgeoning field of algorithms with predictions studies the problem of using possibly imperfect machine learning predictions to improve online algorithm performance. While nearly all existing algorithms in this framework make no assumptions on prediction quality, a number of methods providing uncertainty quantification (UQ) on machine learning models have been developed in recent years, which could enable additional information about prediction quality at decision time. In this work, we investigate the problem of optimally utilizing uncertainty-quantified predictions in the design of online algorithms. In particular, we study two classic online problems, ski rental and online search, where the decision-maker is provided predictions augmented with UQ describing the likelihood of the ground truth falling within a particular range of values. We demonstrate that non-trivial modifications to algorithm design are needed to fully leverage the UQ predictions. Moreover, we consider how to utilize more general forms of UQ, proposing an online learning framework that learns to exploit UQ to make decisions in multi-instance settings.

Submitted 3 June, 2024; v1 submitted 17 October, 2023; originally announced October 2023.
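
Illustrative note on the entry above: ski rental, one of the two classic problems the abstract names, can be sketched with the textbook break-even rule. This is the standard prediction-free baseline, not the paper's UQ-based method. The rent price of 1 per day and the function names below are assumptions made for illustration only.

```python
def break_even_cost(buy_price: int, season_days: int) -> int:
    """Cost of the break-even rule: rent (1 per day, an assumed price)
    for buy_price - 1 days, then buy on day buy_price if still skiing."""
    if season_days < buy_price:
        return season_days  # season ended before the purchase day
    return (buy_price - 1) + buy_price  # rentals paid so far, plus the purchase


def offline_optimal_cost(buy_price: int, season_days: int) -> int:
    """Cost with hindsight: rent throughout or buy on day one, whichever is cheaper."""
    return min(buy_price, season_days)


def worst_case_ratio(buy_price: int, max_days: int) -> float:
    """Largest online/offline cost ratio over season lengths 1..max_days."""
    return max(
        break_even_cost(buy_price, n) / offline_optimal_cost(buy_price, n)
        for n in range(1, max_days + 1)
    )
```

With a purchase price of B, the worst case occurs when the season lasts at least B days, giving a ratio of (2B - 1)/B, which is below 2. Learning-augmented variants aim to improve on this baseline when predictions are reliable.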

arXiv:2310.09920 [pdf, other]

BONES: Near-Optimal Neural-Enhanced Video Streaming

Authors: Lingdong Wang, Simran Singh, Jacob Chakareski, Mohammad Hajiesmaili, Ramesh K. Sitaraman

Abstract: Accessing high-quality video content can be challenging due to insufficient and unstable network bandwidth. Recent advances in neural enhancement have shown promising results in improving the quality of degraded videos through deep learning. Neural-Enhanced Streaming (NES) incorporates this new approach into video streaming, allowing users to download low-quality video segments and then enhance them to obtain high-quality content without violating the playback of the video stream. We introduce BONES, an NES control algorithm that jointly manages the network and computational resources to maximize the quality of experience (QoE) of the user. BONES formulates NES as a Lyapunov optimization problem and solves it in an online manner with near-optimal performance, making it the first NES algorithm to provide a theoretical performance guarantee. Comprehensive experimental results indicate that BONES increases QoE by 5\% to 20\% over state-of-the-art algorithms with minimal overhead. Our code is available at https://github.com/UMass-LIDS/bones.

Submitted 10 April, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

arXiv:2309.04023 [pdf, other]

BOLA360: Near-optimal View and Bitrate Adaptation for 360-degree Video Streaming

Authors: Ali Zeynali, Mahsa Sahebdel, Mohammad Hajiesmaili, Ramesh K. Sitaraman

Abstract: Recent advances in omnidirectional cameras and AR/VR headsets have spurred the adoption of 360-degree videos that are widely believed to be the future of online video streaming. 360-degree videos allow users to wear a head-mounted display (HMD) and experience the video as if they are physically present in the scene. Streaming high-quality 360-degree videos at scale is an unsolved problem that is more challenging than traditional (2D) video delivery. The data rate required to stream 360-degree videos is an order of magnitude more than traditional videos. Further, the penalty for rebuffering events where the video freezes or displays a blank screen is more severe as it may cause cybersickness. We propose an online adaptive bitrate (ABR) algorithm for 360-degree videos called BOLA360 that runs inside the client's video player and orchestrates the download of video segments from the server so as to maximize the quality-of-experience (QoE) of the user. BOLA360 conserves bandwidth by downloading only those video segments that are likely to fall within the field-of-view (FOV) of the user. In addition, BOLA360 continually adapts the bitrate of the downloaded video segments so as to enable a smooth playback without rebuffering. We prove that BOLA360 is near-optimal with respect to an optimal offline algorithm that maximizes QoE. Further, we evaluate BOLA360 on a wide range of network and user head movement profiles and show that it provides $13.6\%$ to $372.5\%$ more QoE than state-of-the-art algorithms. While ABR algorithms for traditional (2D) videos have been well-studied over the last decade, our work is the first ABR algorithm for 360-degree videos with both theoretical and empirical guarantees on its performance.

Submitted 1 October, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

Comments: 27 pages

arXiv:2308.04314 [pdf, other]

Cooperative Multi-agent Bandits: Distributed Algorithms with Optimal Individual Regret and Constant Communication Costs

Authors: Lin Yang, Xuchuang Wang, Mohammad Hajiesmaili, Lijun Zhang, John C. S. Lui, Don Towsley

Abstract: Recently, there has been extensive study of cooperative multi-agent multi-armed bandits where a set of distributed agents cooperatively play the same multi-armed bandit game. The goal is to develop bandit algorithms with the optimal group and individual regrets and low communication between agents. The prior work tackled this problem using two paradigms: leader-follower and fully distributed algorithms. Prior algorithms in both paradigms achieve the optimal group regret. The leader-follower algorithms achieve constant communication costs but fail to achieve optimal individual regrets. The state-of-the-art fully distributed algorithms achieve optimal individual regrets but fail to achieve constant communication costs. This paper presents a simple yet effective communication policy and integrates it into a learning algorithm for cooperative bandits. Our algorithm achieves the best of both paradigms: optimal individual regret and constant communication costs.

Submitted 8 August, 2023; originally announced August 2023.

arXiv:2306.16948 [pdf, other]

doi 10.1145/3604930.3605709

The War of the Efficiencies: Understanding the Tension between Carbon and Energy Optimization

Authors: Walid A. Hanafy, Roozbeh Bostandoost, Noman Bashir, David Irwin, Mohammad Hajiesmaili, Prashant Shenoy

Abstract: Major innovations in computing have been driven by scaling up computing infrastructure, while aggressively optimizing operating costs. The result is a network of worldwide datacenters that consume a large amount of energy, mostly in an energy-efficient manner. Since the electric grid powering these datacenters provided a simple and opaque abstraction of an unlimited and reliable power supply, the computing industry remained largely oblivious to the carbon intensity of the electricity it uses. Much like the rest of the society, it generally treated the carbon intensity of the electricity as constant, which was mostly true for a fossil fuel-driven grid. As a result, the cost-driven objective of increasing energy-efficiency -- by doing more work per unit of energy -- has generally been viewed as the most carbon-efficient approach. However, as the electric grid is increasingly powered by clean energy and is exposing its time-varying carbon intensity, the most energy-efficient operation is no longer necessarily the most carbon-efficient operation. There has been a recent focus on exploiting the flexibility of computing's workloads -- along temporal, spatial, and resource dimensions -- to reduce carbon emissions, which comes at the cost of either performance or energy efficiency. In this paper, we discuss the trade-offs between energy efficiency and carbon efficiency in exploiting computing's flexibility and show that blindly optimizing for energy efficiency is not always the right approach.

Submitted 29 June, 2023; originally announced June 2023.

Comments: 2nd Workshop on Sustainable Computer Systems (HotCarbon'23)

arXiv:2305.17071 [pdf, other]

Adversarial Attacks on Online Learning to Rank with Click Feedback

Authors: Jinhang Zuo, Zhiyao Zhang, Zhiyong Wang, Shuai Li, Mohammad Hajiesmaili, Adam Wierman

Abstract: Online learning to rank (OLTR) is a sequential decision-making problem where a learning agent selects an ordered list of items and receives feedback through user clicks. Although potential attacks against OLTR algorithms may cause serious losses in real-world applications, little is known about adversarial attacks on OLTR. This paper studies attack strategies against multiple variants of OLTR. Our first result provides an attack strategy against the UCB algorithm on classical stochastic bandits with binary feedback, which solves the key issues caused by bounded and discrete feedback that previous works cannot handle. Building on this result, we design attack algorithms against UCB-based OLTR algorithms in position-based and cascade models. Finally, we propose a general attack strategy against any algorithm under the general click model. Each attack algorithm manipulates the learning agent into choosing the target attack item $T-o(T)$ times, incurring a cumulative cost of $o(T)$. Experiments on synthetic and real data further validate the effectiveness of our proposed attack algorithms.

Submitted 26 May, 2023; originally announced May 2023.

arXiv:2305.13293 [pdf, other]

Time Fairness in Online Knapsack Problems

Authors: Adam Lechowicz, Rik Sengupta, Bo Sun, Shahin Kamali, Mohammad Hajiesmaili

Abstract: The online knapsack problem is a classic problem in the field of online algorithms. Its canonical version asks how to pack items of different values and weights arriving online into a capacity-limited knapsack so as to maximize the total value of the admitted items. Although optimal competitive algorithms are known for this problem, they may be fundamentally unfair, i.e., individual items may be treated inequitably in different ways. We formalize a practically-relevant notion of time fairness which effectively models a trade-off between static and dynamic pricing in a motivating application such as cloud resource allocation, and show that existing algorithms perform poorly under this metric. We propose a parameterized deterministic algorithm where the parameter precisely captures the Pareto-optimal trade-off between fairness (static pricing) and competitiveness (dynamic pricing). We show that randomization is theoretically powerful enough to be simultaneously competitive and fair; however, it does not work well in experiments. To further improve the trade-off between fairness and competitiveness, we develop a nearly-optimal learning-augmented algorithm which is fair, consistent, and robust (competitive), showing substantial performance improvements in numerical experiments.

Submitted 17 April, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

Comments: Accepted to ICLR 2024. 26 pages, 5 figures

arXiv:2303.17551 [pdf, other]

doi 10.1145/3626776

The Online Pause and Resume Problem: Optimal Algorithms and An Application to Carbon-Aware Load Shifting

Authors: Adam Lechowicz, Nicolas Christianson, Jinhang Zuo, Noman Bashir, Mohammad Hajiesmaili, Adam Wierman, Prashant Shenoy

Abstract: We introduce and study the online pause and resume problem. In this problem, a player attempts to find the $k$ lowest (alternatively, highest) prices in a sequence of fixed length $T$, which is revealed sequentially. At each time step, the player is presented with a price and decides whether to accept or reject it. The player incurs a switching cost whenever their decision changes in consecutive time steps, i.e., whenever they pause or resume purchasing. This online problem is motivated by the goal of carbon-aware load shifting, where a workload may be paused during periods of high carbon intensity and resumed during periods of low carbon intensity and incurs a cost when saving or restoring its state. It has strong connections to existing problems studied in the literature on online optimization, though it introduces unique technical challenges that prevent the direct application of existing algorithms. Extending prior work on threshold-based algorithms, we introduce double-threshold algorithms for both the minimization and maximization variants of this problem. We further show that the competitive ratios achieved by these algorithms are the best achievable by any deterministic online algorithm. Finally, we empirically validate our proposed algorithm through case studies on the application of carbon-aware load shifting using real carbon trace data and existing baseline algorithms.

Submitted 30 March, 2023; originally announced March 2023.

Comments: 34 pages, 12 figures

Journal ref: Proc. ACM Meas. Anal. Comput. Syst. Volume 7, Issue 3, Article 45 (December 2023), 32 pages

arXiv:2303.17110 [pdf, other]

Contextual Combinatorial Bandits with Probabilistically Triggered Arms

Authors: Xutong Liu, Jinhang Zuo, Siwei Wang, John C. S. Lui, Mohammad Hajiesmaili, Adam Wierman, Wei Chen

Abstract: We study contextual combinatorial bandits with probabilistically triggered arms (C$^2$MAB-T) under a variety of smoothness conditions that capture a wide range of applications, such as contextual cascading bandits and contextual influence maximization bandits. Under the triggering probability modulated (TPM) condition, we devise the C$^2$-UCB-T algorithm and propose a novel analysis that achieves an $\tilde{O}(d\sqrt{KT})$ regret bound, removing a potentially exponentially large factor $O(1/p_{\min})$, where $d$ is the dimension of contexts, $p_{\min}$ is the minimum positive probability that any arm can be triggered, and batch-size $K$ is the maximum number of arms that can be triggered per round. Under the variance modulated (VM) or triggering probability and variance modulated (TPVM) conditions, we propose a new variance-adaptive algorithm VAC$^2$-UCB and derive a regret bound $\tilde{O}(d\sqrt{T})$, which is independent of the batch-size $K$. As a valuable by-product, our analysis technique and variance-adaptive algorithm can be applied to the CMAB-T and C$^2$MAB setting, improving existing results there as well. We also include experiments that demonstrate the improved performance of our algorithms compared with benchmark algorithms on synthetic and real-world datasets.

Submitted 18 November, 2024; v1 submitted 29 March, 2023; originally announced March 2023.

Comments: The 40th International Conference on Machine Learning (ICML), 2023

arXiv:2303.06396 [pdf, other]

No-regret Algorithms for Fair Resource Allocation

Authors: Abhishek Sinha, Ativ Joshi, Rajarshi Bhattacharjee, Cameron Musco, Mohammad Hajiesmaili

Abstract: We consider a fair resource allocation problem in the no-regret setting against an unrestricted adversary. The objective is to allocate resources equitably among several agents in an online fashion so that the difference of the aggregate $α$-fair utilities of the agents between an optimal static clairvoyant allocation and that of the online policy grows sub-linearly with time. The problem is challenging due to the non-additive nature of the $α$-fairness function. Previously, it was shown that no online policy can exist for this problem with a sublinear standard regret. In this paper, we propose an efficient online resource allocation policy, called Online Proportional Fair (OPF), that achieves $c_α$-approximate sublinear regret with the approximation factor $c_α=(1-α)^{-(1-α)}\leq 1.445,$ for $0\leq α< 1$. The upper bound to the $c_α$-regret for this problem exhibits a surprising phase transition phenomenon. The regret bound changes from a power-law to a constant at the critical exponent $α=\frac{1}{2}.$ As a corollary, our result also resolves an open problem raised by Even-Dar et al. [2009] on designing an efficient no-regret policy for the online job scheduling problem in certain parameter regimes. The proof of our results introduces new algorithmic and analytical techniques, including greedy estimation of the future gradients for non-additive global reward functions and bootstrapping adaptive regret bounds, which may be of independent interest.
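
Note: the constant 1.445 in the abstract above can be checked numerically. The sketch below is not taken from the paper; it only evaluates the stated formula $c_α=(1-α)^{-(1-α)}$ on a grid over $0\leq α<1$. The function name `c_alpha` and the grid resolution are illustrative assumptions. By elementary calculus, $x^{-x}$ with $x=1-α$ peaks at $x=1/e$, giving $e^{1/e}\approx 1.4447$, which is consistent with the quoted bound.

```python
import math

def c_alpha(alpha: float) -> float:
    """Approximation factor c_alpha = (1 - alpha)^(-(1 - alpha)) for 0 <= alpha < 1.

    The formula is quoted from the abstract; the function name is hypothetical.
    """
    x = 1.0 - alpha
    return x ** (-x)

# Coarse grid search over [0, 1); the resolution (1e-4) is an arbitrary choice.
grid = [i / 10000 for i in range(10000)]
best_alpha = max(grid, key=c_alpha)
worst_factor = c_alpha(best_alpha)

# Closed-form maximum for comparison: attained at alpha = 1 - 1/e.
closed_form = math.exp(1.0 / math.e)
```

Here `worst_factor` and `closed_form` should agree to several decimal places, and both stay below 1.445.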

Submitted 11 March, 2023; originally announced March 2023.

arXiv:2302.07446 [pdf, other]

On-Demand Communication for Asynchronous Multi-Agent Bandits

Authors: Yu-Zhen Janice Chen, Lin Yang, Xuchuang Wang, Xutong Liu, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

Abstract: This paper studies a cooperative multi-agent multi-armed stochastic bandit problem where agents operate asynchronously -- agent pull times and rates are unknown, irregular, and heterogeneous -- and face the same instance of a K-armed bandit problem. Agents can share reward information to speed up the learning process at additional communication costs. We propose ODC, an on-demand communication protocol that tailors the communication of each pair of agents based on their empirical pull times. ODC is efficient when the pull times of agents are highly heterogeneous, and its communication complexity depends on the empirical pull times of agents. ODC is a generic protocol that can be integrated into most cooperative bandit algorithms without degrading their performance. We then incorporate ODC into the natural extensions of UCB and AAE algorithms and propose two communication-efficient cooperative algorithms. Our analysis shows that both algorithms are near-optimal in regret.

Submitted 30 August, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

Comments: Accepted by AISTATS 2023

arXiv:2301.06087 [pdf, other]

Near-optimal Online Algorithms for Joint Pricing and Scheduling in EV Charging Networks

Authors: Roozbeh Bostandoost, Bo Sun, Carlee Joe-Wong, Mohammad Hajiesmaili

Abstract: With the rapid acceleration of transportation electrification, public charging stations are becoming vital infrastructure in a smart sustainable city to provide on-demand electric vehicle (EV) charging services. As more consumers seek to utilize public charging services, the pricing and scheduling of such services will become vital, complementary tools to mediate competition for charging resources. However, determining the right prices to charge is difficult due to the online nature of EV arrivals. This paper studies a joint pricing and scheduling problem for the operator of EV charging networks with limited charging capacity and time-varying energy cost. Upon receiving a charging request, the operator offers a price, and the EV decides whether to admit the offer based on its own value and the posted price. The operator then schedules the real-time charging process to satisfy the charging request if the EV admits the offer. We propose an online pricing algorithm that can determine the posted price and EV charging schedule to maximize social welfare, i.e., the total value of EVs minus the energy cost of charging stations. Theoretically, we prove the devised algorithm can achieve the order-optimal competitive ratio under the competitive analysis framework. Practically, we show the empirical performance of our algorithm outperforms other benchmark algorithms in experiments using real EV charging data.

Submitted 26 April, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

arXiv:2301.04747 [pdf, other]

doi 10.1145/3575813.3576870

Equitable Network-Aware Decarbonization of Residential Heating at City Scale

Authors: Adam Lechowicz, Noman Bashir, John Wamburu, Mohammad Hajiesmaili, Prashant Shenoy

Abstract: Residential heating, primarily powered by natural gas, accounts for a significant portion of residential sector energy use and carbon emissions in many parts of the world. Hence, there is a push towards decarbonizing residential heating by transitioning to energy-efficient heat pumps powered by an increasingly greener and less carbon-intensive electric grid. However, such a transition will add load to the electric grid, triggering infrastructure upgrades, and subsequently erode the customer base using the gas distribution network. Utilities want to guide these transition efforts to ensure a phased decommissioning of the gas network and deferred electric grid infrastructure upgrades while achieving carbon reduction goals. To facilitate such a transition, we present a network-aware optimization framework for decarbonizing residential heating at city scale with an objective to maximize carbon reduction under budgetary constraints. Our approach operates on a graph representation of the gas network topology to compute the cost of transitioning and select neighborhoods for transition. We further extend our approach to explicitly incorporate equity and ensure an equitable distribution of benefits across different socioeconomic groups. We apply our framework to a city in the New England region of the U.S., using real-world gas usage, electric usage, and grid infrastructure data. We show that our network-aware strategy achieves 55% higher carbon reductions than prior network-oblivious work under the same budget. Our equity-aware strategy achieves an equitable outcome while preserving the carbon reduction benefits of the network-aware strategy.

Submitted 11 January, 2023; originally announced January 2023.

Comments: Accepted to e-Energy 2023. 12 pages, 10 figures

arXiv:2211.06567 [pdf, ps, other]

Online Search with Predictions: Pareto-optimal Algorithm and its Applications in Energy Markets

Authors: Russell Lee, Bo Sun, Mohammad Hajiesmaili, John C. S. Lui

Abstract: This paper develops learning-augmented algorithms for energy trading in volatile electricity markets. The basic problem is to sell (or buy) $k$ units of energy for the highest revenue (lowest cost) over uncertain time-varying prices, which can be framed as a classic online search problem in the literature of competitive analysis. State-of-the-art algorithms assume no knowledge about future market prices when they make trading decisions in each time slot, and aim to guarantee the performance for the worst-case price sequence. In practice, however, predictions about future prices become commonly available by leveraging machine learning. This paper aims to incorporate machine-learned predictions to design competitive algorithms for online search problems. An important property of our algorithms is that they achieve performances competitive with the offline algorithm in hindsight when the predictions are accurate (i.e., consistency) and also provide worst-case guarantees when the predictions are arbitrarily wrong (i.e., robustness). The proposed algorithms achieve the Pareto-optimal trade-off between consistency and robustness, where no other algorithms for online search can improve on the consistency for a given robustness. Further, we extend the basic online search problem to a more general inventory management setting that can capture storage-assisted energy trading in electricity markets. In empirical evaluations using traces from real-world applications, our learning-augmented algorithms improve the average empirical performance compared to benchmark algorithms, while also providing improved worst-case performance.

Submitted 27 February, 2024; v1 submitted 11 November, 2022; originally announced November 2022.

arXiv:2209.11934 [pdf, ps, other]

The Online Knapsack Problem with Departures

Authors: Bo Sun, Lin Yang, Mohammad Hajiesmaili, Adam Wierman, John C. S. Lui, Don Towsley, Danny H. K. Tsang

Abstract: The online knapsack problem is a classic online resource allocation problem in networking and operations research. Its basic version studies how to pack online arriving items of different sizes and values into a capacity-limited knapsack. In this paper, we study a general version that includes item departures, while also considering multiple knapsacks and multi-dimensional item sizes. We design a threshold-based online algorithm and prove that the algorithm can achieve order-optimal competitive ratios. Beyond worst-case performance guarantees, we also aim to achieve near-optimal average performance under typical instances. Towards this goal, we propose a data-driven online algorithm that learns within a policy-class that guarantees a worst-case performance bound. In trace-driven experiments, we show that our data-driven algorithm outperforms other benchmark algorithms in an application of online knapsack to job scheduling for cloud computing.

Submitted 15 March, 2023; v1 submitted 24 September, 2022; originally announced September 2022.

arXiv:2209.06112 [pdf, other]

CU-Net: Real-Time High-Fidelity Color Upsampling for Point Clouds

Authors: Lingdong Wang, Mohammad Hajiesmaili, Jacob Chakareski, Ramesh K. Sitaraman

Abstract: Point cloud upsampling is essential for high-quality augmented reality, virtual reality, and telepresence applications, due to the capture, processing, and communication limitations of existing technologies. Although geometry upsampling to densify a point cloud's coordinates has been well studied, the upsampling of the color attributes has been largely overlooked. In this paper, we propose CU-Net, the first deep-learning point cloud color upsampling model that enables low latency and high visual fidelity operation. CU-Net achieves linear time and space complexity by leveraging a feature extractor based on sparse convolution and a color prediction module based on neural implicit function. Therefore, CU-Net is theoretically guaranteed to be more efficient than most existing methods with quadratic complexity. Experimental results demonstrate that CU-Net can colorize a photo-realistic point cloud with nearly a million points in real time, while having notably better visual performance than baselines. Besides, CU-Net can adapt to arbitrary upsampling ratios and unseen objects without retraining. Our source code is available at https://github.com/UMass-LIDS/cunet.

Submitted 16 November, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

arXiv:2201.09353 [pdf, other]

Distributed Bandits with Heterogeneous Agents

Authors: Lin Yang, Yu-zhen Janice Chen, Mohammad Hajiesmaili, John CS Lui, Don Towsley

Abstract: This paper tackles a multi-agent bandit setting where $M$ agents cooperate together to solve the same instance of a $K$-armed stochastic bandit problem. The agents are heterogeneous: each agent has limited access to a local subset of arms and the agents are asynchronous with different gaps between decision-making rounds. The goal for each agent is to find its optimal local arm, and agents can cooperate by sharing their observations with others. While cooperation between agents improves the performance of learning, it comes with an additional complexity of communication between agents. For this heterogeneous multi-agent setting, we propose two learning algorithms, \ucbo and \AAE. We prove that both algorithms achieve order-optimal regret, which is $O\left(\sum_{i:\tilde{Δ}_i>0} \log T/\tilde{Δ}_i\right)$, where $\tilde{Δ}_i$ is the minimum suboptimality gap between the reward mean of arm $i$ and any local optimal arm. In addition, through a careful selection of the valuable information for cooperation, \AAE achieves a low communication complexity of $O(\log T)$. Last, numerical experiments verify the efficiency of both algorithms.

Submitted 16 February, 2022; v1 submitted 23 January, 2022; originally announced January 2022.

arXiv:2109.01556 [pdf, other]

Pareto-Optimal Learning-Augmented Algorithms for Online Conversion Problems

Authors: Bo Sun, Russell Lee, Mohammad Hajiesmaili, Adam Wierman, Danny H. K. Tsang

Abstract: This paper leverages machine-learned predictions to design competitive algorithms for online conversion problems with the goal of improving the competitive ratio when predictions are accurate (i.e., consistency), while also guaranteeing a worst-case competitive ratio regardless of the prediction quality (i.e., robustness). We unify the algorithmic design of both integral and fractional conversion problems, which are also known as the 1-max-search and one-way trading problems, into a class of online threshold-based algorithms (OTA). By incorporating predictions into the design of OTA, we achieve the Pareto-optimal trade-off of consistency and robustness, i.e., no online algorithm can achieve a better consistency guarantee for a given robustness guarantee. We demonstrate the performance of OTA using numerical experiments on Bitcoin conversion.
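
Note: for readers unfamiliar with 1-max-search, the following is a minimal sketch of the classic prediction-free reservation-price policy from the competitive-analysis literature, not the learning-augmented OTA proposed in the paper above. It assumes prices are known to lie in an interval `[low, high]` and that the final price must be accepted if no earlier one was; the function and parameter names are hypothetical.

```python
import math

def one_max_search(prices, low, high):
    """Classic reservation-price policy for 1-max-search (no predictions).

    Accept the first price that reaches sqrt(low * high); if none does,
    the seller is forced to take the last price in the sequence.
    Assumes 0 < low <= price <= high for every price.
    """
    threshold = math.sqrt(low * high)
    for price in prices:
        if price >= threshold:
            return price
    return prices[-1]

# Usage: with bounds [1, 16] the threshold is 4.
accepted_early = one_max_search([2, 3, 5, 8], low=1, high=16)
forced_at_end = one_max_search([1, 2, 3], low=1, high=16)
```

With these bounds the first call stops at the first price of at least 4, while the second never meets the threshold and falls back to the final price. The fixed threshold is what a learning-augmented design would adjust based on a predicted maximum price.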

Submitted 3 September, 2021; originally announced September 2021.

arXiv:2106.08872 [pdf, other]

Enabling Sustainable Clouds: The Case for Virtualizing the Energy System

Authors: Noman Bashir, Tian Guo, Mohammad Hajiesmaili, David Irwin, Prashant Shenoy, Ramesh Sitaraman, Abel Souza, Adam Wierman

Abstract: Cloud platforms' growing energy demand and carbon emissions are raising concern about their environmental sustainability. The current approach to enabling sustainable clouds focuses on improving energy-efficiency and purchasing carbon offsets. These approaches have limits: many cloud data centers already operate near peak efficiency, and carbon offsets cannot scale to near zero carbon where there is little carbon left to offset. Instead, enabling sustainable clouds will require applications to adapt to when and where unreliable low-carbon energy is available. Applications cannot do this today because their energy use and carbon emissions are not visible to them, as the energy system provides the rigid abstraction of a continuous, reliable energy supply. This vision paper instead advocates for a "carbon first" approach to cloud design that elevates carbon-efficiency to a first-class metric. To do so, we argue that cloud platforms should virtualize the energy system by exposing visibility into, and software-defined control of, it to applications, enabling them to define their own abstractions for managing energy and carbon emissions based on their own requirements.

Submitted 16 June, 2021; originally announced June 2021.

arXiv:2012.05361 [pdf, ps, other]

Data-driven Competitive Algorithms for Online Knapsack and Set Cover

Authors: Ali Zeynali, Bo Sun, Mohammad Hajiesmaili, Adam Wierman

Abstract: The design of online algorithms has tended to focus on algorithms with worst-case guarantees, e.g., bounds on the competitive ratio. However, it is well-known that such algorithms are often overly pessimistic, performing sub-optimally on non-worst-case inputs. In this paper, we develop an approach for data-driven design of online algorithms that maintain near-optimal worst-case guarantees while also performing learning in order to perform well for typical inputs. Our approach is to identify policy classes that admit global worst-case guarantees, and then perform learning using historical data within the policy classes. We demonstrate the approach in the context of two classical problems, online knapsack and online set cover, proving competitive bounds for rich policy classes in each case. Additionally, we illustrate the practical implications via a case study on electric vehicle charging.

Submitted 9 December, 2020; originally announced December 2020.

arXiv:2010.00412 [pdf, other]

Competitive Algorithms for the Online Multiple Knapsack Problem with Application to Electric Vehicle Charging

Authors: Bo Sun, Ali Zeynali, Tongxin Li, Mohammad Hajiesmaili, Adam Wierman, Danny H. K. Tsang

Abstract: We introduce and study a general version of the fractional online knapsack problem with multiple knapsacks, heterogeneous constraints on which items can be assigned to which knapsack, and rate-limiting constraints on the assignment of items to knapsacks. This problem generalizes variations of the knapsack problem and of the one-way trading problem that have previously been treated separately, and additionally finds application to the real-time control of electric vehicle (EV) charging. We introduce a new algorithm that achieves a competitive ratio within an additive factor of one of the best achievable competitive ratios for the general problem and matches or improves upon the best-known competitive ratio for special cases in the knapsack and one-way trading literatures. Moreover, our analysis provides a novel approach to online algorithm design based on an instance-dependent primal-dual analysis that connects the identification of worst-case instances to the design of algorithms. Finally, we illustrate the proposed algorithm via trace-based experiments of EV charging.

Submitted 17 October, 2020; v1 submitted 1 October, 2020; originally announced October 2020.

arXiv:2005.12234 [pdf, other]

doi 10.1145/3396851.3397755

Emission-aware Energy Storage Scheduling for a Greener Grid

Authors: Rishikesh Jha, Stephen Lee, Srinivasan Iyengar, Mohammad H. Hajiesmaili, David Irwin, Prashant Shenoy

Abstract: Reducing our reliance on carbon-intensive energy sources is vital for reducing the carbon footprint of the electric grid. Although the grid is seeing increasing deployments of clean, renewable sources of energy, a significant portion of the grid demand is still met using traditional carbon-intensive energy sources. In this paper, we study the problem of using energy storage deployed in the grid to reduce the grid's carbon emissions. While energy storage has previously been used for grid optimizations such as peak shaving and smoothing intermittent sources, our insight is to use distributed storage to enable utilities to reduce their reliance on their less efficient and most carbon-intensive power plants and thereby reduce their overall emission footprint. We formulate the problem of emission-aware scheduling of distributed energy storage as an optimization problem, and use a robust optimization approach that is well-suited for handling the uncertainty in load predictions, especially in the presence of intermittent renewables such as solar and wind. We evaluate our approach using a state of the art neural network load forecasting technique and real load traces from a distribution grid with 1,341 homes. Our results show a reduction of >0.5 million kg in annual carbon emissions -- equivalent to a drop of 23.3% in our electric grid emissions.

Submitted 25 May, 2020; originally announced May 2020.

Comments: 11 pages, 7 figures. This paper will appear in the Proceedings of the ACM International Conference on Future Energy Systems (e-Energy 20) June 2020, Australia

arXiv:2004.04302 [pdf, other]

Hedge Your Bets: Optimizing Long-term Cloud Costs by Mixing VM Purchasing Options

Authors: Pradeep Ambati, Noman Bashir, David Irwin, Mohammad Hajiesmaili, Prashant Shenoy

Abstract: Cloud platforms offer the same VMs under many purchasing options that specify different costs and time commitments, such as on-demand, reserved, sustained-use, scheduled reserve, transient, and spot block. In general, the stronger the commitment, i.e., longer and less flexible, the lower the price. However, longer and less flexible time commitments can increase cloud costs for users if future workloads cannot utilize the VMs they committed to buying. Large cloud customers often find it challenging to choose the right mix of purchasing options to reduce their long-term costs, while retaining the ability to adjust capacity up and down in response to workload variations. To address the problem, we design policies to optimize long-term cloud costs by selecting a mix of VM purchasing options based on short- and long-term expectations of workload utilization. We consider a batch trace spanning 4 years from a large shared cluster for a major state University system that includes 14k cores and 60 million job submissions, and evaluate how these jobs could be judiciously executed using cloud servers using our approach. Our results show that our policies incur a cost within 41% of an optimistic optimal offline approach, and 50% less than solely using on-demand VMs.

Submitted 8 April, 2020; originally announced April 2020.

Comments: 11 pages, 10 figures. This paper will appear in the Proceedings of the IEEE International Conference on Cloud Engineering, April 2020

arXiv:1911.07972 [pdf, other]

Learning-Assisted Competitive Algorithms for Peak-Aware Energy Scheduling

Authors: Russell Lee, Mohammad H. Hajiesmaili, Jian Li

Abstract: In this paper, we study the peak-aware energy scheduling problem using the competitive framework with machine learning prediction. With the uncertainty of energy demand as the fundamental challenge, the goal is to schedule the energy output of local generation units such that the electricity bill is minimized. While this problem has been tackled using classic competitive design with worst-case guarantees, the goal of this paper is to develop learning-assisted competitive algorithms to improve the performance in a provable manner. We develop two deterministic and randomized algorithms that are provably robust against poor prediction performance, yet achieve optimal performance as the prediction error goes to zero. Extensive experiments using real data traces verify our theoretical observations and show a 15.13% performance improvement over pure online algorithms.

Submitted 18 November, 2019; originally announced November 2019.

arXiv:1904.13387 [pdf, other]

doi 10.1109/CDC40024.2019.9142286

Risk-Averse Explore-Then-Commit Algorithms for Finite-Time Bandits

Authors: Ali Yekkehkhany, Ebrahim Arian, Mohammad Hajiesmaili, Rakesh Nagi

Abstract: In this paper, we study multi-armed bandit problems in the explore-then-commit setting. In our proposed explore-then-commit setting, the goal is to identify the best arm after a pure experimentation (exploration) phase and exploit it once or for a given finite number of times. We identify that although the arm with the highest expected reward is the most desirable objective for infinite exploitations, it is not necessarily the one that is most probable to have the highest reward in single or finite-time exploitations. Alternatively, we advocate the idea of risk-aversion where the objective is to compete against the arm with the best risk-return trade-off. Then, we propose two algorithms whose objectives are to select the arm that is most probable to reward the most. Using a new notion of finite-time exploitation regret, we find an upper bound for the minimum number of experiments before commitment, to guarantee an upper bound for the regret. As compared to existing risk-averse bandit algorithms, our algorithms do not rely on hyper-parameters, resulting in a more robust behavior in practice, which is verified by the numerical evaluation.

Submitted 11 September, 2019; v1 submitted 30 April, 2019; originally announced April 2019.

Report number: https://ieeexplore.ieee.org/document/9142286

Journal ref: 2019 IEEE 58th Conference on Decision and Control (CDC)

arXiv:1901.04372 [pdf, other]

Online Inventory Management with Application to Energy Procurement in Data Centers

Authors: Lin Yang, Mohammad H. Hajiesmaili, Ramesh Sitaraman, Enrique Mallada, Wing S. Wong, Adam Wierman

Abstract: Motivated by the application of energy storage management in electricity markets, this paper considers the problem of online linear programming with inventory management constraints. Specifically, a decision maker should satisfy some units of an asset as her demand, either from a market with time-varying price or from her own inventory. The decision maker is presented a price in a slot-by-slot manner, and must immediately decide the purchased amount with the current price to cover the demand or to store in inventory for covering the future demand. The inventory has a limited capacity and its critical role is to buy and store assets at a low price and use the stored assets to cover the demand at a high price. The ultimate goal of the decision maker is to cover the demands while minimizing the cost of buying assets from the market. We propose BatMan, an online algorithm for simple inventory models, and BatManRate, an extended version for the case with rate constraints. Both BatMan and BatManRate achieve optimal competitive ratios, meaning that no other online algorithm can achieve a better theoretical guarantee. To illustrate the results, we use the proposed algorithms to design and evaluate energy procurement and storage management strategies for data centers with a portfolio of energy sources including the electric grid, local renewable generation, and energy storage systems.

Submitted 14 January, 2019; originally announced January 2019.
