
Showing 1–7 of 7 results for author: Akbarzadeh, N

Searching in archive cs.
  1. arXiv:2411.09804  [pdf, other]

    cs.LG

    Fair Resource Allocation in Weakly Coupled Markov Decision Processes

    Authors: Xiaohui Tu, Yossiri Adulyasak, Nima Akbarzadeh, Erick Delage

    Abstract: We consider fair resource allocation in sequential decision-making environments modeled as weakly coupled Markov decision processes, where resource constraints couple the action spaces of $N$ sub-Markov decision processes (sub-MDPs) that would otherwise operate independently. We adopt a fairness definition using the generalized Gini function instead of the traditional utilitarian (total-sum) objec…

    Submitted 27 April, 2025; v1 submitted 14 November, 2024; originally announced November 2024.

  2. arXiv:2410.23029  [pdf, other]

    cs.LG eess.SY

    Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem

    Authors: Nima Akbarzadeh, Yossiri Adulyasak, Erick Delage

    Abstract: In restless multi-arm bandits, a central agent is tasked with optimally distributing limited resources across several bandits (arms), with each arm being a Markov decision process. In this work, we generalize the traditional restless multi-arm bandit problem with a risk-neutral objective by incorporating risk-awareness. We establish indexability conditions for the case of a risk-aware objective an…

    Submitted 10 April, 2025; v1 submitted 30 October, 2024; originally announced October 2024.

  3. arXiv:2306.05991  [pdf, other]

    cs.LG

    Approximate information state based convergence analysis of recurrent Q-learning

    Authors: Erfan Seyedsalehi, Nima Akbarzadeh, Amit Sinha, Aditya Mahajan

    Abstract: In spite of the large literature on reinforcement learning (RL) algorithms for partially observable Markov decision processes (POMDPs), a complete theoretical understanding is still lacking. In a partially observable setting, the history of data available to the agent increases over time so most practical algorithms either truncate the history to a finite window or compress it using a recurrent ne…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 25 pages, 6 figures

  4. arXiv:2209.10914  [pdf, other]

    cs.AR

    Morpheus: Extending the Last Level Cache Capacity in GPU Systems Using Idle GPU Core Resources

    Authors: Sina Darabi, Mohammad Sadrosadati, Joël Lindegger, Negar Akbarzadeh, Mohammad Hosseini, Jisung Park, Juan Gómez-Luna, Hamid Sarbazi-Azad, Onur Mutlu

    Abstract: Graphics Processing Units (GPUs) are widely-used accelerators for data-parallel applications. In many GPU applications, GPU memory bandwidth bottlenecks performance, causing underutilization of GPU cores. Hence, disabling many cores does not affect the performance of memory-bound workloads. While simply power-gating unused GPU cores would save energy, prior works attempt to better utilize GPU core…

    Submitted 6 April, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

    Comments: To appear in 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), 2022

  5. arXiv:2202.03463  [pdf, other]

    cs.LG eess.SY

    On learning Whittle index policy for restless bandits with scalable regret

    Authors: Nima Akbarzadeh, Aditya Mahajan

    Abstract: Reinforcement learning is an attractive approach to learn good resource allocation and scheduling policies based on data when the system model is unknown. However, the cumulative regret of most RL algorithms scales as $\tilde O(\mathsf{S} \sqrt{\mathsf{A} T})$, where $\mathsf{S}$ is the size of the state space, $\mathsf{A}$ is the size of the action space, $T$ is the horizon, and the…

    Submitted 26 April, 2023; v1 submitted 7 February, 2022; originally announced February 2022.

  6. arXiv:2111.06437  [pdf, other]

    cs.RO eess.SY

    Scalable Operator Allocation for Multi-Robot Assistance: A Restless Bandit Approach

    Authors: Abhinav Dahiya, Nima Akbarzadeh, Aditya Mahajan, Stephen L. Smith

    Abstract: In this paper, we consider the problem of allocating human operators in a system with multiple semi-autonomous robots. Each robot is required to perform an independent sequence of tasks, subjected to a chance of failing and getting stuck in a fault state at every task. If and when required, a human operator can assist or teleoperate a robot. Conventional MDP techniques used to solve such problems…

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: 11 pages + 4 page Appendix, 7 Figures

  7. arXiv:1605.06651  [pdf, other]

    cs.LG

    Gambler's Ruin Bandit Problem

    Authors: Nima Akbarzadeh, Cem Tekin

    Abstract: In this paper, we propose a new multi-armed bandit problem called the Gambler's Ruin Bandit Problem (GRBP). In the GRBP, the learner proceeds in a sequence of rounds, where each round is a Markov Decision Process (MDP) with two actions (arms): a continuation action that moves the learner randomly over the state space around the current state; and a terminal action that moves the learner directly i…

    Submitted 28 September, 2016; v1 submitted 21 May, 2016; originally announced May 2016.