-
Deep Reinforcement Learning for Uplink Scheduling in NOMA-URLLC Networks
Authors:
Benoît-Marie Robaglia,
Marceau Coupechoux,
Dimitrios Tsilimantos
Abstract:
This article addresses the problem of Ultra Reliable Low Latency Communications (URLLC) in wireless networks, a framework with particularly stringent constraints imposed by many Internet of Things (IoT) applications from diverse sectors. We propose a novel Deep Reinforcement Learning (DRL) scheduling algorithm, named NOMA-PPO, to solve the Non-Orthogonal Multiple Access (NOMA) uplink URLLC scheduling problem involving strict deadlines. The challenge of addressing uplink URLLC requirements in NOMA systems is related to the combinatorial complexity of the action space due to the possibility of scheduling multiple devices, and to the partial observability constraint that we impose on our algorithm in order to meet the IoT communication constraints and be scalable. Our approach involves 1) formulating the NOMA-URLLC problem as a Partially Observable Markov Decision Process (POMDP) and introducing an agent state, serving as a sufficient statistic of past observations and actions, enabling a transformation of the POMDP into a Markov Decision Process (MDP); 2) adapting the Proximal Policy Optimization (PPO) algorithm to handle the combinatorial action space; 3) incorporating prior knowledge into the learning agent with the introduction of a Bayesian policy. Numerical results reveal that our approach not only outperforms traditional multiple access protocols and DRL benchmarks on 3GPP scenarios, but also proves to be robust under various channel and traffic configurations, efficiently exploiting inherent time correlations.
Submitted 28 August, 2023;
originally announced August 2023.
-
Distributed no-regret edge resource allocation with limited communication
Authors:
Saad Kriouile,
Dimitrios Tsilimantos,
Theodoros Giannakas
Abstract:
To accommodate low latency and computation-intensive services, such as the Internet-of-Things (IoT), 5G networks are expected to have cloud and edge computing capabilities. To this end, we consider a generic network setup where devices, performing analytics-related tasks, can partially process a task and offload its remainder to base stations, which can then reroute it to cloud and/or to edge servers. To account for the potentially unpredictable traffic demands and edge network dynamics, we formulate the resource allocation as an online convex optimization problem with service violation constraints and allow limited communication between neighboring nodes. To address the problem, we propose an online distributed (across the nodes) primal-dual algorithm and prove that it achieves sublinear regret and violation; in fact, the achieved bound is of the same order as the best known centralized alternative. Our results are further supported using the publicly available Milano dataset.
Submitted 11 April, 2023;
originally announced April 2023.
-
Multi-Agent Deep Stochastic Policy Gradient for Event Based Dynamic Spectrum Access
Authors:
Rahif Kassab,
Apostolos Destounis,
Dimitrios Tsilimantos,
Merouane Debbah
Abstract:
We consider the dynamic spectrum access (DSA) problem where $K$ Internet of Things (IoT) devices compete for $T$ time slots constituting a frame. Devices collectively monitor $M$ events where each event could be monitored by multiple IoT devices. Each device, when at least one of its monitored events is active, picks an event and a time slot to transmit the corresponding active event information. In the case where multiple devices select the same time slot, a collision occurs and all transmitted packets are discarded. In order to capture the fact that devices observing the same event may transmit redundant information, we consider the maximization of the average sum event rate of the system instead of the classical frame throughput. We propose a multi-agent reinforcement learning approach based on a stochastic version of Multi-Agent Deep Deterministic Policy Gradient (MADDPG) to access the frame by exploiting device-level correlation and time correlation of events. Through numerical simulations, we show that the proposed approach is able to efficiently exploit the aforementioned correlations and outperforms benchmark solutions such as standard multiple access protocols and the widely used Independent Deep Q-Network (IDQN) algorithm.
Submitted 6 April, 2020;
originally announced April 2020.
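The collision rule in the abstract above can be illustrated with a minimal sketch. It assumes one plausible reading of the metric, namely that a frame is scored by the number of distinct events delivered in collision-free slots, so redundant copies of the same event count once. The function name `delivered_events` and the example frame are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def delivered_events(choices):
    """Return the set of distinct events received in one frame.

    choices: one entry per device, either None (device stays idle) or an
    (event, slot) tuple. A slot picked by more than one device is a
    collision and every packet in it is discarded.
    """
    events_by_slot = defaultdict(list)
    for choice in choices:
        if choice is not None:
            event, slot = choice
            events_by_slot[slot].append(event)
    # Keep only slots with exactly one transmission; a set removes duplicates.
    return {events[0] for events in events_by_slot.values() if len(events) == 1}


# Hypothetical frame: two devices report e1 in separate slots (redundant),
# two devices collide in slot 2, and one device is idle.
frame = [("e1", 0), ("e1", 1), ("e2", 2), ("e3", 2), None]
received = delivered_events(frame)
```

Under this reading, two slots carry successful packets but only one distinct event is delivered, which is why a throughput count and an event count can disagree.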
-
Optimizing Adaptive Video Streaming in Mobile Networks via Online Learning
Authors:
Theodoros Karagkioules,
Georgios S. Paschos,
Nikolaos Liakopoulos,
Attilio Fiandrotti,
Dimitrios Tsilimantos,
Marco Cagnazzo
Abstract:
In this paper, we propose a novel algorithm for video rate adaptation in HTTP Adaptive Streaming (HAS), based on online learning. The proposed algorithm, named Learn2Adapt (L2A), is shown to provide a robust rate adaptation strategy which, unlike most of the state-of-the-art techniques, does not require parameter tuning, channel model assumptions or application-specific adjustments. These properties make it very suitable for mobile users, who typically experience fast variations in channel characteristics. Simulations show that L2A improves on the overall Quality of Experience (QoE) and in particular the average streaming rate, a result obtained independently of the channel and application scenarios.
Submitted 7 November, 2019; v1 submitted 28 May, 2019;
originally announced May 2019.
-
Learn2MAC: Online Learning Multiple Access for URLLC Applications
Authors:
Apostolos Destounis,
Dimitrios Tsilimantos,
Mérouane Debbah,
Georgios S. Paschos
Abstract:
This paper addresses a fundamental limitation of previous random access protocols: their lack of latency performance guarantees. We consider $K$ IoT transmitters competing for uplink resources and we design a fully distributed protocol for deciding how they access the medium. Specifically, each transmitter restricts decisions to a locally-generated dictionary of transmission patterns. At the beginning of a frame, pattern $i$ is chosen with probability $p^i$, and an online exponentiated gradient algorithm is used to adjust this probability distribution. The performance of the proposed scheme is showcased in simulations, where it is compared with a baseline random access protocol. Simulation results show that (a) the proposed scheme achieves good latent throughput performance and low energy consumption, while (b) it outperforms random transmissions by a large margin.
Submitted 1 April, 2019;
originally announced April 2019.
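The pattern-selection step in the abstract above can be sketched with a generic exponentiated gradient (multiplicative weights) update. This is an illustration under stated assumptions, not the Learn2MAC protocol itself: the loss values, the learning rate, and the full-information feedback (a loss observed for every pattern) are all hypothetical, and the function names are chosen for this example.

```python
import math
import random


def exponentiated_gradient_step(probs, losses, learning_rate):
    """Scale each p^i by exp(-learning_rate * loss_i), then renormalize."""
    weights = [p * math.exp(-learning_rate * loss) for p, loss in zip(probs, losses)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_pattern(probs, rng):
    """Draw a pattern index i with probability probs[i]."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical setup: a dictionary of 3 patterns, starting from a uniform
# distribution. Pattern 1 is assumed to incur the lowest loss in every frame.
probs = [1.0 / 3] * 3
rng = random.Random(0)
for _ in range(50):
    chosen = sample_pattern(probs, rng)  # pattern used in this frame
    losses = [0.9, 0.1, 0.5]             # assumed per-pattern feedback
    probs = exponentiated_gradient_step(probs, losses, learning_rate=0.5)
```

Because the update is multiplicative, the distribution stays strictly positive and sums to one after every step, while probability mass shifts toward the patterns with the lowest accumulated loss.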
-
Classifying flows and buffer state for YouTube's HTTP adaptive streaming service in mobile networks
Authors:
Dimitrios Tsilimantos,
Theodoros Karagkioules,
Stefan Valentin
Abstract:
Accurate cross-layer information is very useful to optimize mobile networks for specific applications. However, providing application-layer information to lower protocol layers has become very difficult due to the wide adoption of end-to-end encryption and due to the absence of cross-layer signaling standards. As an alternative, this paper presents a traffic profiling solution to passively estimate parameters of HTTP Adaptive Streaming (HAS) applications at the lower layers. By observing IP packet arrivals, our machine learning system identifies video flows and detects the state of an HAS client's play-back buffer in real time. Our experiments with YouTube's mobile client show that Random Forests achieve very high accuracy even with a strong variation of link quality. Since this high performance is achieved at IP level with a small, generic feature set, our approach requires no Deep Packet Inspection (DPI), comes at low complexity, and does not interfere with end-to-end encryption. Traffic profiling is, thus, a powerful new tool for monitoring and managing even encrypted HAS traffic in mobile networks.
Submitted 29 May, 2018; v1 submitted 1 March, 2018;
originally announced March 2018.
-
Traffic Profiling for Mobile Video Streaming
Authors:
Dimitrios Tsilimantos,
Theodoros Karagkioules,
Amaya Nogales-Gómez,
Stefan Valentin
Abstract:
This paper describes a novel system that provides key parameters of HTTP Adaptive Streaming (HAS) sessions to the lower layers of the protocol stack. A non-intrusive traffic profiling solution is proposed that observes packet flows at the transmit queue of base stations, edge-routers, or gateways. By analyzing IP flows in real time, the presented scheme identifies different phases of an HAS session and estimates important application-layer parameters, such as play-back buffer state and video encoding rate. The introduced estimators only use IP-layer information, do not require standardization and work even with traffic that is encrypted via Transport Layer Security (TLS). Experimental results for a popular video streaming service clearly verify the high accuracy of the proposed solution. Traffic profiling, thus, provides a valuable alternative to cross-layer signaling and Deep Packet Inspection (DPI) in order to perform efficient network optimization for video streaming.
Submitted 24 May, 2017;
originally announced May 2017.
-
A Comparative Case Study of HTTP Adaptive Streaming Algorithms in Mobile Networks
Authors:
Theodoros Karagkioules,
Dimitrios Tsilimantos,
Cyril Concolato,
Stefan Valentin
Abstract:
HTTP Adaptive Streaming (HAS) techniques are now the dominant solution for video delivery in mobile networks. Over the past few years, several HAS algorithms have been introduced in order to improve user quality-of-experience (QoE) by bit-rate adaptation. They differ mainly in the required input information, ranging from network characteristics to application-layer parameters such as the playback buffer. Interestingly, despite the recent outburst in scientific papers on the topic, a comprehensive comparative study of the main algorithm classes is still missing. In this paper we provide such a comparison by evaluating the performance of the state-of-the-art HAS algorithms per class, based on data from field measurements. We provide a systematic study of the main QoE factors and the impact of the target buffer level. We conclude that this target buffer level is a critical classifier for the studied HAS algorithms. While buffer-based algorithms show superior QoE in most of the cases, their performance may differ at the low target buffer levels of live streaming services. Overall, we believe that our findings provide valuable insight for the design and choice of HAS algorithms according to network conditions and service requirements.
Submitted 4 May, 2017;
originally announced May 2017.
-
Anticipatory Radio Resource Management for Mobile Video Streaming with Linear Programming
Authors:
Dimitrios Tsilimantos,
Amaya Nogales-Gómez,
Stefan Valentin
Abstract:
In anticipatory networking, channel prediction is used to improve communication performance. This paper describes a new approach for allocating resources to video streaming traffic while accounting for quality of service. The proposed method is based on integrating a model of the user's local play-out buffer into the radio access network. The linearity of this model allows us to formulate a Linear Programming problem that optimizes the trade-off between the allocated resources and the stalling time of the media stream. Our simulation results demonstrate the full power of anticipatory optimization in a simple, yet representative, scenario. Compared to instantaneous adaptation, our anticipatory solution shows impressive gains in spectral efficiency and stalling duration at feasible computation time while being robust against prediction errors.
Submitted 8 March, 2016;
originally announced March 2016.
-
Spectral and Energy Efficiency Trade-Offs in Cellular Networks
Authors:
Dimitrios Tsilimantos,
Jean-Marie Gorce,
Katia Jaffrès-Runser,
H. Vincent Poor
Abstract:
This paper presents a simple and effective method to study the spectral and energy efficiency (SE-EE) trade-off in cellular networks, an issue that has attracted significant recent interest in the wireless community. The proposed theoretical framework is based on an optimal radio resource allocation of transmit power and bandwidth for the downlink direction, applicable for an orthogonal cellular network. The analysis is initially focused on a single cell scenario, for which in addition to the solution of the main SE-EE optimization problem, it is proved that a traffic repartition scheme can also be adopted as a way to simplify this approach. By exploiting this interesting result along with properties of stochastic geometry, this work is extended to a more challenging multi-cell environment, where interference is shown to play an essential role and for this reason several interference reduction techniques are investigated. Special attention is also given to the case of low signal to noise ratio (SNR) and a way to evaluate the upper bound on EE in this regime is provided. This methodology leads to tractable analytical results under certain common channel properties, and thus allows the study of various models without the need for demanding system-level simulations.
Submitted 8 November, 2015; v1 submitted 28 November, 2013;
originally announced November 2013.
-
The Coalitional Switch off Game of Service Providers
Authors:
Cengis Hasan,
Eitan Altman,
Jean-Marie Gorce,
Dimitrios Tsilimantos,
Manjesh K. Hanawal
Abstract:
This paper studies a significant problem in green networking called switching off base stations in the case of cooperating service providers by means of stochastic geometric and coalitional game tools. The coalitional game herein considered is played by service providers who cooperate in switching off base stations. When they cooperate, any mobile is associated to the nearest BS of any service provider. Given a Poisson point process deployment model of nodes over an area and switching off base stations with some probability, it is proved that the distribution of signal to interference plus noise ratio remains unchanged while the transmission power is increased up to preserving the quality of service. The coalitional game behavior of a typical player is said to be \emph{hedonic} if the gain of any player depends solely on the members of the coalition to which the player belongs; thus, the coalitions form as a result of the preferences of the players over their possible coalitions' set. We also introduce a novel concept called the Nash-stable core, containing those gain allocation methods that result in Nash-stable partitions. In this way, we always guarantee Nash stability. We study the non-emptiness of the Nash-stable core. Assuming the choice of a coalition is performed by only one player at a point in time, we prove that the Nash-stable core is non-empty when each player chooses its coalition in its turn and gains zero utility if it chooses a coalition it has visited before.
Submitted 30 July, 2013; v1 submitted 24 July, 2012;
originally announced July 2012.