-
Deep Reinforcement Learning for Distributed and Uncoordinated Cognitive Radios Resource Allocation
Authors:
Ankita Tondwalkar,
Andres Kwasinski
Abstract:
This paper presents a novel deep reinforcement learning-based resource allocation technique for the multi-agent environment presented by a cognitive radio network where the interactions of the agents during learning may lead to a non-stationary environment. The resource allocation technique presented in this work is distributed, not requiring coordination with other agents. It is shown by considering aspects specific to deep reinforcement learning that the presented algorithm converges in an arbitrarily long time to equilibrium policies in a non-stationary multi-agent environment that results from the uncoordinated dynamic interaction between radios through the shared wireless environment. Simulation results show that the presented technique achieves a faster learning performance compared to an equivalent table-based Q-learning algorithm and is able to find the optimal policy in 99% of cases for a sufficiently long learning time. In addition, simulations show that our deep Q-learning (DQL) approach requires less than half the number of learning steps to achieve the same performance as an equivalent table-based implementation. Moreover, it is shown that the use of a standard single-agent deep reinforcement learning approach may not achieve convergence when used in an uncoordinated interacting multi-radio scenario.
Submitted 27 May, 2022;
originally announced May 2022.
-
Deep Reinforcement Learning for Distributed Uncoordinated Cognitive Radios Resource Allocation
Authors:
Ankita Tondwalkar,
Dr Andres Kwasinski
Abstract:
This paper presents a novel deep reinforcement learning-based resource allocation technique for the multi-agent environment presented by a cognitive radio network that coexists through underlay dynamic spectrum access (DSA) with a primary network. The resource allocation technique presented in this work is distributed, not requiring coordination with other agents. The presented algorithm is the first deep reinforcement learning technique for which convergence to equilibrium policies can be shown in the non-stationary multi-agent environment that results from the uncoordinated dynamic interaction between radios through the shared wireless environment. Moreover, simulation results show that in a finite learning time the presented technique is able to find policies that yield performance within 3% of an exhaustive search solution, finding the optimal policy in nearly 70% of cases. In addition, it is shown that standard single-agent deep reinforcement learning may not achieve convergence when used in a non-coordinated, coupled multi-radio scenario.
Submitted 6 March, 2020; v1 submitted 29 October, 2019;
originally announced November 2019.
-
Neural Network Cognitive Engine for Autonomous and Distributed Underlay Dynamic Spectrum Access
Authors:
Fatemeh Shah-Mohammadi,
Andres Kwasinski
Abstract:
Two key challenges in underlay dynamic spectrum access (DSA) are how to establish an interference limit from the primary network (PN) and how cognitive radios (CRs) in the secondary network (SN) become aware of the interference they create on the PN, especially when there is no exchange of information between the two networks. These challenges are addressed in this paper by presenting a fully autonomous and distributed underlay DSA scheme where each CR operates based on predicting its transmission effect on the PN. The scheme is based on a cognitive engine with an artificial neural network that predicts, without exchanging information between the networks, the adaptive modulation and coding (AMC) configuration for the primary link nearest to a transmitting CR. By managing the effect of the SN on the PN, the presented technique maintains the relative average throughput change in the PN within a prescribed maximum value, while also finding transmit settings for the CRs that result in throughput as large as allowed by the PN interference limit. Simulation results show that the ability of the cognitive engine to estimate the effect of a CR transmission on the full AMC mode leads to much finer underlay transmit power control. This ability also provides more transmission opportunities for the CRs, compared to a scheme that can only estimate the modulation scheme used at the PN link.
Submitted 4 October, 2020; v1 submitted 28 June, 2018;
originally announced June 2018.
-
Learning for Robust Routing Based on Stochastic Game in Cognitive Radio Networks
Authors:
Wenbo Wang,
Andres Kwasinski,
Dusit Niyato,
Zhu Han
Abstract:
This paper studies the problem of robust spectrum-aware routing in a multi-hop, multi-channel Cognitive Radio Network (CRN) in the presence of malicious nodes in the secondary network. The proposed routing scheme models the interaction among the Secondary Users (SUs) as a stochastic game. By allowing the backward propagation of the path utility information from the next-hop nodes, the stochastic routing game is decomposed into a series of stage games. The best-response policies are learned through the process of smooth fictitious play, which is guaranteed to converge without flooding of the information about the local utilities and behaviors. To address the problem of mixed insider attacks with both routing-toward-primary and sink-hole attacks, the trustworthiness of the neighbor nodes is evaluated through a multi-armed bandit process for each SU. The simulation results show that the proposed routing algorithm is able to enforce the cooperation of the malicious SUs and reduce the negative impact of the attacks on the routing selection process.
Submitted 29 March, 2016;
originally announced March 2016.
-
A Survey on Applications of Model-Free Strategy Learning in Cognitive Wireless Networks
Authors:
Wenbo Wang,
Andres Kwasinski,
Dusit Niyato,
Zhu Han
Abstract:
Model-free learning has been considered as an efficient tool for designing control mechanisms when the model of the system environment or the interaction between the decision-making entities is not available as a priori knowledge. With model-free learning, the decision-making entities adapt their behaviors based on the reinforcement from their interaction with the environment and are able to (implicitly) build the understanding of the system through trial-and-error mechanisms. Such characteristics of model-free learning are highly in accordance with the requirement of cognition-based intelligence for devices in cognitive wireless networks. Recently, model-free learning has been considered as one key implementation approach to adaptive, self-organized network control in cognitive wireless networks. In this paper, we provide a comprehensive survey on the applications of the state-of-the-art model-free learning mechanisms in cognitive wireless networks. According to the system models that those applications are based on, a systematic overview of the learning algorithms in the domains of single-agent systems, multi-agent systems and multi-player games is provided. Furthermore, the applications of model-free learning to various problems in cognitive wireless networks are discussed with the focus on how the learning mechanisms help to provide the solutions to these problems and improve the network performance over the existing model-based, non-adaptive methods. Finally, a broad
Submitted 8 February, 2016; v1 submitted 15 April, 2015;
originally announced April 2015.
-
Power allocation with Stackelberg game in femtocell networks: a self-learning approach
Authors:
Wenbo Wang,
Andres Kwasinski,
Zhu Han
Abstract:
This paper investigates the energy-efficient power allocation for a two-tier, underlaid femtocell network. The behaviors of the Macrocell Base Station (MBS) and the Femtocell Users (FUs) are modeled hierarchically as a Stackelberg game. The MBS guarantees its own QoS requirement by charging the FUs individually according to the cross-tier interference, and the FUs respond by controlling the local transmit power non-cooperatively. Because information exchange within and between tiers is limited, a self-learning based strategy-updating mechanism is proposed for each user to learn the equilibrium strategies. In the same Stackelberg-game framework, two different scenarios, based on continuous and discrete power profiles for the FUs, are studied. The self-learning schemes in the two scenarios are designed based on the local best response. By studying the properties of the proposed game in the two situations, the convergence property of the learning schemes is provided. The simulation results are provided to support the theoretical finding in different situations of the proposed game, and the efficiency of the learning schemes is validated.
Submitted 10 October, 2014;
originally announced October 2014.