Abstract

This thesis investigates how distributed reinforcement learning-based resource assignment algorithms can be used to improve the performance of a cognitive radio system. Decision making in most wireless systems today, including most cognitive radio systems in development, depends purely on instantaneous measurements. The purpose of this work is to exploit the historical information that a cognitive radio device has learned through its interactions with an unknown environment. Two system architectures are investigated in this thesis. A point-to-point architecture is examined first in an open spectrum scenario. Then, for the first time, distributed reinforcement learning-based algorithms are developed and examined in a novel two-hop architecture for a Beyond Next Generation Mobile Network.
The traditional reinforcement learning model is modified so that it can be applied to a fully distributed cognitive radio scenario. The inherent exploration versus exploitation trade-off seen in reinforcement learning is examined in the context of cognitive radio. Because cognitive radio users cause a higher level of disturbance during the exploration phase, a two-stage algorithm is proposed to control that phase of the learning process effectively. Exploration schemes such as pre-partitioning and weight-driven exploration are proposed to make the learning process more efficient. Learning efficiency is defined for a cognitive radio scenario, and the learning efficiency of the proposed schemes is investigated. Results show that the performance of the cognitive radio system can be significantly enhanced by utilizing distributed reinforcement learning, since the cognitive devices are able to identify the appropriate resources more efficiently.
A reinforcement learning-based ‘green’ cognitive radio approach is also discussed. The techniques presented show how the need for spectrum sensing, along with its associated energy consumption, can be largely eliminated by using reinforcement learning to develop a preferred channel set in each device.