Deep reinforcement learning (DRL) has evolved from a collection of powerful artificial intelligence techniques and has been applied extensively across many areas. In DRL, an agent learns to take the actions that yield the most reward by interacting with its environment, without prior knowledge of an exact mathematical model of that environment.
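To make the model-free idea concrete, the following is a minimal sketch of tabular Q-learning, the principle underlying DRL: the agent never consults a transition model of the environment, only the sampled transitions it experiences. The environment interface (`step`) and all parameter values are illustrative assumptions, not taken from this work.

```python
import random

def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    """Learn Q-values purely from sampled interactions.

    `step(s, a)` is an assumed environment interface returning
    (next_state, reward, done); no transition model is ever used.
    """
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy exploration: mostly exploit, sometimes explore.
            if random.random() < epsilon:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            # Q-learning update: move toward reward + discounted best next value.
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

In deep RL the table `Q` is replaced by a neural network so that large state spaces become tractable, but the interaction loop and the update target are the same.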
In this work, we investigate performance improvements for wireless mobile networks via DRL. First, we present a DRL approach for cache-enabled opportunistic interference alignment (IA) wireless networks. Most existing related work assumes that the wireless channels are time-invariant, which is unrealistic; we instead consider time-varying channels, which makes the system far more complex. In this chapter, we implement DRL with Google TensorFlow to obtain the optimal user selection policy in cache-enabled opportunistic IA wireless networks. Simulation results show that the proposed approach significantly improves the network's sum rate and energy efficiency.
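For reference, the two metrics reported here are commonly computed as follows: sum rate as the total Shannon capacity over the selected users, and energy efficiency as rate per unit of transmit power. The SINR and power values in the usage example are made-up illustrations, not results from this chapter.

```python
import math

def sum_rate(sinrs):
    """Sum rate in bits/s/Hz: Shannon capacity log2(1 + SINR) per user."""
    return sum(math.log2(1.0 + s) for s in sinrs)

def energy_efficiency(sinrs, total_power_w):
    """Energy efficiency: achieved sum rate per watt of transmit power."""
    return sum_rate(sinrs) / total_power_w

# Illustrative example: two selected users with assumed SINRs.
example_rate = sum_rate([10.0, 3.0])
```

A DRL-based user selection policy aims to pick the subset of users that maximizes such an objective under the current (time-varying) channel state.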
Second, we design a software-defined framework for connected vehicles that integrates communication, caching, and mobile edge computing. A deep reinforcement learning-based resource allocation scheme is proposed for the connected vehicles. The dynamics of each resource are modeled as a Markov chain. Without any assumptions about the objective functions or any low-complexity preprocessing, the proposed scheme directly solves resource allocation problems with a large-scale state space. Simulation results verify that the proposed scheme converges quickly and improves the network operator's total utility.
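The Markov-chain modeling can be sketched as follows: a resource's availability is a state that evolves according to fixed transition probabilities, and the DRL agent observes sampled trajectories of these states. The two-state chain and its transition probabilities below are illustrative assumptions, not the chapter's actual model.

```python
import random

def simulate_markov_resource(P, steps, state=0, seed=0):
    """Simulate a Markov chain with transition matrix P.

    P[i][j] is the probability of moving from state i to state j.
    Returns the sampled state trajectory, including the initial state.
    """
    rng = random.Random(seed)
    trajectory = [state]
    for _ in range(steps):
        r, cum = rng.random(), 0.0
        for j, p in enumerate(P[state]):
            cum += p
            if r < cum:
                state = j
                break
        trajectory.append(state)
    return trajectory

# Illustrative two-state resource: 0 = busy, 1 = available.
P = [[0.7, 0.3],
     [0.4, 0.6]]
```

For this example chain, balancing the stationary equations gives a long-run busy fraction of 4/7, which a long simulated trajectory should approach; the DRL scheme learns over such trajectories rather than over the matrix itself.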
Third, we study trust-based social networks with mobile edge computing, in-network caching, and device-to-device communications. An optimization problem is formulated to maximize the network operator's utility by jointly considering the trust values, computation capabilities, wireless channel qualities, and cache status of all available nodes. We apply a DRL approach to automatically make resource-allocation decisions. Decisions are made purely by observing the network's state, rather than by handcrafted or explicit control rules, which makes the scheme adaptive to varying network conditions. Simulation results with different network parameters demonstrate the effectiveness of the proposed scheme.
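As a hypothetical illustration of the kind of per-node utility such a formulation combines, the sketch below scores each candidate node on trust, computation capability, channel quality, and cache status, then selects the argmax. The fields, weights, and linear form are assumptions for illustration only; the chapter's actual objective and the DRL policy that replaces this hand-built rule are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class Node:
    trust: float      # trust value in [0, 1]
    compute: float    # normalized computation capability
    channel: float    # normalized wireless channel quality
    cached: bool      # whether the requested content is cached

def utility(n, w=(0.4, 0.2, 0.2, 0.2)):
    """Illustrative weighted utility; weights are assumed, not from the text."""
    return (w[0] * n.trust + w[1] * n.compute +
            w[2] * n.channel + w[3] * (1.0 if n.cached else 0.0))

def select_node(nodes):
    """Pick the node index with the highest illustrative utility."""
    return max(range(len(nodes)), key=lambda i: utility(nodes[i]))
```

The point of the DRL approach is precisely to avoid fixing such weights by hand: the agent learns which node to pick directly from observed network states and rewards.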