On Multi-Agent Reinforcement Learning in Matrix, Stochastic and Differential Games


Creator: 

Awheda, Mostafa Daee

Date: 

2017

Abstract: 

In this thesis, we investigate how reinforcement learning algorithms can be applied to two types of games. The first type comprises matrix and stochastic games, whose states and actions are represented in discrete domains. For these games, we propose two multi-agent reinforcement learning algorithms that address the problem of learning when each agent has only minimal knowledge about the underlying game and the other learning agents. We mathematically show that the proposed CLR-EMAQL algorithm converges to a Nash equilibrium in games with a pure Nash equilibrium. We introduce a Win-or-Learn-Slow (WoLS) mechanism for the proposed EMAQL algorithm, so that the algorithm learns slowly when it is losing. We also provide a theoretical proof that the proposed EMAQL algorithm converges to a Nash equilibrium in games with a pure Nash equilibrium. In games with a mixed Nash equilibrium, our mathematical analysis shows that the proposed EMAQL algorithm converges to an equilibrium; although the analysis does not explicitly show that this equilibrium is a Nash equilibrium, our simulation results indicate that it is. The second type comprises differential games, whose states and actions are represented in continuous domains. Here we make four main contributions. First, we propose a new fuzzy reinforcement learning algorithm for differential games with continuous state and action spaces. Second, we propose a new fuzzy reinforcement learning algorithm for pursuit-evasion games, so that a pursuer trained by the algorithm can capture the evader even when the game environment differs from the training environment. Third, we propose a new decentralized fuzzy reinforcement learning algorithm for multi-pursuer pursuit-evasion differential games with a single superior evader whose speed is similar to that of the pursuers. Fourth, we propose a new decentralized fuzzy reinforcement learning algorithm for multi-pursuer pursuit-evasion differential games with a single superior evader whose speed is similar to or higher than that of each pursuer. Simulation results show the effectiveness of the proposed algorithms.
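To make the Win-or-Learn-Slow (WoLS) idea in the abstract concrete, the following is a minimal illustrative sketch only: two independent Q-learning agents on a 2x2 coordination matrix game with a pure Nash equilibrium, where each agent uses a smaller learning rate when it is "losing" (receiving less than its current estimate). This is not the thesis's CLR-EMAQL or EMAQL algorithm; the payoff matrix, the win/lose test, and all parameter values here are hypothetical.

```python
import random

# (row action, column action) -> (row payoff, column payoff).
# Hypothetical coordination game with pure Nash equilibria at (0, 0) and (1, 1).
PAYOFF = {
    (0, 0): (2, 2), (0, 1): (0, 0),
    (1, 0): (0, 0), (1, 1): (1, 1),
}
ALPHA_WIN, ALPHA_LOSE = 0.5, 0.05  # WoLS idea: learn slowly when losing
EPSILON = 0.1                      # exploration rate

def choose(q):
    """Epsilon-greedy action selection over two actions."""
    if random.random() < EPSILON:
        return random.randrange(2)
    return max(range(2), key=lambda a: q[a])

random.seed(0)
q1, q2 = [0.0, 0.0], [0.0, 0.0]  # per-agent action-value estimates
for _ in range(5000):
    a1, a2 = choose(q1), choose(q2)
    r1, r2 = PAYOFF[(a1, a2)]
    # An agent counts as "winning" when the received payoff is at least
    # its current estimate for the chosen action (a simplification).
    alpha1 = ALPHA_WIN if r1 >= q1[a1] else ALPHA_LOSE
    alpha2 = ALPHA_WIN if r2 >= q2[a2] else ALPHA_LOSE
    q1[a1] += alpha1 * (r1 - q1[a1])
    q2[a2] += alpha2 * (r2 - q2[a2])

greedy = (max(range(2), key=lambda a: q1[a]),
          max(range(2), key=lambda a: q2[a]))
print(greedy)  # the greedy joint action settles on a pure Nash equilibrium
```

The slow update while losing keeps an agent's policy from being dragged away from a good joint action by the other agent's exploratory moves, which is the intuition behind the WoLS mechanism described above.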

Subject: 

Engineering - Electronics and Electrical
Artificial Intelligence

Language: 

English

Publisher: 

Carleton University

Thesis Degree Name: 

Doctor of Philosophy (Ph.D.)

Thesis Degree Level: 

Doctoral

Thesis Degree Discipline: 

Engineering, Electrical and Computer

Parent Collection: 

Theses and Dissertations

Items in CURVE are protected by copyright, with all rights reserved, unless otherwise indicated. They are made available with permission from the author(s).