Creator:
Date:
Abstract:
Recent years have witnessed significant progress in reinforcement learning, the sub-field of machine learning in which intelligent agents interact with an environment and learn to solve sequential decision-making problems by accumulating delayed rewards. Despite much success in single-agent settings, reinforcement learning in multi-agent domains remains challenging in many respects. In this thesis, the mean-field approach will be used to study stochastic games with a binary action space and a sufficiently large number of players, a setting that can be generalized to the multi-population case. Based on the mean-field approximation, several algorithms will be implemented and compared in numerical experiments to visualize their convergence to the equilibrium policy.