This thesis studies multiagent reinforcement learning algorithms and their implementation, in particular the Minimax-Q, Nash-Q, and WoLF-PHC algorithms. We evaluate their ability to reach a Nash equilibrium and their performance during learning in general-sum game environments, and we also test their performance when playing against each other. We show the problems with implementing the Nash-Q algorithm and the inconvenience of using it in future research. We fully review the Lemke-Howson algorithm, which Nash-Q uses to find a Nash equilibrium in bimatrix games. We find that WoLF-PHC is the most adaptable of the three algorithms and that it outperforms the others in general-sum game environments.