Learning Transition Dynamics via Rewarded Exploration: A Study using Unity's MLAgents



Tynski, Jacob Andre




AI agents benefit from understanding how their environment works: being able to predict the state of the environment after taking an action is useful for completing tasks. My work explores using a custom reward system to guide an AI agent in learning the transition dynamics of its environment through exploration. Because game engines are widely used, I build the transition dynamics model in the Unity game engine, which provides a package for creating AI agents. I test the agent's behaviour across eight studies, varying the hyperparameters of its neural network and running it both with and without memory in the form of Long Short-Term Memory (LSTM). I also conduct two tests with a different reward system to help judge the effectiveness of my approach. The results show that the agent performs well and is capable of predicting a variable in the environment.
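
As a rough illustration of the idea in the abstract, the sketch below rewards an agent whenever its transition model mispredicts the next state, so that unexplored transitions are worth visiting. It is not the thesis's implementation: the corridor environment, the tabular `TransitionModel`, the 0/1 `exploration_reward`, and the random action choice are all assumptions made for illustration, and the sketch is written in plain Python rather than against Unity's package.

```python
import random

# Hypothetical toy environment (an assumption): a 1-D corridor, positions 0..4.
N_STATES = 5
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Deterministic transition: move, then clamp to the corridor bounds."""
    return max(0, min(N_STATES - 1, state + action))


class TransitionModel:
    """Tabular model storing the last observed next state for each (state, action)."""

    def __init__(self):
        self.table = {}

    def predict(self, state, action):
        # Transitions not seen yet default to "the state does not change".
        return self.table.get((state, action), state)

    def update(self, state, action, next_state):
        self.table[(state, action)] = next_state


def exploration_reward(model, state, action, next_state):
    """Assumed reward: 1.0 if the model mispredicted the transition, else 0.0."""
    return 1.0 if model.predict(state, action) != next_state else 0.0


def explore(n_steps=200, seed=0):
    """Random-walk exploration that scores each step, then updates the model."""
    rng = random.Random(seed)
    model = TransitionModel()
    state, total_reward = 0, 0.0
    for _ in range(n_steps):
        action = rng.choice(ACTIONS)
        next_state = step(state, action)
        # Score the prediction before the model sees the true outcome.
        total_reward += exploration_reward(model, state, action, next_state)
        model.update(state, action, next_state)
        state = next_state
    return model, total_reward


if __name__ == "__main__":
    model, total_reward = explore()
    print("transitions learned:", len(model.table))
    print("exploration reward collected:", total_reward)
```

In this toy setting the reward dries up once every transition has been observed, which is the signal that the model has captured the dynamics. A trained agent would choose actions to maximise this reward instead of acting at random.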


Artificial Intelligence




Carleton University

Thesis Degree Name: 

Master of Information Technology

Thesis Degree Level: 


Thesis Degree Discipline: 

Digital Media

Parent Collection: 

Theses and Dissertations

Items in CURVE are protected by copyright, with all rights reserved, unless otherwise indicated. They are made available with permission from the author(s).