Investigating Agent Behavior in Different RL Methods
A downloadable BlueTeam
To get an agent that learns from its mistakes and improves its performance, we need to become more familiar with the concept of Reinforcement Learning (RL). Implementing such a self-learning system is easier than we may think. We already know the agent's systems, but what does their behavior look like? We will show an example by experimenting with three RL algorithms.

Let us assume we are in Baghdad and need to get to Fallujah as fast as we can. Two roads lead out of the city: routes 10 and 11. Once we arrive in eastern Fallujah, we can choose between only two bridges to cross the Euphrates River and reach western Fallujah. Traffic is unpredictable, so the road or bridge we choose may have a traffic jam. The time needed to cross each bridge depends on the first action (route 10 or 11), so the bridge times differ between the two cases. In addition, we are sometimes redirected onto the other route or bridge even though we chose a different one. Our goal is to build a strategy that collects the most reward over the journey, where the reward for each road or bridge is the negative of the time needed to travel it.
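The journey described above can be sketched as a small two-step environment. This is only an illustration under stated assumptions, not the code from the repository: the class name `JourneyEnv`, the bridge labels `bridge_a` and `bridge_b`, all travel times, and the jam and redirect probabilities are hypothetical placeholders chosen to show the structure (two decisions, stochastic travel times, occasional redirection, reward equal to negative time).

```python
import random

# Hypothetical travel times in minutes as (normal, jammed); not from the source.
ROAD_TIMES = {"route_10": (30, 60), "route_11": (40, 50)}
# Bridge times depend on which route was actually driven.
BRIDGE_TIMES = {
    "route_10": {"bridge_a": (10, 40), "bridge_b": (15, 20)},
    "route_11": {"bridge_a": (5, 30), "bridge_b": (12, 25)},
}
JAM_PROB = 0.3       # assumed probability of a traffic jam
REDIRECT_PROB = 0.1  # assumed probability of ending up on the other option


class JourneyEnv:
    """Baghdad -> eastern Fallujah -> western Fallujah, two decisions per episode."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.state = "baghdad"

    def reset(self):
        self.state = "baghdad"
        return self.state

    def actions(self):
        if self.state == "baghdad":
            return ["route_10", "route_11"]
        if self.state.startswith("east_via_"):
            return ["bridge_a", "bridge_b"]
        return []  # terminal state: nothing left to choose

    def _maybe_redirect(self, choice, options):
        # Sometimes we end up on the other road/bridge despite our choice.
        if self.rng.random() < REDIRECT_PROB:
            return next(o for o in options if o != choice)
        return choice

    def _sample_minutes(self, times):
        normal, jammed = times
        return jammed if self.rng.random() < JAM_PROB else normal

    def step(self, action):
        options = self.actions()
        if action not in options:
            raise ValueError(f"invalid action {action!r} in state {self.state!r}")
        taken = self._maybe_redirect(action, options)
        if self.state == "baghdad":
            minutes = self._sample_minutes(ROAD_TIMES[taken])
            self.state = "east_via_" + taken
            done = False
        else:
            route = self.state[len("east_via_"):]
            minutes = self._sample_minutes(BRIDGE_TIMES[route][taken])
            self.state = "fallujah_west"
            done = True
        # Reward is the negative travel time, as in the description.
        return self.state, -minutes, done


def run_episode(env, policy):
    """Play one journey; policy(state, actions) returns the chosen action."""
    state, total, done = env.reset(), 0.0, False
    while not done:
        state, reward, done = env.step(policy(state, env.actions()))
        total += reward
    return total


if __name__ == "__main__":
    env = JourneyEnv(seed=0)
    # A fixed policy that always picks the first available option.
    print(run_episode(env, lambda state, actions: actions[0]))
```

Because the state after the first step records which route was actually driven, any tabular RL method can learn a separate bridge preference for each route, which is what makes the second decision depend on the first.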
Status | Released |
Category | Other |
Author | Al-Hitawi Mohammed |
Download
Install instructions
All code and data can be found in my GitHub repository.