A downloadable BlueTeam

To build an agent that learns from its mistakes and improves its performance, we need to get more familiar with the concept of Reinforcement Learning (RL). Implementing such a self-learning system is easier than we might think. We already know the agent's systems, but what does their behavior look like? We will show an example by experimenting with the three algorithms mentioned above.

Let us assume we are in Baghdad and need to get to Fallujah as fast as we can. Two roads lead out of the city: routes 10 and 11. Once we arrive in east Fallujah, there are only two bridges we can choose from to cross the Euphrates River and reach west Fallujah. Traffic is unpredictable, so the road or bridge we choose may have a traffic jam. The time needed to cross each bridge depends on the first action (route 10 or 11), so the bridge times differ between the two cases. In addition, we are sometimes redirected onto the other route or bridge even though we chose a different one. Our goal is to build a strategy that gains the most reward over the journey. The reward for each road or bridge is the negative of the time we need to travel it.
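The journey above can be written down as a small two-step decision problem. The sketch below is only an illustration: every travel time, jam probability and the redirect probability is a made-up placeholder, the bridge names `bridge_a` and `bridge_b` are hypothetical, and tabular Q-learning with a sample-average step size is used as a stand-in learner, which is not necessarily one of the algorithms used in the project.

```python
import random

# Hypothetical travel times in minutes: (normal, jammed, jam probability).
ROADS = {"route_10": (40, 90, 0.3), "route_11": (50, 70, 0.1)}
# Bridge times depend on the route actually taken out of Baghdad.
BRIDGES = {
    "route_10": {"bridge_a": (10, 40, 0.5), "bridge_b": (15, 25, 0.1)},
    "route_11": {"bridge_a": (8, 20, 0.2), "bridge_b": (12, 30, 0.2)},
}
REDIRECT_PROB = 0.1  # assumed chance of being sent onto the other option


def options_for(state):
    """Roads when leaving Baghdad, otherwise the bridges for that route."""
    return ROADS if state == "baghdad" else BRIDGES[state]


def step(rng, state, action):
    """Return (next_state, reward); next_state is None on arrival."""
    options = options_for(state)
    taken = action
    if rng.random() < REDIRECT_PROB:
        # Redirected: we end up on the option we did not choose.
        taken = next(name for name in options if name != action)
    normal, jammed, p_jam = options[taken]
    minutes = jammed if rng.random() < p_jam else normal
    next_state = taken if state == "baghdad" else None
    return next_state, -minutes  # reward is the negative travel time


def q_learning(episodes=20000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    states = ["baghdad", *BRIDGES]
    q = {s: {a: 0.0 for a in options_for(s)} for s in states}
    visits = {s: {a: 0 for a in options_for(s)} for s in states}
    for _ in range(episodes):
        state = "baghdad"
        while state is not None:
            # Epsilon-greedy: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(list(q[state]))
            else:
                action = max(q[state], key=q[state].get)
            next_state, reward = step(rng, state, action)
            # Undiscounted target, since every journey has two steps.
            future = max(q[next_state].values()) if next_state else 0.0
            visits[state][action] += 1
            alpha = 1.0 / visits[state][action]
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


q = q_learning()
policy = {state: max(values, key=values.get) for state, values in q.items()}
print(policy)
```

The learned `policy` maps each decision point (Baghdad, then east Fallujah after either route) to the option with the highest estimated reward. Because the bridge times are stored per route, the best bridge can differ depending on which road was actually taken, which is the dependency described in the scenario.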

Download

Blue_team_MechanisticInterpretability.zip 503 kB

Install instructions

All code and data can be found in my GitHub repository:

code for our project
