The Agent and Its Environment
Attention: Newer versions of gym have changed what is printed when you print the observation space
In the video, when we printed out the observation space, we got the following.
But in newer versions of gym, the same code will print more information about the underlying Box object.
This change was introduced in this commit.
Here's how to interpret what's printed now.
- The first element (-3.4028234663852886e+38) is the minimum value that any element of the observation can take. If the minimum differs between individual elements (e.g., cart position, cart velocity, etc.), the smallest of the individual minimums is printed.
- The second element (3.4028234663852886e+38) is the maximum value that any element of the observation can take. If the maximum differs between individual elements (e.g., cart position, cart velocity, etc.), the largest of the individual maximums is printed.
- The third element, (4,), is the shape of the NumPy array holding the individual elements. In CartPole-v0, we know that the observation looks like np.array([cart_pos, cart_vel, pole_angle, pole_tip_vel]). The shape of this array is (4,).
- The fourth element is the dtype of the numpy array. In CartPole-v0, this is float32. No surprises there!
- Gym GitHub Wiki: Here, you can find details of selected Gym environments. There is also an informal leaderboard, where you can compete against others in teaching the agent to solve this environment.
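
To make the four printed elements described above concrete, here is a minimal NumPy-only sketch. It does not call gym; the per-element bounds in `high` are assumed values chosen to resemble CartPole-v0 (the 4.8 and 0.42 entries in particular are illustrative), and the name `summary` is hypothetical.

```python
import numpy as np

# Largest finite float32 value; used here for the unbounded velocity terms.
f32_max = np.finfo(np.float32).max

# Assumed per-element upper bounds, in the order
# [cart_pos, cart_vel, pole_angle, pole_tip_vel]. Illustrative values only.
high = np.array([4.8, f32_max, 0.42, f32_max], dtype=np.float32)
low = -high  # symmetric lower bounds

# The four pieces of information discussed in the list above:
# overall minimum, overall maximum, array shape, and dtype.
summary = (float(low.min()), float(high.max()), low.shape, low.dtype)
print(summary)
```

Even though the cart position and pole angle have much tighter bounds, the overall minimum and maximum are dominated by the velocity terms, which is why the printed summary shows the float32 extremes. With gym installed, the per-element bounds can be inspected directly through `env.observation_space.low` and `env.observation_space.high`.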