Commit 0abdcade authored by Harry Pigot

update readme

parent 1b71f41a
@@ -2,12 +2,9 @@
Here's a quick overview of the outcomes. Details and instructions are given in the Jupyter notebook.
- save plots and best models
- rerun and save disc training results plots
- rename file and folders (update image save locations)
- gitlab repo
The notebook isn't entirely stable across runs, so your results may vary.
## Discrete Action Inverted Pendulum Environment
The `CartPole-v1` environment gives a reward of 1 for every step that the pendulum stays upright (within ±12 degrees of vertical) and the cart stays visible in the simulation (position within ±2.4). I took an "extended rewards" approach by adding penalties for the pendulum angle and cart position.
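To make the idea concrete, here is a minimal sketch of one way such an extended reward could be computed. The function name `extended_reward`, the linear penalty form, and the weights `w_x` and `w_theta` are assumptions for illustration; the notebook may use a different formulation. The limits are the termination thresholds from the Gym `CartPole` source (cart position 2.4, pole angle 12 degrees).

```python
import math

# Termination thresholds from the Gym CartPole source.
X_LIMIT = 2.4                      # cart position limit
THETA_LIMIT = math.radians(12.0)   # pole angle limit, in radians


def extended_reward(base_reward, x, theta, w_x=0.5, w_theta=0.5):
    """Hypothetical shaped reward (illustrative, not the notebook's exact formula).

    Subtracts penalties that grow linearly with the cart's distance from the
    centre and the pole's deviation from vertical, each normalized by its limit.
    """
    x_penalty = w_x * abs(x) / X_LIMIT
    theta_penalty = w_theta * abs(theta) / THETA_LIMIT
    return base_reward - x_penalty - theta_penalty
```

With these assumed weights, a perfectly centred, upright pendulum keeps the full reward of 1, and the reward falls towards 0 as both the cart position and the pole angle approach their limits. In practice a function like this would be applied to the observation returned by each environment step, for example inside a reward wrapper.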