easy-rl/codes

Eng|中文

Introduction

This repo is used to learn basic RL algorithms. We aim to keep the comments as detailed and the code structure as clear as possible.

The code for each algorithm mainly contains the following scripts:

  • model.py: basic network models for RL, e.g. MLP, CNN
  • memory.py: replay buffer
  • plot.py: uses seaborn to plot the reward curve, saved in the result folder
  • env.py: customizes or normalizes environments
  • agent.py: the core algorithm, a Python class with methods such as choose_action and update
  • main.py: the main function

Note that model.py, memory.py and plot.py are shared across different algorithms, so they are put into the common folder.
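The agent interface described above (a class with choose_action and update methods) can be sketched with a minimal tabular Q-Learning agent. The class name and hyperparameters below are illustrative, not the repo's exact code:

```python
import random
from collections import defaultdict

class QLearningAgent:
    """Minimal tabular Q-Learning sketch; hyperparameters are illustrative."""
    def __init__(self, n_actions, lr=0.1, gamma=0.9, epsilon=0.1):
        self.n_actions = n_actions
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon
        # Q-table: state -> list of action values, initialized to zero
        self.Q = defaultdict(lambda: [0.0] * n_actions)

    def choose_action(self, state):
        # epsilon-greedy exploration
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        q = self.Q[state]
        return q.index(max(q))

    def update(self, state, action, reward, next_state, done):
        # one-step TD target; no bootstrap on terminal states
        target = reward if done else reward + self.gamma * max(self.Q[next_state])
        self.Q[state][action] += self.lr * (target - self.Q[state][action])
```

The repo's deep-RL agents (DQN, PPO, etc.) follow the same two-method shape, replacing the Q-table with networks from model.py.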

Running Environment

Python 3.7.9, PyTorch 1.6.0, Gym 0.18.0

Usage

Run main.py or main.ipynb in the folder of the algorithm you want to train.
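The main function generally runs a standard training loop over episodes. A minimal sketch of that loop, with a toy environment and placeholder agent standing in for the repo's classes (all names here are illustrative):

```python
import random

class ToyEnv:
    """Tiny stand-in environment: episode ends after 5 steps, reward 1 per step."""
    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= 5
        return self.t, 1.0, done

class RandomAgent:
    """Placeholder agent; the repo's agents expose the same two methods."""
    def __init__(self, n_actions):
        self.n_actions = n_actions

    def choose_action(self, state):
        return random.randrange(self.n_actions)

    def update(self, *transition):
        pass  # the learning step goes here

def train(env, agent, n_episodes):
    rewards = []
    for _ in range(n_episodes):
        state, ep_reward, done = env.reset(), 0.0, False
        while not done:
            action = agent.choose_action(state)
            next_state, reward, done = env.step(action)
            agent.update(state, action, reward, next_state, done)
            state, ep_reward = next_state, ep_reward + reward
        rewards.append(ep_reward)
    return rewards
```

The per-episode rewards returned here are what plot.py turns into the curve saved in the result folder.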

Schedule

| Name | Related materials | Used Envs | Notes |
| --- | --- | --- | --- |
| On-Policy First-Visit MC | | Racetrack | |
| Q-Learning | | CliffWalking-v0 | |
| Sarsa | | Racetrack | |
| DQN | DQN paper | CartPole-v0 | |
| DQN-cnn | DQN paper | CartPole-v0 | |
| DoubleDQN | | CartPole-v0 | not well |
| Hierarchical DQN | Hierarchical DQN paper | CartPole-v0 | |
| PolicyGradient | | CartPole-v0 | |
| A2C | | CartPole-v0 | |
| A3C | | | |
| SAC | | | |
| PPO | PPO paper | CartPole-v0 | |
| DDPG | DDPG paper | Pendulum-v0 | |
| TD3 | Twin Delayed DDPG paper | | |
| GAIL | | | |
GAIL

Refs

RL-Adventure-2

RL-Adventure

https://www.cnblogs.com/lucifer1997/p/13458563.html