

Introduction

This repo is for learning basic RL algorithms. We try to keep the comments detailed and the structure as clear as possible.

The code structure mainly consists of the following scripts:

  • model.py basic neural network models used in RL, such as MLP and CNN
  • memory.py replay buffer
  • plot.py uses seaborn to plot the reward curve, which is saved in the result folder
  • env.py customizes or normalizes environments
  • agent.py core algorithm, implemented as a Python class with functions such as choose_action and update
  • main.py main function
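As an illustration of the agent.py interface described above, a tabular Q-learning agent exposing choose_action and update might look like the following. This is a minimal sketch, not the repo's actual code; all names and hyperparameters are illustrative.

```python
import random
from collections import defaultdict

class QLearningAgent:
    """Illustrative agent with the choose_action/update interface."""

    def __init__(self, n_actions, lr=0.1, gamma=0.9, epsilon=0.1):
        self.n_actions = n_actions
        self.lr = lr            # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration rate
        # Q-table: state -> list of action values, initialized to zero
        self.Q = defaultdict(lambda: [0.0] * n_actions)

    def choose_action(self, state):
        # epsilon-greedy: explore with probability epsilon, else act greedily
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        q = self.Q[state]
        return q.index(max(q))

    def update(self, state, action, reward, next_state, done):
        # one-step TD target; bootstrap only if the episode is not over
        target = reward if done else reward + self.gamma * max(self.Q[next_state])
        self.Q[state][action] += self.lr * (target - self.Q[state][action])
```

Deep-RL variants in this repo (DQN, DDPG, etc.) follow the same two-method shape, but replace the Q-table with a network from model.py.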

Note that model.py, memory.py, and plot.py are shared across different algorithms, so they are placed in the common folder.
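For example, the replay buffer provided by memory.py can be sketched as a fixed-size buffer with push and sample operations (illustrative code, assuming the standard DQN-style transition tuple; not the repo's exact implementation):

```python
import random
from collections import deque

class ReplayBuffer:
    """Illustrative fixed-size replay buffer."""

    def __init__(self, capacity):
        # deque with maxlen drops the oldest transition automatically
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        # transpose: list of transitions -> tuple of batched fields
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```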

Running Environment

Python 3.7, PyTorch 1.6.0-1.7.1, Gym 0.17.0-0.18.0

Usage

Run the Python scripts or Jupyter notebooks whose names contain train to train the agent; for example, task0_train.py trains the agent on task 0.

Similarly, files whose names contain eval evaluate the agent.
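A *_train.py script typically wires the pieces above together in a training loop like the following. This is a minimal sketch with a stand-in environment and a random policy so it runs without Gym; the repo's actual scripts use real environments and the agent's choose_action/update methods (indicated in comments).

```python
import random

class ToyEnv:
    """Stand-in for a Gym environment: episode ends after 10 steps."""

    def reset(self):
        self.t = 0
        return self.t  # initial state

    def step(self, action):
        self.t += 1
        reward = 1.0
        done = self.t >= 10
        return self.t, reward, done, {}  # (next_state, reward, done, info)

def train(env, episodes=5):
    episode_rewards = []
    for _ in range(episodes):
        state, ep_reward, done = env.reset(), 0.0, False
        while not done:
            action = random.choice([0, 1])                 # agent.choose_action(state)
            next_state, reward, done, _ = env.step(action)
            # agent.update(state, action, reward, next_state, done)
            state = next_state
            ep_reward += reward
        episode_rewards.append(ep_reward)
    return episode_rewards  # plot.py would turn this into a reward curve
```

The returned per-episode rewards are what plot.py visualizes and saves into the result folder.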

Schedule

| Name | Related materials | Used Envs | Notes |
| --- | --- | --- | --- |
| On-Policy First-Visit MC | medium blog | Racetrack | |
| Q-Learning | towardsdatascience blog, Q-learning paper | CliffWalking-v0 | |
| Sarsa | geeksforgeeks blog | Racetrack | |
| DQN | DQN Paper, Nature DQN Paper | CartPole-v0 | |
| DQN-cnn | DQN Paper | CartPole-v0 | |
| DoubleDQN | DoubleDQN Paper | CartPole-v0 | |
| Hierarchical DQN | H-DQN Paper | CartPole-v0 | |
| PolicyGradient | Lil'log | CartPole-v0 | |
| A2C | A3C Paper | CartPole-v0 | |
| SAC | SAC Paper | Pendulum-v0 | |
| PPO | PPO paper | CartPole-v0 | |
| DDPG | DDPG Paper | Pendulum-v0 | |
| TD3 | TD3 Paper | HalfCheetah-v2 | |

Refs

RL-Adventure-2

RL-Adventure