Introduction
This repo is for learning basic RL algorithms. We try to keep the code thoroughly commented and clearly structured.
The code is organized into the following scripts:
model.py: basic network models for RL, such as MLP and CNN
memory.py: replay buffer
plot.py: uses seaborn to plot reward curves, which are saved in the result folder
env.py: customizes or normalizes environments
agent.py: core algorithm, a Python class with methods such as choose action and update
main.py: main function
Note that model.py, memory.py, and plot.py are shared across algorithms, so they are placed in the common folder.
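To make the split between memory.py and agent.py more concrete, here is a minimal sketch of how a replay buffer and an agent class with choose-action and update methods might fit together. It is not the repo's actual code: the names ReplayBuffer, RandomAgent, push, sample, and update_count are assumptions chosen for illustration, and the random policy stands in for a real network from model.py.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions (illustrative stand-in for memory.py)."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, regrouped field by field
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)


class RandomAgent:
    """Hypothetical agent skeleton with the two methods named above."""

    def __init__(self, n_actions, memory, batch_size):
        self.n_actions = n_actions
        self.memory = memory
        self.batch_size = batch_size
        self.update_count = 0

    def choose_action(self, state):
        # Placeholder policy: a real agent would query its network here
        return random.randrange(self.n_actions)

    def update(self):
        # Skip the update until the buffer holds at least one full batch
        if len(self.memory) < self.batch_size:
            return False
        batch = self.memory.sample(self.batch_size)
        # A real algorithm would compute a loss from `batch` and step an optimizer
        self.update_count += 1
        return True


# Toy interaction loop with made-up integer states and a constant reward
memory = ReplayBuffer(capacity=100)
agent = RandomAgent(n_actions=2, memory=memory, batch_size=4)
for step in range(10):
    action = agent.choose_action(step)
    memory.push(step, action, 1.0, step + 1, False)
    agent.update()
```

Keeping the buffer separate from the agent is what lets several algorithms reuse the same memory.py unchanged.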
Running Environment
python 3.7, pytorch 1.6.0-1.7.1, gym 0.17.0-0.18.0
Usage
Run the Python script or Jupyter notebook file with train in its name to train the agent. For example, task0_train.py trains the agent on task 0.
Likewise, a file with eval in its name evaluates the agent.
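The naming convention can be illustrated with a small helper that sorts file names into training and evaluation entry points. This is only a sketch: split_by_role is a hypothetical function, not part of the repo, and task0_eval.py is an assumed name that follows the same pattern as task0_train.py.

```python
def split_by_role(filenames):
    """Group file names by whether their stem contains 'train' or 'eval'."""
    roles = {"train": [], "eval": []}
    for name in filenames:
        # Drop the extension so ".py" or ".ipynb" never affects the match
        stem = name.rsplit(".", 1)[0]
        for role in roles:
            if role in stem:
                roles[role].append(name)
    return roles


# Example file names; only task0_train.py appears in the text above
scripts = split_by_role(["task0_train.py", "task0_eval.py", "agent.py"])
```

Files such as agent.py match neither keyword, so they are not treated as entry points.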