diff --git a/docs/chapter12/assets/image-20201015221602396.png b/docs/chapter12/assets/image-20201015221602396.png
new file mode 100644
index 0000000..c10ebf4
Binary files /dev/null and b/docs/chapter12/assets/image-20201015221602396.png differ
diff --git a/docs/chapter12/assets/moving_average_rewards_eval.png b/docs/chapter12/assets/moving_average_rewards_eval.png
new file mode 100644
index 0000000..3e9c92f
Binary files /dev/null and b/docs/chapter12/assets/moving_average_rewards_eval.png differ
diff --git a/docs/chapter12/assets/moving_average_rewards_train.png b/docs/chapter12/assets/moving_average_rewards_train.png
new file mode 100644
index 0000000..666e14d
Binary files /dev/null and b/docs/chapter12/assets/moving_average_rewards_train.png differ
diff --git a/docs/chapter12/assets/rewards_eval.png b/docs/chapter12/assets/rewards_eval.png
new file mode 100644
index 0000000..f7b3c04
Binary files /dev/null and b/docs/chapter12/assets/rewards_eval.png differ
diff --git a/docs/chapter12/assets/rewards_train.png b/docs/chapter12/assets/rewards_train.png
new file mode 100644
index 0000000..ee4862f
Binary files /dev/null and b/docs/chapter12/assets/rewards_train.png differ
diff --git a/docs/chapter12/assets/steps_eval.png b/docs/chapter12/assets/steps_eval.png
new file mode 100644
index 0000000..d6d77d7
Binary files /dev/null and b/docs/chapter12/assets/steps_eval.png differ
diff --git a/docs/chapter12/assets/steps_train.png b/docs/chapter12/assets/steps_train.png
new file mode 100644
index 0000000..c6a9675
Binary files /dev/null and b/docs/chapter12/assets/steps_train.png differ
diff --git a/docs/chapter12/project3.md b/docs/chapter12/project3.md
index 5134cfb..4533b69 100644
--- a/docs/chapter12/project3.md
+++ b/docs/chapter12/project3.md
@@ -53,7 +53,23 @@ for i_episode in range(1, cfg.max_episodes+1): # cfg.max_episodes is the maximum training

 Train the agent, plot the reward and the moving-average reward against the episode number, record the hyperparameters, and write them up as a report. Example plots:

-![moving_average_rewards](img/moving_average_rewards-8929361.png)
+![rewards_train](assets/rewards_train.png)
+
+![moving_average_rewards_train](assets/moving_average_rewards_train.png)
+
+![steps_train](assets/steps_train.png)
+
+The same curves can also be plotted when evaluating (eval) the model:
+
+![rewards_eval](assets/rewards_eval.png)
+
+![moving_average_rewards_eval](assets/moving_average_rewards_eval.png)
+
+![steps_eval](assets/steps_eval.png)
+
+The results can also be viewed with [TensorBoard](https://pytorch.org/docs/stable/tensorboard.html), as shown below:
+
+![image-20201015221602396](assets/image-20201015221602396.png)

 ### Notes

diff --git a/docs/chapter7/assets/image-20201015221032985.png b/docs/chapter7/assets/image-20201015221032985.png
new file mode 100644
index 0000000..1f443a4
Binary files /dev/null and b/docs/chapter7/assets/image-20201015221032985.png differ
diff --git a/docs/chapter7/assets/moving_average_rewards_eval.png b/docs/chapter7/assets/moving_average_rewards_eval.png
new file mode 100644
index 0000000..c2ba80b
Binary files /dev/null and b/docs/chapter7/assets/moving_average_rewards_eval.png differ
diff --git a/docs/chapter7/assets/moving_average_rewards_train.png b/docs/chapter7/assets/moving_average_rewards_train.png
new file mode 100644
index 0000000..34af087
Binary files /dev/null and b/docs/chapter7/assets/moving_average_rewards_train.png differ
diff --git a/docs/chapter7/assets/rewards_eval.png b/docs/chapter7/assets/rewards_eval.png
new file mode 100644
index 0000000..735fa2b
Binary files /dev/null and b/docs/chapter7/assets/rewards_eval.png differ
diff --git a/docs/chapter7/assets/rewards_train.png b/docs/chapter7/assets/rewards_train.png
new file mode 100644
index 0000000..471ecff
Binary files /dev/null and b/docs/chapter7/assets/rewards_train.png differ
diff --git a/docs/chapter7/assets/steps_eval.png b/docs/chapter7/assets/steps_eval.png
new file mode 100644
index 0000000..c3864ee
Binary files /dev/null and b/docs/chapter7/assets/steps_eval.png differ
diff --git a/docs/chapter7/assets/steps_train.png b/docs/chapter7/assets/steps_train.png
new file mode 100644
index 0000000..3ba5e60
Binary files /dev/null and b/docs/chapter7/assets/steps_train.png differ
diff --git a/docs/chapter7/project2.md b/docs/chapter7/project2.md
index f372d42..7b93147 100644
--- a/docs/chapter7/project2.md
+++ b/docs/chapter7/project2.md
@@ -54,7 +54,23 @@ for i_episode in range(1, cfg.max_episodes+1): # cfg.max_episodes is the maximum training

 Train the agent, plot the reward and the moving-average reward against the episode number, record the hyperparameters, and write them up as a report. Example plots:

-![p2](img/p2.png)
+![rewards_train](assets/rewards_train.png)
+
+![moving_average_rewards_train](assets/moving_average_rewards_train.png)
+
+![steps_train](assets/steps_train.png)
+
+The same curves can also be plotted when evaluating (eval) the model:
+
+![rewards_eval](assets/rewards_eval.png)
+
+![moving_average_rewards_eval](assets/moving_average_rewards_eval.png)
+
+![steps_eval](assets/steps_eval.png)
+
+The results can also be viewed with [TensorBoard](https://pytorch.org/docs/stable/tensorboard.html), as shown below:
+
+![image-20201015221032985](assets/image-20201015221032985.png)

 ### Code listing

@@ -66,6 +82,6 @@ for i_episode in range(1, cfg.max_episodes+1): # cfg.max_episodes is the maximum training

 **memory.py**: holds the Replay Buffer

-**plot.py**: holds the plotting functions
+**plot.py**: holds the plotting functions (optional)

 [Reference code](https://github.com/datawhalechina/leedeeprl-notes/tree/master/codes/dqn)
\ No newline at end of file