add a link
@@ -563,6 +563,8 @@ Sarsa uses its own policy to generate the trajectory S,A,R,S',A'. It then takes
## References
* [Baidu Reinforcement Learning (百度强化学习)](https://aistudio.baidu.com/aistudio/education/lessonvideo/460292)
* [Reinforcement Learning Fundamentals: David Silver Notes (强化学习基础 David Silver 笔记)](https://zhuanlan.zhihu.com/c_135909947)
* [Intro to Reinforcement Learning (强化学习纲要)](https://github.com/zhoubolei/introRL)
* [Reinforcement Learning: An Introduction (second edition)](https://book.douban.com/subject/30323890/)