# Hung-yi Lee's Deep Reinforcement Learning Notes (LeeDeepRL-Notes)

## Read the Notes Online

Read online: https://datawhalechina.github.io/leedeeprl-notes/

## Watch the Course Online

- bilibili: [Hung-yi Lee, "Deep Reinforcement Learning"](https://www.bilibili.com/video/BV1MW411w79n)

## Contents

- [P1 Policy Gradient](https://datawhalechina.github.io/leedeeprl-notes/#/chapter1/chapter1)
- [P2 Proximal Policy Optimization (PPO)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter2/chapter2)
- [P3 Q-learning (Basic Idea)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter3/chapter3)
- [P4 Q-learning (Advanced Tips)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter4/chapter4)
- [P5 Q-learning (Continuous Action)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter5/chapter5)
- [P6 Actor-Critic](https://datawhalechina.github.io/leedeeprl-notes/#/chapter6/chapter6)
- [P7 Sparse Reward](https://datawhalechina.github.io/leedeeprl-notes/#/chapter7/chapter7)
- [P8 Imitation Learning](https://datawhalechina.github.io/leedeeprl-notes/#/chapter8/chapter8)

## Main Contributors

- [@qiwang067](https://github.com/qiwang067)

## Follow Us

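Datawhale is a learning community focused on AI. Its founding aim is "for the learner": to grow together with learners. The community now has several thousand members and has organized study programs in machine learning, deep learning, data analysis, data mining, web crawling, programming, statistics, MySQL, data competitions, and other areas. To join us, search for the official account Datawhale on WeChat.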

## LICENSE

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.