change readme

Author: qiwang067
Date: 2020-07-04 15:34:17 +08:00
Parent: e2f1216780
Commit: cb17ff6a7c
2 changed files with 12 additions and 12 deletions


@@ -10,14 +10,14 @@
 - bilibili[李宏毅《深度强化学习》](https://www.bilibili.com/video/BV1MW411w79n)
 ## 目录
-- [P1 Policy Gradient](https://datawhalechina.github.io/leedeeprl-notes/#/chapter1/chapter1)
+- [P1 策略梯度](https://datawhalechina.github.io/leedeeprl-notes/#/chapter1/chapter1)
 - [P2 Proximal Policy Optimization (PPO)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter2/chapter2)
-- [P3 Q-learning (Basic Idea)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter3/chapter3)
-- [P4 Q-learning (Advanced Tips)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter4/chapter4)
-- [P5 Q-learning (Continuous Action)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter5/chapter5)
+- [P3 Q-learning (基本概念)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter3/chapter3)
+- [P4 Q-learning (进阶技巧)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter4/chapter4)
+- [P5 Q-learning (连续行动)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter5/chapter5)
 - [P6 Actor-Critic](https://datawhalechina.github.io/leedeeprl-notes/#/chapter6/chapter6)
-- [P7 Sparse Reward](https://datawhalechina.github.io/leedeeprl-notes/#/chapter7/chapter7)
-- [P8 Imitation Learning](https://datawhalechina.github.io/leedeeprl-notes/#/chapter8/chapter8)
+- [P7 稀疏奖励](https://datawhalechina.github.io/leedeeprl-notes/#/chapter7/chapter7)
+- [P8 模仿学习](https://datawhalechina.github.io/leedeeprl-notes/#/chapter8/chapter8)
 ## 主要贡献者


@@ -7,14 +7,14 @@
 - bilibili[李宏毅《深度强化学习》](https://www.bilibili.com/video/BV1MW411w79n)
 ## 目录
-- [P1 Policy Gradient](https://datawhalechina.github.io/leedeeprl-notes/#/chapter1/chapter1)
+- [P1 策略梯度](https://datawhalechina.github.io/leedeeprl-notes/#/chapter1/chapter1)
 - [P2 Proximal Policy Optimization (PPO)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter2/chapter2)
-- [P3 Q-learning (Basic Idea)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter3/chapter3)
-- [P4 Q-learning (Advanced Tips)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter4/chapter4)
-- [P5 Q-learning (Continuous Action)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter5/chapter5)
+- [P3 Q-learning (基本概念)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter3/chapter3)
+- [P4 Q-learning (进阶技巧)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter4/chapter4)
+- [P5 Q-learning (连续行动)](https://datawhalechina.github.io/leedeeprl-notes/#/chapter5/chapter5)
 - [P6 Actor-Critic](https://datawhalechina.github.io/leedeeprl-notes/#/chapter6/chapter6)
-- [P7 Sparse Reward](https://datawhalechina.github.io/leedeeprl-notes/#/chapter7/chapter7)
-- [P8 Imitation Learning](https://datawhalechina.github.io/leedeeprl-notes/#/chapter8/chapter8)
+- [P7 稀疏奖励](https://datawhalechina.github.io/leedeeprl-notes/#/chapter7/chapter7)
+- [P8 模仿学习](https://datawhalechina.github.io/leedeeprl-notes/#/chapter8/chapter8)
 ## 主要贡献者