fix ch2
@@ -99,4 +99,4 @@
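A: The larger $n$ is, the larger the variance and the smaller the expected bias. As for the value function update, without further ado, the formula is as follows: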
$$
Q\left(S, A\right) \leftarrow Q\left(S, A\right)+\alpha\left[\sum_{i=1}^{n} \gamma^{i-1} R_{t+i}+\gamma^{n} \max _{a} Q\left(S',a\right)-Q\left(S, A\right)\right]
$$
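
As a rough illustration of the update above, here is a minimal tabular sketch. It assumes that $S'$ is the non-terminal state reached after $n$ steps, that `Q` is a plain list of lists indexed as `Q[state][action]`, and that the function name `n_step_q_update` and its parameters are hypothetical names chosen only for this example.

```python
def n_step_q_update(Q, state, action, rewards, next_state, alpha=0.1, gamma=0.9):
    """Apply one n-step Q-learning update in place and return the n-step target.

    Assumptions (not from the original text): Q is a list of lists,
    rewards holds R_{t+1}, ..., R_{t+n}, and next_state is the
    non-terminal state reached after those n steps.
    """
    n = len(rewards)
    # Discounted sum of observed rewards: sum_{i=1}^{n} gamma^(i-1) * R_{t+i}
    discounted = sum(gamma ** i * r for i, r in enumerate(rewards))
    # Bootstrap with the greedy value of the state reached after n steps
    target = discounted + gamma ** n * max(Q[next_state])
    # Move Q(S, A) toward the target by a step of size alpha
    Q[state][action] += alpha * (target - Q[state][action])
    return target


# Usage: 3 states, 2 actions, a 2-step return
Q = [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0]]
target = n_step_q_update(Q, state=0, action=1, rewards=[1.0, 0.0],
                         next_state=2, alpha=0.5, gamma=0.5)
```

With a longer `rewards` list, more of the target comes from sampled rewards and less from the bootstrapped estimate, which is the variance and bias trade-off the answer describes.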