diff --git a/docs/chapter4/chapter4.md b/docs/chapter4/chapter4.md
index 4550df0..c459be8 100644
--- a/docs/chapter4/chapter4.md
+++ b/docs/chapter4/chapter4.md
@@ -302,7 +302,7 @@ $$
$$
(s_1,a_1,G_1),(s_2,a_2,G_2),\cdots,(s_T,a_T,G_T)
$$
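-Then compute the gradient $\nabla \ln \pi(a_t|s_t,\theta)$ for each action. In code, we first obtain the output of the neural network, which gives a probability for each action (for example, 0.2, 0.5, 0.3). We also have the action $a_t$ that was actually taken; converting it to a one-hot vector (for example, [0,1,0]) and multiplying it with $\log [0.2,0.5,0.3]$ gives $\ln \pi(a_t|s_t,\theta)$.
+Then compute the gradient $\nabla \log \pi(a_t|s_t,\theta)$ for each action. In code, we first obtain the output of the neural network, which gives a probability for each action (for example, 0.2, 0.5, 0.3). We also have the action $a_t$ that was actually taken; converting it to a one-hot vector (for example, [0,1,0]) and multiplying it with $\log [0.2,0.5,0.3]$ gives $\log \pi(a_t|s_t,\theta)$.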
@@ -353,7 +353,7 @@ $$
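Figure 4.18 Policy gradient loss
-As shown in Figure 4.19, when computing the policy gradient loss we first take the one-hot vector of the action that was actually executed, then obtain the action probabilities predicted by the neural network, and multiply the two to get $\ln \pi(a_t|s_t,\theta)$, which is the loss we want to construct. Because we have the full trajectory of the episode, we can compute a loss for every action in that trajectory. We sum all of these losses and hand the result to the Adam optimizer, which updates the parameters automatically.
+As shown in Figure 4.19, when computing the policy gradient loss we first take the one-hot vector of the action that was actually executed, then obtain the action probabilities predicted by the neural network, and multiply the two to get $\log \pi(a_t|s_t,\theta)$, which is the loss we want to construct. Because we have the full trajectory of the episode, we can compute a loss for every action in that trajectory. We sum all of these losses and hand the result to the Adam optimizer, which updates the parameters automatically.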
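Review note: the one-hot computation described in the two changed paragraphs can be sketched as below. This is a minimal illustration in plain Python, not the chapter's own code. The function names (`one_hot`, `log_prob_of_action`, `policy_gradient_loss`) are hypothetical, and weighting each term by the return $G_t$ with a negative sign is an assumption based on the $(s_t,a_t,G_t)$ tuples and the usual REINFORCE formulation; the chapter text only states that the log-probability terms are summed and passed to the optimizer.

```python
import math


def one_hot(action, num_actions):
    """Return a one-hot list for `action`, e.g. action=1, num_actions=3 -> [0.0, 1.0, 0.0]."""
    return [1.0 if i == action else 0.0 for i in range(num_actions)]


def log_prob_of_action(probs, action):
    """Select log pi(a_t | s_t) by multiplying the one-hot vector with log(probs).

    Assumes every probability is strictly positive; math.log raises on 0.
    """
    mask = one_hot(action, len(probs))
    return sum(m * math.log(p) for m, p in zip(mask, probs))


def policy_gradient_loss(prob_rows, actions, returns):
    """Sum the per-step losses over one episode.

    Assumption: each step contributes -G_t * log pi(a_t | s_t), so that
    minimizing the loss with an optimizer such as Adam increases the
    probability of actions that were followed by high returns.
    """
    total = 0.0
    for probs, action, g in zip(prob_rows, actions, returns):
        total += -g * log_prob_of_action(probs, action)
    return total
```

With the example from the text, `log_prob_of_action([0.2, 0.5, 0.3], 1)` picks out `log(0.5)`, since the one-hot vector `[0, 1, 0]` zeroes the other two entries. In an automatic-differentiation framework the same scalar loss would be built from the network's output tensors so that its gradient can flow back to the parameters.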
diff --git a/docs/img/ch2/图片 [Auto-saved].pptx b/docs/img/ch2/图片 [Auto-saved].pptx
new file mode 100644
index 0000000..5e8717f
Binary files /dev/null and b/docs/img/ch2/图片 [Auto-saved].pptx differ
diff --git a/docs/img/ch3/3.19a.png b/docs/img/ch3/3.19a.png
deleted file mode 100644
index 24d0585..0000000
Binary files a/docs/img/ch3/3.19a.png and /dev/null differ
diff --git a/docs/img/ch3/3.19b.png b/docs/img/ch3/3.19b.png
deleted file mode 100644
index 96d50fb..0000000
Binary files a/docs/img/ch3/3.19b.png and /dev/null differ
diff --git a/docs/img/ch3/3.8a.png b/docs/img/ch3/3.8a.png
deleted file mode 100644
index 9eea2ba..0000000
Binary files a/docs/img/ch3/3.8a.png and /dev/null differ
diff --git a/docs/img/ch3/3.8b.png b/docs/img/ch3/3.8b.png
deleted file mode 100644
index 62cd27a..0000000
Binary files a/docs/img/ch3/3.8b.png and /dev/null differ
diff --git a/docs/img/ch3/3.8c.png b/docs/img/ch3/3.8c.png
deleted file mode 100644
index 0991ab9..0000000
Binary files a/docs/img/ch3/3.8c.png and /dev/null differ
diff --git a/docs/img/ch3/model_free_control_5.png b/docs/img/ch3/model_free_control_5.png
deleted file mode 100644
index 7cacf6c..0000000
Binary files a/docs/img/ch3/model_free_control_5.png and /dev/null differ
diff --git a/docs/img/ch3/model_free_control_6.png b/docs/img/ch3/model_free_control_6.png
deleted file mode 100644
index 97ff496..0000000
Binary files a/docs/img/ch3/model_free_control_6.png and /dev/null differ
diff --git a/docs/img/ch3/model_free_control_9.png b/docs/img/ch3/model_free_control_9.png
deleted file mode 100644
index a6d415d..0000000
Binary files a/docs/img/ch3/model_free_control_9.png and /dev/null differ
diff --git a/docs/img/ch4/4.22.png b/docs/img/ch4/4.22.png
index 450b44a..5039171 100644
Binary files a/docs/img/ch4/4.22.png and b/docs/img/ch4/4.22.png differ