diff --git a/papers/Policy_gradient/Soft Actor-Critic_Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.md b/papers/Policy_gradient/Soft Actor-Critic_Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.md
index ef3013d..0282de0 100644
--- a/papers/Policy_gradient/Soft Actor-Critic_Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.md
+++ b/papers/Policy_gradient/Soft Actor-Critic_Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.md
@@ -157,7 +157,11 @@ The soft actor-critic algorithm can be expressed in pseudocode as:
 Although the SAC algorithm adopts an energy-based model, in practice its policy distribution is still Gaussian, which imposes certain limitations.
-
+====================================
+Author: 杨骏铭
+Affiliation: Nanjing University of Posts and Telecommunications
+Research interests: reinforcement learning, adversarial learning
+Contact email: jmingyang@outlook.com