From c88b1281f3b31e0197b59e9534d6ead1044f4c6a Mon Sep 17 00:00:00 2001
From: Yiyuan Yang
Date: Sun, 20 Nov 2022 23:37:00 +0800
Subject: [PATCH] Update Soft Actor-Critic_Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.md

---
 ...opy Deep Reinforcement Learning with a Stochastic Actor.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/papers/Policy_gradient/Soft Actor-Critic_Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.md b/papers/Policy_gradient/Soft Actor-Critic_Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.md
index 0282de0..5cc9551 100644
--- a/papers/Policy_gradient/Soft Actor-Critic_Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.md
+++ b/papers/Policy_gradient/Soft Actor-Critic_Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.md
@@ -158,9 +158,13 @@ The soft actor-critic algorithm can be expressed in pseudocode as:
 Although the SAC algorithm adopts an energy-based model, in practice the policy distribution is still Gaussian, which imposes certain limitations.
 ====================================
+
 Author: 杨骏铭
+
 Affiliation: Nanjing University of Posts and Telecommunications (南京邮电大学)
+
 Research interests: reinforcement learning, adversarial learning
+
 Contact email: jmingyang@outlook.com