From d48e1b14150fe107894d908236dc09e589ec24d5 Mon Sep 17 00:00:00 2001
From: Yiyuan Yang
Date: Sun, 20 Nov 2022 21:58:37 +0800
Subject: [PATCH] Update readme.md

---
 papers/readme.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/papers/readme.md b/papers/readme.md
index 5880a1b..93ff2bb 100644
--- a/papers/readme.md
+++ b/papers/readme.md
@@ -21,7 +21,7 @@
 | | Emergence of Locomotion Behaviours in Rich Environments (**PPO-Penalty**) | https://arxiv.org/abs/1707.02286 | |
 | | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (**ACKTR**) | https://arxiv.org/abs/1708.05144 | |
 | | Sample Efficient Actor-Critic with Experience Replay (**ACER**) | https://arxiv.org/abs/1611.01224 | |
-| | Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with (**SAC**) | https://arxiv.org/abs/1801.01290 | |
+| | Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor (**SAC**) | https://arxiv.org/abs/1801.01290 | |
 | | Deterministic Policy Gradient Algorithms (**DPG**) | http://proceedings.mlr.press/v32/silver14.pdf | |
 | | Continuous Control With Deep Reinforcement Learning (**DDPG**) | https://arxiv.org/abs/1509.02971 | |
 | | Addressing Function Approximation Error in Actor-Critic Methods (**TD3**) | https://arxiv.org/abs/1802.09477 | |