diff --git a/docs/chapter5/第五章 动手搭建大模型.md b/docs/chapter5/第五章 动手搭建大模型.md
index c42a156..ddbf7a5 100644
--- a/docs/chapter5/第五章 动手搭建大模型.md
+++ b/docs/chapter5/第五章 动手搭建大模型.md
@@ -114,7 +114,7 @@ torch.Size([1, 50, 768])
In the LLaMA2 model, although only the LLaMA2-70B model uses Grouped-Query Attention (GQA), we still choose to build our LLaMA Attention module with GQA, because it improves the model's efficiency and saves some GPU memory.
- alt text
+ alt text

Figure 5.2 LLaMA2 Attention structure
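
The memory saving that GQA offers can be illustrated with a small calculation. The following is a minimal sketch, not the chapter's implementation: the helper names `kv_head_for_query_head` and `kv_cache_elements` and the sizes used (8 query heads, 2 key/value heads, head dimension 64, sequence length 50) are assumptions chosen only for illustration. It shows which key/value head each query head shares, and how much smaller the cached keys and values become compared with standard multi-head attention.

```python
def kv_head_for_query_head(q_head: int, n_heads: int, n_kv_heads: int) -> int:
    """Return the index of the key/value head shared by a given query head."""
    if n_heads % n_kv_heads != 0:
        raise ValueError("n_heads must be divisible by n_kv_heads")
    n_rep = n_heads // n_kv_heads  # number of query heads per key/value head
    return q_head // n_rep


def kv_cache_elements(seq_len: int, n_kv_heads: int, head_dim: int) -> int:
    """Count cached K and V elements for one layer and one sequence."""
    return 2 * seq_len * n_kv_heads * head_dim  # factor 2: keys and values


# Hypothetical sizes, chosen only for illustration.
n_heads, n_kv_heads, head_dim, seq_len = 8, 2, 64, 50

# Each group of consecutive query heads shares one key/value head.
mapping = [kv_head_for_query_head(h, n_heads, n_kv_heads) for h in range(n_heads)]

# Multi-head attention is the special case n_kv_heads == n_heads.
mha = kv_cache_elements(seq_len, n_heads, head_dim)
gqa = kv_cache_elements(seq_len, n_kv_heads, head_dim)

print(mapping)     # [0, 0, 0, 0, 1, 1, 1, 1]
print(mha // gqa)  # 4
```

With these assumed sizes, four query heads share each key/value head, so the cached keys and values take a quarter of the space they would under multi-head attention. Setting `n_kv_heads` equal to `n_heads` recovers multi-head attention, and setting it to 1 gives multi-query attention.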

diff --git a/docs/images/5-images/Attention.png b/docs/images/5-images/Attention.png
deleted file mode 100644
index 99070f0..0000000
Binary files a/docs/images/5-images/Attention.png and /dev/null differ
diff --git a/docs/images/5-images/llama2-attention.png b/docs/images/5-images/llama2-attention.png
new file mode 100644
index 0000000..0d3bf83
Binary files /dev/null and b/docs/images/5-images/llama2-attention.png differ