From 2fca30c23954ac6a9da9c13f8d450e1a149912fe Mon Sep 17 00:00:00 2001
From: KMnO4-zx <1021385881@qq.com>
Date: Wed, 18 Jun 2025 16:32:07 +0800
Subject: [PATCH] docs(chapter5): Update the LLaMA2 attention mechanism diagram
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/chapter5/第五章 动手搭建大模型.md    | 2 +-
 docs/images/5-images/Attention.png        | Bin 178723 -> 0 bytes
 docs/images/5-images/llama2-attention.png | Bin 0 -> 329034 bytes
 3 files changed, 1 insertion(+), 1 deletion(-)
 delete mode 100644 docs/images/5-images/Attention.png
 create mode 100644 docs/images/5-images/llama2-attention.png

diff --git a/docs/chapter5/第五章 动手搭建大模型.md b/docs/chapter5/第五章 动手搭建大模型.md
index c42a156..ddbf7a5 100644
--- a/docs/chapter5/第五章 动手搭建大模型.md
+++ b/docs/chapter5/第五章 动手搭建大模型.md
@@ -114,7 +114,7 @@ torch.Size([1, 50, 768])
 In the LLaMA2 family, only the LLaMA2-70B model actually uses Grouped-Query Attention (GQA), but we still choose GQA to build our LLaMA Attention module: it improves the model's efficiency and saves some GPU memory.
+ Figure 5.2 LLaMA2 Attention structure
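
For context on the technique referenced in the changed passage, here is a minimal PyTorch sketch of grouped-query attention (GQA). It is illustrative only, not the chapter's actual LLaMA Attention implementation: the class name GroupedQueryAttention and the values dim=768, n_heads=12, n_kv_heads=4 are assumptions, chosen so the output matches the torch.Size([1, 50, 768]) shape shown in the hunk context.

```python
# Minimal GQA sketch; illustrative only, not the chapter's actual module.
# dim, n_heads, and n_kv_heads are assumed values, not taken from the repo.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedQueryAttention(nn.Module):
    def __init__(self, dim: int = 768, n_heads: int = 12, n_kv_heads: int = 4):
        super().__init__()
        assert dim % n_heads == 0 and n_heads % n_kv_heads == 0
        self.n_heads = n_heads        # number of query heads
        self.n_kv_heads = n_kv_heads  # fewer shared key/value heads
        self.head_dim = dim // n_heads
        self.wq = nn.Linear(dim, n_heads * self.head_dim, bias=False)
        self.wk = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)  # smaller K projection
        self.wv = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)  # smaller V projection
        self.wo = nn.Linear(n_heads * self.head_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        bsz, seqlen, _ = x.shape
        q = self.wq(x).view(bsz, seqlen, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.wk(x).view(bsz, seqlen, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.wv(x).view(bsz, seqlen, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Each group of n_heads // n_kv_heads query heads shares one K/V head,
        # so repeat K and V along the head dimension before standard attention.
        repeat = self.n_heads // self.n_kv_heads
        k = k.repeat_interleave(repeat, dim=1)
        v = v.repeat_interleave(repeat, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.wo(out.transpose(1, 2).contiguous().view(bsz, seqlen, -1))

# Usage: output keeps the [batch, seq_len, dim] shape seen in the hunk context.
x = torch.randn(1, 50, 768)
print(GroupedQueryAttention()(x).shape)  # torch.Size([1, 50, 768])
```

The memory saving mentioned in the changed passage comes from projecting K and V to only n_kv_heads heads, so the KV cache stores n_kv_heads rather than n_heads heads; each group of query heads then shares one key/value head.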