From a24229134ab1bb8f225968d7450728101e1f9d6b Mon Sep 17 00:00:00 2001
From: Colt Steele MacBook
Date: Wed, 4 Sep 2024 18:46:32 -0600
Subject: [PATCH] add note to all promptfoo lectures

---
 .../06_prompt_foo_code_graded_classification/lesson.ipynb    | 3 +++
 prompt_evaluations/07_prompt_foo_custom_graders/lesson.ipynb | 3 +++
 prompt_evaluations/08_prompt_foo_model_graded/lesson.ipynb   | 3 +++
 .../09_custom_model_graded_prompt_foo/lesson.ipynb           | 4 ++++
 4 files changed, 13 insertions(+)

diff --git a/prompt_evaluations/06_prompt_foo_code_graded_classification/lesson.ipynb b/prompt_evaluations/06_prompt_foo_code_graded_classification/lesson.ipynb
index defc42f..685e4b1 100644
--- a/prompt_evaluations/06_prompt_foo_code_graded_classification/lesson.ipynb
+++ b/prompt_evaluations/06_prompt_foo_code_graded_classification/lesson.ipynb
@@ -6,6 +6,9 @@
    "source": [
     "# Promptfoo: classification evaluations\n",
     "\n",
+    "**Note: This lesson lives in a folder that contains relevant code files. Download the entire folder if you want to follow along and run the evaluation yourself.**\n",
+    "\n",
+    "\n",
     "In an earlier lesson, we evaluated prompts to classify customer complains like: \n",
     "\n",
     "> Whenever I open your app, my phone gets really slow\n",
diff --git a/prompt_evaluations/07_prompt_foo_custom_graders/lesson.ipynb b/prompt_evaluations/07_prompt_foo_custom_graders/lesson.ipynb
index 441af89..b94a5f2 100644
--- a/prompt_evaluations/07_prompt_foo_custom_graders/lesson.ipynb
+++ b/prompt_evaluations/07_prompt_foo_custom_graders/lesson.ipynb
@@ -6,6 +6,9 @@
    "source": [
     "# Promptfoo: custom code graders\n",
     "\n",
+    "**Note: This lesson lives in a folder that contains relevant code files. Download the entire folder if you want to follow along and run the evaluation yourself.**\n",
+    "\n",
+    "\n",
     "So far we've seen how to use some of the built-in promptfoo graders like `exact-match` and `contains-all`. Those are often useful features, but promptfoo also gives us the ability to write custom grading logic for more specific grading tasks. \n",
     "\n",
     "To demonstrate this, we'll use a very simple prompt template:\n",
diff --git a/prompt_evaluations/08_prompt_foo_model_graded/lesson.ipynb b/prompt_evaluations/08_prompt_foo_model_graded/lesson.ipynb
index e5c3759..8bac09c 100644
--- a/prompt_evaluations/08_prompt_foo_model_graded/lesson.ipynb
+++ b/prompt_evaluations/08_prompt_foo_model_graded/lesson.ipynb
@@ -6,6 +6,9 @@
    "source": [
     "# Model-graded evaluations with promptfoo\n",
     "\n",
+    "**Note: This lesson lives in a folder that contains relevant code files. Download the entire folder if you want to follow along and run the evaluation yourself.**\n",
+    "\n",
+    "\n",
     "So far, we've only written code-graded evaluations. Whenever possible, code-graded evaluations are the simplest and least-expensive evaluations to run. They offer clear-cut, objective assessments based on predefined criteria, making them ideal for tasks with straightforward, quantifiable outcomes. The trouble is that code-graded evaluations can only grade certain types of outputs, primarily those that can be reduced to exact matches, numerical comparisons, or other programmable logic.\n",
     "\n",
     "However, many real-world applications of language models require more nuanced evaluation. Suppose we wanted to build a chatbot to be used in middle-school classrooms. We might want to evaluate the outputs to make sure they use age-appropriate language, maintain an educational tone, avoid answering non-academic questions, or provide explanations at a suitable complexity level for middle schoolers. These criteria are subjective and context-dependent, making them challenging to assess with traditional code-based methods. This is where model-graded evaluations can help!\n",
diff --git a/prompt_evaluations/09_custom_model_graded_prompt_foo/lesson.ipynb b/prompt_evaluations/09_custom_model_graded_prompt_foo/lesson.ipynb
index 1c73493..8725ae9 100644
--- a/prompt_evaluations/09_custom_model_graded_prompt_foo/lesson.ipynb
+++ b/prompt_evaluations/09_custom_model_graded_prompt_foo/lesson.ipynb
@@ -5,6 +5,10 @@
    "metadata": {},
    "source": [
     "# Custom model-graded evals \n",
+    "\n",
+    "**Note: This lesson lives in a folder that contains relevant code files. Download the entire folder if you want to follow along and run the evaluation yourself.**\n",
+    "\n",
+    "\n",
     "In this lesson, we'll see how we can write custom model-graded evaluations using promptfoo. We'll start with a simple prompting goal: we want to write a prompt that can turn long, technically complex Wikipedia articles into short summaries appropriate for a grade school audience.\n",
     "\n",
     "For example, given the entire [Wikipedia entry on convolutional neural networks](https://en.wikipedia.org/wiki/Convolutional_neural_network), we want simple output summary like this one:\n",
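The custom-grader lesson touched by this patch (07) builds on promptfoo's support for Python assertion hooks, where an assert of `type: python` points at a file defining `get_assert(output, context)`. The sketch below is a hypothetical example of such a grader file, loosely matching lesson 09's goal of grade-school-friendly summaries; the specific criteria (word limit, banned-jargon list) are invented for illustration and are not taken from the lessons themselves.

```python
# custom_grader.py -- hypothetical promptfoo Python assertion.
# promptfoo would reference it from a config assert entry such as:
#   - type: python
#     value: file://custom_grader.py
# Returning a bool marks pass/fail; a float is treated as a score.

def get_assert(output: str, context: dict):
    """Pass if the model output is short and avoids technical jargon."""
    # Invented criteria for illustration only.
    banned = {"convolution", "backpropagation", "hyperparameter"}
    words = [w.strip(".,!?").lower() for w in output.split()]
    if len(words) > 150:  # too long for a grade-school summary
        return False
    return not any(w in banned for w in words)
```

Grading logic like this runs locally and costs nothing per call, which is why the lessons recommend code-graded checks wherever the criterion can be expressed programmatically.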