Files
2024-09-18 12:33:21 -07:00

779 lines
26 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Appendix 10.2: Tool Use\n",
"\n",
"- [Lesson](#lesson)\n",
"- [Exercises](#exercises)\n",
"- [Example Playground](#example-playground)\n",
"\n",
"## Setup\n",
"\n",
"Run the following setup cell to load your API key and establish the `get_completion` helper function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install anthropic\n",
"\n",
"# Import python's built-in regular expression library\n",
"import re\n",
"import anthropic\n",
"\n",
"# Retrieve the API_KEY variable from the IPython store\n",
"%store -r API_KEY\n",
"\n",
"client = anthropic.Anthropic(api_key=API_KEY)\n",
"\n",
"# Rewrittten to call Claude 3 Sonnet, which is generally better at tool use, and include stop_sequences\n",
"def get_completion(messages, system_prompt=\"\", prefill=\"\",stop_sequences=None):\n",
" message = client.messages.create(\n",
" model=\"claude-3-sonnet-20240229\",\n",
" max_tokens=2000,\n",
" temperature=0.0,\n",
" system=system_prompt,\n",
" messages=messages,\n",
" stop_sequences=stop_sequences\n",
" )\n",
" return message.content[0].text"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"## Lesson\n",
"\n",
"While it might seem conceptually complex at first, tool use, a.k.a. function calling, is actually quite simple! You already know all the skills necessary to implement tool use, which is really just a combination of substitution and prompt chaining.\n",
"\n",
"In previous substitution exercises, we substituted text into prompts. With tool use, we substitute tool or function results into prompts. Claude can't literally call or access tools and functions. Instead, we have Claude:\n",
"1. Output the tool name and arguments it wants to call\n",
"2. Halt any further response generation while the tool is called\n",
"3. Then we reprompt with the appended tool results"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function calling is useful because it expands Claude's capabilities and enables Claude to handle much more complex, multi-step tasks.\n",
"Some examples of functions you can give Claude:\n",
"- Calculator\n",
"- Word counter\n",
"- SQL database querying and data retrieval\n",
"- Weather API"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can get Claude to do tool use by combining these two elements:\n",
"\n",
"1. A system prompt, in which we give Claude an explanation of the concept of tool use as well as a detailed descriptive list of the tools it has access to\n",
"2. The control logic with which to orchestrate and execute Claude's tool use requests"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Tool use roadmap\n",
"\n",
"*This lesson teaches our current tool use format. However, we will be updating and improving tool use functionality in the near future, including:*\n",
"* *A more streamlined format for function definitions and calls*\n",
"* *More robust error handling and edge case coverage*\n",
"* *Tighter integration with the rest of our API*\n",
"* *Better reliability and performance, especially for more complex tool use tasks*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Examples\n",
"\n",
"To enable tool use in Claude, we start with the system prompt. In this special tool use system prompt, wet tell Claude:\n",
"* The basic premise of tool use and what it entails\n",
"* How Claude can call and use the tools it's been given\n",
"* A detailed list of tools it has access to in this specific scenario \n",
"\n",
"Here's the first part of the system prompt, explaining tool use to Claude. This part of the system prompt is generalizable across all instances of prompting Claude for tool use. The tool calling structure we're giving Claude (`<function_calls> [...] </function_calls>`) is a structure Claude has been specifically trained to use, so we recommend that you stick with this."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"system_prompt_tools_general_explanation = \"\"\"You have access to a set of functions you can use to answer the user's question. This includes access to a\n",
"sandboxed computing environment. You do NOT currently have the ability to inspect files or interact with external\n",
"resources, except by invoking the below functions.\n",
"\n",
"You can invoke one or more functions by writing a \"<function_calls>\" block like the following as part of your\n",
"reply to the user:\n",
"<function_calls>\n",
"<invoke name=\"$FUNCTION_NAME\">\n",
"<antml:parameter name=\"$PARAMETER_NAME\">$PARAMETER_VALUE</parameter>\n",
"...\n",
"</invoke>\n",
"<nvoke name=\"$FUNCTION_NAME2\">\n",
"...\n",
"</invoke>\n",
"</function_calls>\n",
"\n",
"String and scalar parameters should be specified as is, while lists and objects should use JSON format. Note that\n",
"spaces for string values are not stripped. The output is not expected to be valid XML and is parsed with regular\n",
"expressions.\n",
"\n",
"The output and/or any errors will appear in a subsequent \"<function_results>\" block, and remain there as part of\n",
"your reply to the user.\n",
"You may then continue composing the rest of your reply to the user, respond to any errors, or make further function\n",
"calls as appropriate.\n",
"If a \"<function_results>\" does NOT appear after your function calls, then they are likely malformatted and not\n",
"recognized as a call.\"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's the second part of the system prompt, which defines the exact tools Claude has access to in this specific situation. In this example, we will be giving Claude a calculator tool, which takes three parameters: two operands and an operator. \n",
"\n",
"Then we combine the two parts of the system prompt."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"system_prompt_tools_specific_tools = \"\"\"Here are the functions available in JSONSchema format:\n",
"<tools>\n",
"<tool_description>\n",
"<tool_name>calculator</tool_name>\n",
"<description>\n",
"Calculator function for doing basic arithmetic.\n",
"Supports addition, subtraction, multiplication\n",
"</description>\n",
"<parameters>\n",
"<parameter>\n",
"<name>first_operand</name>\n",
"<type>int</type>\n",
"<description>First operand (before the operator)</description>\n",
"</parameter>\n",
"<parameter>\n",
"<name>second_operand</name>\n",
"<type>int</type>\n",
"<description>Second operand (after the operator)</description>\n",
"</parameter>\n",
"<parameter>\n",
"<name>operator</name>\n",
"<type>str</type>\n",
"<description>The operation to perform. Must be either +, -, *, or /</description>\n",
"</parameter>\n",
"</parameters>\n",
"</tool_description>\n",
"</tools>\n",
"\"\"\"\n",
"\n",
"system_prompt = system_prompt_tools_general_explanation + system_prompt_tools_specific_tools"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can give Claude a question that requires use of the `calculator` tool. We will use `<function_calls\\>` in `stop_sequences` to detect if and when Claude calls the function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"multiplication_message = {\n",
" \"role\": \"user\",\n",
" \"content\": \"Multiply 1,984,135 by 9,343,116\"\n",
"}\n",
"\n",
"stop_sequences = [\"</function_calls>\"]\n",
"\n",
"# Get Claude's response\n",
"function_calling_response = get_completion([multiplication_message], system_prompt=system_prompt, stop_sequences=stop_sequences)\n",
"print(function_calling_response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we can extract out the parameters from Claude's function call and actually run the function on Claude's behalf.\n",
"\n",
"First we'll define the function's code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def do_pairwise_arithmetic(num1, num2, operation):\n",
" if operation == '+':\n",
" return num1 + num2\n",
" elif operation == \"-\":\n",
" return num1 - num2\n",
" elif operation == \"*\":\n",
" return num1 * num2\n",
" elif operation == \"/\":\n",
" return num1 / num2\n",
" else:\n",
" return \"Error: Operation not supported.\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we'll extract the parameters from Claude's function call response. If all the parameters exist, we run the calculator tool."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def find_parameter(message, parameter_name):\n",
" parameter_start_string = f\"name=\\\"{parameter_name}\\\">\"\n",
" start = message.index(parameter_start_string)\n",
" if start == -1:\n",
" return None\n",
" if start > 0:\n",
" start = start + len(parameter_start_string)\n",
" end = start\n",
" while message[end] != \"<\":\n",
" end += 1\n",
" return message[start:end]\n",
"\n",
"first_operand = find_parameter(function_calling_response, \"first_operand\")\n",
"second_operand = find_parameter(function_calling_response, \"second_operand\")\n",
"operator = find_parameter(function_calling_response, \"operator\")\n",
"\n",
"if first_operand and second_operand and operator:\n",
" result = do_pairwise_arithmetic(int(first_operand), int(second_operand), operator)\n",
" print(\"---------------- RESULT ----------------\")\n",
" print(f\"{result:,}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have a result, we have to properly format that result so that when we pass it back to Claude, Claude understands what tool that result is in relation to. There is a set format for this that Claude has been trained to recognize:\n",
"```\n",
"<function_results>\n",
"<result>\n",
"<tool_name>{TOOL_NAME}</tool_name>\n",
"<stdout>\n",
"{TOOL_RESULT}\n",
"</stdout>\n",
"</result>\n",
"</function_results>\n",
"```\n",
"\n",
"Run the cell below to format the above tool result into this structure."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def construct_successful_function_run_injection_prompt(invoke_results):\n",
" constructed_prompt = (\n",
" \"<function_results>\\n\"\n",
" + '\\n'.join(\n",
" f\"<result>\\n<tool_name>{res['tool_name']}</tool_name>\\n<stdout>\\n{res['tool_result']}\\n</stdout>\\n</result>\"\n",
" for res in invoke_results\n",
" ) + \"\\n</function_results>\"\n",
" )\n",
"\n",
" return constructed_prompt\n",
"\n",
"formatted_results = [{\n",
" 'tool_name': 'do_pairwise_arithmetic',\n",
" 'tool_result': result\n",
"}]\n",
"function_results = construct_successful_function_run_injection_prompt(formatted_results)\n",
"print(function_results)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now all we have to do is send this result back to Claude by appending the result to the same message chain as before, and we're good!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"full_first_response = function_calling_response + \"</function_calls>\"\n",
"\n",
"# Construct the full conversation\n",
"messages = [multiplication_message,\n",
"{\n",
" \"role\": \"assistant\",\n",
" \"content\": full_first_response\n",
"},\n",
"{\n",
" \"role\": \"user\",\n",
" \"content\": function_results\n",
"}]\n",
" \n",
"# Print Claude's response\n",
"final_response = get_completion(messages, system_prompt=system_prompt, stop_sequences=stop_sequences)\n",
"print(\"------------- FINAL RESULT -------------\")\n",
"print(final_response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Congratulations on running an entire tool use chain end to end!\n",
"\n",
"Now what if we give Claude a question that doesn't that doesn't require using the given tool at all?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"non_multiplication_message = {\n",
" \"role\": \"user\",\n",
" \"content\": \"Tell me the capital of France.\"\n",
"}\n",
"\n",
"stop_sequences = [\"</function_calls>\"]\n",
"\n",
"# Get Claude's response\n",
"function_calling_response = get_completion([non_multiplication_message], system_prompt=system_prompt, stop_sequences=stop_sequences)\n",
"print(function_calling_response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Success! As you can see, Claude knew not to call the function when it wasn't needed.\n",
"\n",
"If you would like to experiment with the lesson prompts without changing any content above, scroll all the way to the bottom of the lesson notebook to visit the [**Example Playground**](#example-playground)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"## Exercises\n",
"- [Exercise 10.2.1 - SQL](#exercise-1021---SQL)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise 10.2.1 - SQL\n",
"In this exercise, you'll be writing a tool use prompt for querying and writing to the world's smallest \"database\". Here's the initialized database, which is really just a dictionary."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"db = {\n",
" \"users\": [\n",
" {\"id\": 1, \"name\": \"Alice\", \"email\": \"alice@example.com\"},\n",
" {\"id\": 2, \"name\": \"Bob\", \"email\": \"bob@example.com\"},\n",
" {\"id\": 3, \"name\": \"Charlie\", \"email\": \"charlie@example.com\"}\n",
" ],\n",
" \"products\": [\n",
" {\"id\": 1, \"name\": \"Widget\", \"price\": 9.99},\n",
" {\"id\": 2, \"name\": \"Gadget\", \"price\": 14.99},\n",
" {\"id\": 3, \"name\": \"Doohickey\", \"price\": 19.99}\n",
" ]\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And here is the code for the functions that write to and from the database."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_user(user_id):\n",
" for user in db[\"users\"]:\n",
" if user[\"id\"] == user_id:\n",
" return user\n",
" return None\n",
"\n",
"def get_product(product_id):\n",
" for product in db[\"products\"]:\n",
" if product[\"id\"] == product_id:\n",
" return product\n",
" return None\n",
"\n",
"def add_user(name, email):\n",
" user_id = len(db[\"users\"]) + 1\n",
" user = {\"id\": user_id, \"name\": name, \"email\": email}\n",
" db[\"users\"].append(user)\n",
" return user\n",
"\n",
"def add_product(name, price):\n",
" product_id = len(db[\"products\"]) + 1\n",
" product = {\"id\": product_id, \"name\": name, \"price\": price}\n",
" db[\"products\"].append(product)\n",
" return product"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To solve the exercise, start by defining a system prompt like `system_prompt_tools_specific_tools` above. Make sure to include the name and description of each tool, along with the name and type and description of each parameter for each function. We've given you some starting scaffolding below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"system_prompt_tools_specific_tools_sql = \"\"\"\n",
"\"\"\"\n",
"\n",
"system_prompt = system_prompt_tools_general_explanation + system_prompt_tools_specific_tools_sql"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When you're ready, you can try out your tool definition system prompt on the examples below. Just run the below cell!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"examples = [\n",
" \"Add a user to the database named Deborah.\",\n",
" \"Add a product to the database named Thingo\",\n",
" \"Tell me the name of User 2\",\n",
" \"Tell me the name of Product 3\"\n",
"]\n",
"\n",
"for example in examples:\n",
" message = {\n",
" \"role\": \"user\",\n",
" \"content\": example\n",
" }\n",
"\n",
" # Get & print Claude's response\n",
" function_calling_response = get_completion([message], system_prompt=system_prompt, stop_sequences=stop_sequences)\n",
" print(example, \"\\n----------\\n\\n\", function_calling_response, \"\\n*********\\n*********\\n*********\\n\\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you did it right, the function calling messages should call the `add_user`, `add_product`, `get_user`, and `get_product` functions correctly.\n",
"\n",
"For extra credit, add some code cells and write parameter-parsing code. Then call the functions with the parameters Claude gives you to see the state of the \"database\" after the call."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"❓ If you want to see a possible solution, run the cell below!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from hints import exercise_10_2_1_solution; print(exercise_10_2_1_solution)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Congrats!\n",
"\n",
"Congratulations on learning tool use and function calling! Head over to the last appendix section if you would like to learn more about search & RAG."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"## Example Playground\n",
"\n",
"This is an area for you to experiment freely with the prompt examples shown in this lesson and tweak prompts to see how it may affect Claude's responses."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"system_prompt_tools_general_explanation = \"\"\"You have access to a set of functions you can use to answer the user's question. This includes access to a\n",
"sandboxed computing environment. You do NOT currently have the ability to inspect files or interact with external\n",
"resources, except by invoking the below functions.\n",
"\n",
"You can invoke one or more functions by writing a \"<function_calls>\" block like the following as part of your\n",
"reply to the user:\n",
"<function_calls>\n",
"<invoke name=\"$FUNCTION_NAME\">\n",
"<antml:parameter name=\"$PARAMETER_NAME\">$PARAMETER_VALUE</parameter>\n",
"...\n",
"</invoke>\n",
"<nvoke name=\"$FUNCTION_NAME2\">\n",
"...\n",
"</invoke>\n",
"</function_calls>\n",
"\n",
"String and scalar parameters should be specified as is, while lists and objects should use JSON format. Note that\n",
"spaces for string values are not stripped. The output is not expected to be valid XML and is parsed with regular\n",
"expressions.\n",
"\n",
"The output and/or any errors will appear in a subsequent \"<function_results>\" block, and remain there as part of\n",
"your reply to the user.\n",
"You may then continue composing the rest of your reply to the user, respond to any errors, or make further function\n",
"calls as appropriate.\n",
"If a \"<function_results>\" does NOT appear after your function calls, then they are likely malformatted and not\n",
"recognized as a call.\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"system_prompt_tools_specific_tools = \"\"\"Here are the functions available in JSONSchema format:\n",
"<tools>\n",
"<tool_description>\n",
"<tool_name>calculator</tool_name>\n",
"<description>\n",
"Calculator function for doing basic arithmetic.\n",
"Supports addition, subtraction, multiplication\n",
"</description>\n",
"<parameters>\n",
"<parameter>\n",
"<name>first_operand</name>\n",
"<type>int</type>\n",
"<description>First operand (before the operator)</description>\n",
"</parameter>\n",
"<parameter>\n",
"<name>second_operand</name>\n",
"<type>int</type>\n",
"<description>Second operand (after the operator)</description>\n",
"</parameter>\n",
"<parameter>\n",
"<name>operator</name>\n",
"<type>str</type>\n",
"<description>The operation to perform. Must be either +, -, *, or /</description>\n",
"</parameter>\n",
"</parameters>\n",
"</tool_description>\n",
"</tools>\n",
"\"\"\"\n",
"\n",
"system_prompt = system_prompt_tools_general_explanation + system_prompt_tools_specific_tools"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"multiplication_message = {\n",
" \"role\": \"user\",\n",
" \"content\": \"Multiply 1,984,135 by 9,343,116\"\n",
"}\n",
"\n",
"stop_sequences = [\"</function_calls>\"]\n",
"\n",
"# Get Claude's response\n",
"function_calling_response = get_completion([multiplication_message], system_prompt=system_prompt, stop_sequences=stop_sequences)\n",
"print(function_calling_response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def do_pairwise_arithmetic(num1, num2, operation):\n",
" if operation == '+':\n",
" return num1 + num2\n",
" elif operation == \"-\":\n",
" return num1 - num2\n",
" elif operation == \"*\":\n",
" return num1 * num2\n",
" elif operation == \"/\":\n",
" return num1 / num2\n",
" else:\n",
" return \"Error: Operation not supported.\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def find_parameter(message, parameter_name):\n",
" parameter_start_string = f\"name=\\\"{parameter_name}\\\">\"\n",
" start = message.index(parameter_start_string)\n",
" if start == -1:\n",
" return None\n",
" if start > 0:\n",
" start = start + len(parameter_start_string)\n",
" end = start\n",
" while message[end] != \"<\":\n",
" end += 1\n",
" return message[start:end]\n",
"\n",
"first_operand = find_parameter(function_calling_response, \"first_operand\")\n",
"second_operand = find_parameter(function_calling_response, \"second_operand\")\n",
"operator = find_parameter(function_calling_response, \"operator\")\n",
"\n",
"if first_operand and second_operand and operator:\n",
" result = do_pairwise_arithmetic(int(first_operand), int(second_operand), operator)\n",
" print(\"---------------- RESULT ----------------\")\n",
" print(f\"{result:,}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def construct_successful_function_run_injection_prompt(invoke_results):\n",
" constructed_prompt = (\n",
" \"<function_results>\\n\"\n",
" + '\\n'.join(\n",
" f\"<result>\\n<tool_name>{res['tool_name']}</tool_name>\\n<stdout>\\n{res['tool_result']}\\n</stdout>\\n</result>\"\n",
" for res in invoke_results\n",
" ) + \"\\n</function_results>\"\n",
" )\n",
"\n",
" return constructed_prompt\n",
"\n",
"formatted_results = [{\n",
" 'tool_name': 'do_pairwise_arithmetic',\n",
" 'tool_result': result\n",
"}]\n",
"function_results = construct_successful_function_run_injection_prompt(formatted_results)\n",
"print(function_results)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"full_first_response = function_calling_response + \"</function_calls>\"\n",
"\n",
"# Construct the full conversation\n",
"messages = [multiplication_message,\n",
"{\n",
" \"role\": \"assistant\",\n",
" \"content\": full_first_response\n",
"},\n",
"{\n",
" \"role\": \"user\",\n",
" \"content\": function_results\n",
"}]\n",
" \n",
"# Print Claude's response\n",
"final_response = get_completion(messages, system_prompt=system_prompt, stop_sequences=stop_sequences)\n",
"print(\"------------- FINAL RESULT -------------\")\n",
"print(final_response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"non_multiplication_message = {\n",
" \"role\": \"user\",\n",
" \"content\": \"Tell me the capital of France.\"\n",
"}\n",
"\n",
"stop_sequences = [\"</function_calls>\"]\n",
"\n",
"# Get Claude's response\n",
"function_calling_response = get_completion([non_multiplication_message], system_prompt=system_prompt, stop_sequences=stop_sequences)\n",
"print(function_calling_response)"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
}