courses/real_world_prompting/04_call_summarizer.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lesson 4: Call transcript summarizer\n",
    "\n",
    "In this lesson, we're going to write a complex prompt for a common customer use case: summarization. Specifically, we'll summarize long customer service call transcripts so that the summaries can feed customer support metrics and help us evaluate the effectiveness of our support team. Because we only want summaries of complete calls, we'll exclude calls with connection issues, language barriers, or other problems that hinder effective summarization.\n",
    "\n",
    "Let's imagine we work for Acme Corporation, a company that sells smart home devices. The company handles hundreds of customer service calls daily and needs a way to quickly turn these conversations into **useful, structured data**. \n",
    "\n",
    "Some important considerations include:\n",
    "* Calls can be short and sweet or long and complicated.\n",
    "* Customers might be calling about anything from a simple Wi-Fi connection issue to a complex system malfunction.\n",
    "* We need our summaries in a specific format so they're easy to analyze later.\n",
    "* We have to be careful not to include any personal customer information in our summaries.\n",
    "\n",
    "To help us out, we'll follow the best practices we described previously:\n",
    "* Use a system prompt to set the stage.\n",
    "* Structure the prompt for optimal performance.\n",
    "* Give clear instructions and define your desired output.\n",
    "* Use XML tags to organize information.\n",
    "* Handle special cases and edge scenarios.\n",
    "* Provide examples to guide the model.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Understanding the data\n",
    "\n",
    "Now that we understand our task, let's take a look at the data we'll be working with. In this lesson, we'll use a variety of simulated customer service call transcripts from Acme Corporation's smart home device support team. These transcripts will help us create a robust prompt that can handle different scenarios.\n",
    "\n",
    "Let's examine some of the types of call transcripts we might encounter:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A short and simple transcript:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "call1 = \"\"\"\n",
    "Agent: Thank you for calling Acme Smart Home Support. This is Alex. How can I help you?\n",
    "Customer: Hi, I can't turn on my smart light bulb.\n",
    "Agent: I see. Have you tried resetting the bulb?\n",
    "Customer: Oh, no. How do I do that?\n",
    "Agent: Just turn the power off for 5 seconds, then back on. It should reset.\n",
    "Customer: Ok, I'll try that. Thanks!\n",
    "Agent: You're welcome. Call us back if you need further assistance.\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A medium-length transcript with an eventual resolution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "call2 = \"\"\"\n",
    "Agent: Acme Smart Home Support, this is Jamie. How may I assist you today?\n",
    "Customer: Hi Jamie, my Acme SmartTherm isn't maintaining the temperature I set. It's set to 72 but the house is much warmer.\n",
    "Agent: I'm sorry to hear that. Let's troubleshoot. Is your SmartTherm connected to Wi-Fi?\n",
    "Customer: Yes, the Wi-Fi symbol is showing on the display.\n",
    "Agent: Great. Let's recalibrate your SmartTherm. Press and hold the menu button for 5 seconds.\n",
    "Customer: Okay, done. A new menu came up.\n",
    "Agent: Perfect. Navigate to \"Calibration\" and press select. Adjust the temperature to match your room thermometer.\n",
    "Customer: Alright, I've set it to 79 degrees to match.\n",
    "Agent: Great. Press select to confirm. It will recalibrate, which may take a few minutes. Check back in an hour to see if it's fixed.\n",
    "Customer: Okay, I'll do that. Thank you for your help, Jamie.\n",
    "Agent: You're welcome! Is there anything else I can assist you with today?\n",
    "Customer: No, that's all. Thanks again.\n",
    "Agent: Thank you for choosing Acme Smart Home. Have a great day!\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A longer call with no resolution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "call3 = \"\"\"\n",
    "Agent: Thank you for contacting Acme Smart Home Support. This is Sarah. How can I help you today?\n",
    "Customer: Hi Sarah, I'm having trouble with my Acme SecureHome system. The alarm keeps going off randomly.\n",
    "Agent: I'm sorry to hear that. Can you tell me when this started happening?\n",
    "Customer: It started about two days ago. It's gone off three times now, always in the middle of the night.\n",
    "Agent: I see. Are there any error messages on the control panel when this happens?\n",
    "Customer: No, I didn't notice any. But I was pretty groggy each time.\n",
    "Agent: Understood. Let's check a few things. First, can you confirm that all your doors and windows are closing properly?\n",
    "Customer: Yes, I've checked all of them. They're fine.\n",
    "Agent: Okay. Next, let's check the battery in your control panel. Can you tell me if the low battery indicator is on?\n",
    "Customer: Give me a moment... No, the battery indicator looks normal.\n",
    "Agent: Alright. It's possible that one of your sensors is malfunctioning. I'd like to run a diagnostic, but I'll need to transfer you to our technical team for that. Is that okay?\n",
    "Customer: Yes, that's fine. I just want this fixed. It's really disruptive.\n",
    "Agent: I completely understand. I'm going to transfer you now. They'll be able to run a full system diagnostic and hopefully resolve the issue for you.\n",
    "Customer: Okay, thank you.\n",
    "Agent: You're welcome. Thank you for your patience, and I hope you have a great rest of your day.\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These examples showcase the variety of calls and considerations we need to handle:\n",
    "* Calls have wildly different lengths.\n",
    "* Calls feature various support issues (simple fixes, device malfunctions, complex problems).\n",
    "* Some calls end with a resolution and others remain unresolved cases.\n",
    "* Some calls require follow-up.\n",
    "\n",
    "As we build our prompt, we'll need to ensure it can effectively summarize all these types of calls, extracting the key information and presenting it in a consistent, structured format.\n",
    "In the next section, we'll start building our prompt, step by step, to handle this diverse range of call transcripts. \n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## A simple version of the prompt\n",
    "Now that we understand our task and the kind of data we're working with, let's start building our prompt. We'll begin with a basic version and gradually refine it to handle the complexities of our call summarization task.\n",
    "\n",
    "Let's begin with this very simple prompt that outlines the basic task:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt = \"\"\"\n",
    "Summarize the following customer service call transcript. Focus on the main issue, how it was resolved, and any required follow-up.\n",
    "\n",
    "{transcript}\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This basic prompt gives Claude a general idea of what we want, but it has several limitations:\n",
    "\n",
    "* It doesn't specify the desired output format, which could lead to inconsistent summaries.\n",
    "* It doesn't provide guidance on how to handle different scenarios (like unresolved issues or insufficient information).\n",
    "* It doesn't set any constraints on length or content, potentially resulting in overly long or detailed summaries.\n",
    "* It doesn't instruct Claude to omit personal information, which could lead to privacy issues.\n",
    "\n",
    "With that said, let's test it out to get a sense of how it performs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
   ],
   "source": [
    "%pip install --quiet -U python-dotenv google-cloud-aiplatform \"anthropic[vertex]\"\n",
    "!gcloud auth application-default login"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "from anthropic import AnthropicVertex\n",
    "from dotenv import load_dotenv\n",
    "import os\n",
    "\n",
    "# Load PROJECT_ID and REGION from a local .env file, if one exists\n",
    "load_dotenv()\n",
    "\n",
    "project_id = os.environ.get(\"PROJECT_ID\")\n",
    "# Where the model is running, e.g. us-central1 or europe-west4 for Haiku\n",
    "region = os.environ.get(\"REGION\")\n",
    "\n",
    "client = AnthropicVertex(project_id=project_id, region=region)\n",
    "\n",
    "def summarize_call(transcript):\n",
    "    \"\"\"Summarize a transcript with the current `prompt` template and print the result.\"\"\"\n",
    "    # str.format fills the {transcript} placeholder; this works because the\n",
    "    # simple prompt contains no other literal braces\n",
    "    final_prompt = prompt.format(transcript=transcript)\n",
    "    # Make the API call\n",
    "    response = client.messages.create(\n",
    "        model=\"claude-3-sonnet@20240229\",\n",
    "        max_tokens=4096,\n",
    "        messages=[\n",
    "            {\"role\": \"user\", \"content\": final_prompt}\n",
    "        ]\n",
    "    )\n",
    "    print(response.content[0].text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Here is a summary of the customer service call transcript:\n",
      "\n",
      "Main Issue: The customer was unable to turn on their smart light bulb. \n",
      "\n",
      "Resolution: The customer service agent instructed the customer to reset the bulb by turning the power off for 5 seconds and then back on, which should reset the bulb.\n",
      "\n",
      "Follow-up: The agent told the customer to call back if they still needed further assistance after trying the reset procedure.\n",
      "\n",
      "Overall, it was a straightforward issue where the agent provided a simple troubleshooting step to potentially resolve the customer's problem with the smart light bulb not turning on. No major follow-up was required beyond checking if the reset worked or if the customer needed additional help.\n"
     ]
    }
   ],
   "source": [
    "summarize_call(call1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Summary:\n",
      "\n",
      "Main Issue: The customer's Acme SmartTherm thermostat was not maintaining the set temperature correctly. The thermostat was set to 72°F, but the house was much warmer.\n",
      "\n",
      "Resolution: The agent guided the customer through recalibrating the SmartTherm thermostat. This involved accessing the \"Calibration\" menu, adjusting the temperature to match a separate room thermometer reading of 79°F, and confirming the new calibration setting. The recalibration process may take a few minutes.\n",
      "\n",
      "Follow-up: The agent instructed the customer to check back in an hour to see if the recalibration fixed the temperature issue with the SmartTherm thermostat maintaining the desired temperature setting.\n"
     ]
    }
   ],
   "source": [
    "summarize_call(call2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Here is a summary of the customer service call transcript:\n",
      "\n",
      "Main Issue:\n",
      "The customer was having an issue with their Acme SecureHome security system, where the alarm kept going off randomly in the middle of the night, despite no apparent cause.\n",
      "\n",
      "How it was Resolved:\n",
      "The agent first checked if the issue could be caused by doors/windows not closing properly or a low battery in the control panel, but ruled those out based on the customer's responses. Unable to diagnose the issue further, the agent transferred the call to the technical team so they could run a full diagnostic on the system to identify the root cause, which was likely a malfunctioning sensor.\n",
      "\n",
      "Required Follow-Up:\n",
      "The technical team needs to complete the system diagnostic on the customer's SecureHome system to pinpoint the faulty sensor or component causing the random alarms. Once identified, they can work to replace/repair that part to resolve the issue permanently for the customer.\n"
     ]
    }
   ],
   "source": [
    "summarize_call(call3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, while Claude does provide a summary, it's not in a format that would be easy to analyze systematically. The three responses use different section headings (\"Resolution\" in one, \"How it was Resolved\" in another), open with inconsistent preambles, and vary in length, so we can't rely on them to consistently cover the points we're interested in.\n",
    "\n",
    "\n",
    "In the next steps, we'll start adding more structure and guidance to our prompt to address these limitations. We'll see how each addition improves the quality and consistency of Claude's summaries.\n",
    "\n",
    "Remember, prompt engineering is an iterative process. We start simple and gradually refine our prompt.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Adding a system prompt\n",
    "\n",
    "The easiest place to start is with a system prompt that sets the overall context and role for Claude, helping to guide its behavior throughout the interaction.\n",
    "\n",
    "Let's start with this system prompt:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "system = \"\"\"\n",
    "You are an expert customer service analyst, skilled at extracting key information from call transcripts and summarizing them in a structured format.\n",
    "Your task is to analyze customer service call transcripts and generate concise, accurate summaries while maintaining a professional tone.\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "--- \n",
    "\n",
    "## Structuring our main prompt\n",
    "\n",
    "Next, we're going to start writing the main prompt.  We'll rely on some of these prompting tips:\n",
    "\n",
    "- Put long documents (our transcripts) at the top.\n",
    "- Add detailed instructions and output format requirements.\n",
    "- Introduce XML tags for structuring the prompt and output.\n",
    "- Give Claude space \"to think out loud\". \n",
    "\n",
    "Because this prompt may get quite long, we'll write individual pieces in isolation and then combine them together.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The input data\n",
    "When working with large language models like Claude, it's best to put long documents, like our call transcripts, near the beginning of the prompt, ahead of the detailed instructions. This ensures that Claude has all the necessary context before receiving specific instructions. We should also use XML tags to identify the transcript in the prompt:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt_pt1 = \"\"\"\n",
    "Analyze the following customer service call transcript and generate a JSON summary of the interaction:\n",
    "\n",
    "<transcript>\n",
    "[INSERT CALL TRANSCRIPT HERE]\n",
    "</transcript>\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Instructions and output format\n",
    "\n",
    "Before we go any further, let's think clearly about what a good structured output format might look like. To make our life easier when parsing the results, it's often easiest to ask Claude for a JSON response. What should a good JSON object look like in this case?\n",
    "\n",
    "At a minimum, our JSON output should include the following:\n",
    "- A status indicating whether Claude had enough information to generate a summary. We'll come back to this. For now, we'll assume that all summaries have a status of \"COMPLETE\", meaning that Claude could generate a summary.\n",
    "- A summary of the customer issue\n",
    "- How the issue was resolved\n",
    "- Whether the call requires additional follow-up\n",
    "- Details on any follow-up actions, if required (call the customer back, etc.)\n",
    "- A list of ambiguities or vague points in the conversation\n",
    "\n",
    "Here's a proposed sample JSON structure:\n",
    "\n",
    "```json\n",
    "{\n",
    "  \"summary\": {\n",
    "    \"customerIssue\": \"Brief description of the main problem or reason for the call\",\n",
    "    \"resolution\": \"How the issue was addressed or resolved, if applicable\",\n",
    "    \"followUpRequired\": true/false,\n",
    "    \"followUpDetails\": \"Description of any necessary follow-up actions, or null if none required\"\n",
    "  },\n",
    "  \"status\": \"COMPLETE\",\n",
    "  \"ambiguities\": [\"List of any unclear or vague points in the conversation, or an empty array if none\"]\n",
    "}\n",
    "```\n",
    "\n",
    "Let's create a new piece of our prompt that includes specific instructions, including:\n",
    "- Create a summary focusing on the main issue, resolution, and any follow-up actions required.\n",
    "- Generate a JSON output following our specific, standardized format.\n",
    "- Omit specific customer information in the summaries.\n",
    "- Keep each piece of the summary short.\n",
    "\n",
    "Here's an attempt at providing the output instructions, including our specific output JSON format:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt_pt2 = \"\"\"\n",
    "Instructions:\n",
    "1. Read the transcript carefully.\n",
    "2. Analyze the transcript, focusing on the main issue, resolution, and any follow-up required.\n",
    "3. Generate a JSON object summarizing the key aspects of the interaction according to the specified structure.\n",
    "\n",
    "Important guidelines:\n",
    "- Confidentiality: Omit all specific customer data like names, phone numbers, and email addresses.\n",
    "- Character limit: Restrict each text field to a maximum of 100 characters.\n",
    "- Maintain a professional tone in your summary.\n",
    "\n",
    "Output format:\n",
    "Generate a JSON object with the following structure:\n",
    "<json>\n",
    "{\n",
    "  \"summary\": {\n",
    "    \"customerIssue\": \"Brief description of the main problem or reason for the call\",\n",
    "    \"resolution\": \"How the issue was addressed or resolved, if applicable\",\n",
    "    \"followUpRequired\": true/false,\n",
    "    \"followUpDetails\": \"Description of any necessary follow-up actions, or null if none required\"\n",
    "  },\n",
    "  \"status\": \"COMPLETE\",\n",
    "  \"ambiguities\": [\"List of any unclear or vague points in the conversation, or an empty array if none\"]\n",
    "}\n",
    "</json>\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "--- \n",
    "\n",
    "## Using XML tags and giving Claude room to think\n",
    "Next, we'll employ two more prompting strategies: giving Claude room to think and using XML tags.\n",
    "- We'll ask Claude to start by writing its analysis inside `<thinking>` tags.\n",
    "- Then, we'll ask Claude to put its JSON output inside `<json>` tags.\n",
    "\n",
    "Here's the final piece of our first draft prompt:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt_pt3 = \"\"\"\n",
    "Before generating the JSON, please analyze the transcript in <thinking> tags. \n",
    "Include your identification of the main issue, resolution, follow-up requirements, and any ambiguities. \n",
    "Then, provide your JSON output in <json> tags.\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "By asking Claude to put its analysis within `<thinking>` tags, we're prompting it to break down its thought process before formulating the final JSON output. This encourages a more thorough and structured approach to analyzing the transcript.\n",
    "The `<thinking>` section allows us (and potentially other reviewers or systems) to see Claude's reasoning process. This transparency can be crucial for debugging and quality assurance purposes.\n",
    "\n",
    "\n",
    "By separating the analysis (`<thinking>`) from the structured output (`<json>`), we create a clear distinction between Claude's interpretation of the transcript and its formatted summary. This is helpful when we want to review the analysis separately from the JSON output. Isolating the JSON content inside `<json>` tags also makes it easy to parse the final response and capture the JSON we want to work with.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "--- \n",
    "\n",
    "## Testing our updated prompt\n",
    "\n",
    "Here's the complete version of the prompt, constructed by combining the individual prompt pieces we've written so far:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "system = \"\"\"\n",
    "You are an expert customer service analyst, skilled at extracting key information from call transcripts and summarizing them in a structured format.\n",
    "Your task is to analyze customer service call transcripts and generate concise, accurate summaries while maintaining a professional tone.\n",
    "\"\"\"\n",
    "\n",
    "prompt = \"\"\"\n",
    "Analyze the following customer service call transcript and generate a JSON summary of the interaction:\n",
    "\n",
    "<transcript>\n",
    "[INSERT CALL TRANSCRIPT HERE]\n",
    "</transcript>\n",
    "\n",
    "Instructions:\n",
    "1. Read the transcript carefully.\n",
    "2. Analyze the transcript, focusing on the main issue, resolution, and any follow-up required.\n",
    "3. Generate a JSON object summarizing the key aspects of the interaction according to the specified structure.\n",
    "\n",
    "Important guidelines:\n",
    "- Confidentiality: Omit all specific customer data like names, phone numbers, and email addresses.\n",
    "- Character limit: Restrict each text field to a maximum of 100 characters.\n",
    "- Maintain a professional tone in your summary.\n",
    "\n",
    "Output format:\n",
    "Generate a JSON object with the following structure:\n",
    "<json>\n",
    "{\n",
    "  \"summary\": {\n",
    "    \"customerIssue\": \"Brief description of the main problem or reason for the call\",\n",
    "    \"resolution\": \"How the issue was addressed or resolved, if applicable\",\n",
    "    \"followUpRequired\": true/false,\n",
    "    \"followUpDetails\": \"Description of any necessary follow-up actions, or null if none required\"\n",
    "  },\n",
    "  \"status\": \"COMPLETE\",\n",
    "  \"ambiguities\": [\"List of any unclear or vague points in the conversation, or an empty array if none\"]\n",
    "}\n",
    "</json>\n",
    "\n",
    "Before generating the JSON, please analyze the transcript in <thinking> tags. \n",
    "Include your identification of the main issue, resolution, follow-up requirements, and any ambiguities. \n",
    "Then, provide your JSON output in <json> tags.\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's a function we can use to test our prompt:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def summarize_call_with_improved_prompt(transcript):\n",
    "    \"\"\"Summarize a transcript with the structured prompt and system prompt, then print the result.\"\"\"\n",
    "    # Use str.replace rather than str.format: the prompt contains literal JSON braces\n",
    "    final_prompt = prompt.replace(\"[INSERT CALL TRANSCRIPT HERE]\", transcript)\n",
    "    # Make the API call\n",
    "    response = client.messages.create(\n",
    "        model=\"claude-3-sonnet@20240229\",\n",
    "        system=system,\n",
    "        max_tokens=4096,\n",
    "        messages=[\n",
    "            {\"role\": \"user\", \"content\": final_prompt}\n",
    "        ]\n",
    "    )\n",
    "    print(response.content[0].text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's test out the prompt using some of the call transcripts we previously defined:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<thinking>\n",
      "From the transcript, the main issue appears to be that the customer could not turn on their smart light bulb. The resolution provided by the agent was to reset the bulb by turning the power off for 5 seconds and then back on.\n",
      "\n",
      "The agent did offer for the customer to call back if they needed further assistance, indicating potential follow-up may be required if the reset did not resolve the issue. However, no specific follow-up details were provided.\n",
      "\n",
      "There do not seem to be any significant ambiguities in the conversation.\n",
      "</thinking>\n",
      "\n",
      "<json>\n",
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"Unable to turn on smart light bulb\",\n",
      "    \"resolution\": \"Agent instructed customer to reset the bulb by turning power off for 5 seconds, then back on\",\n",
      "    \"followUpRequired\": true,\n",
      "    \"followUpDetails\": \"Customer was advised to call back if the reset did not resolve the issue\"\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": []\n",
      "}\n",
      "</json>\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_improved_prompt(call1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<thinking>\n",
      "Main issue: The customer's Acme SmartTherm thermostat is not maintaining the set temperature of 72°F, and the house is much warmer.\n",
      "\n",
      "Resolution: The agent guided the customer through recalibrating the SmartTherm thermostat by:\n",
      "1. Having the customer press and hold the menu button for 5 seconds.\n",
      "2. Navigating to the \"Calibration\" menu and selecting it.\n",
      "3. Adjusting the temperature to match the customer's room thermometer reading of 79°F.\n",
      "4. Confirming the new calibration setting.\n",
      "\n",
      "Follow-up required: Yes, the agent instructed the customer to check back in an hour to see if the recalibration resolved the temperature issue.\n",
      "\n",
      "Ambiguities: None\n",
      "</thinking>\n",
      "\n",
      "<json>\n",
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"Thermostat not maintaining set temperature, causing house to be much warmer.\",\n",
      "    \"resolution\": \"Agent guided customer through recalibrating the thermostat to match room temperature.\",\n",
      "    \"followUpRequired\": true,\n",
      "    \"followUpDetails\": \"Customer to check back in an hour to see if recalibration resolved the temperature issue.\"\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": []\n",
      "}\n",
      "</json>\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_improved_prompt(call2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<thinking>\n",
      "Main issue: The customer's Acme SecureHome system alarm is going off randomly in the middle of the night, even though doors and windows are closed properly.\n",
      "\n",
      "Resolution: The agent suggests running a diagnostic on the system to identify potential sensor malfunctions. The customer is transferred to the technical team to perform the diagnostic and resolve the issue.\n",
      "\n",
      "Follow-up required: Yes, the technical team needs to follow up with the customer to diagnose and fix the alarm system problem.\n",
      "\n",
      "Ambiguities: None identified in the conversation.\n",
      "</thinking>\n",
      "\n",
      "<json>\n",
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"Customer's home security alarm system is going off randomly at night without apparent cause.\",\n",
      "    \"resolution\": \"Agent suggests running a diagnostic to check for sensor malfunction and transfers customer to technical team.\",\n",
      "    \"followUpRequired\": true,\n",
      "    \"followUpDetails\": \"Technical team to diagnose and resolve the issue with the customer's alarm system.\"\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": []\n",
      "}\n",
      "</json>\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_improved_prompt(call3)"
   ]
  },
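  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Those responses all follow the requested structure, although the `resolution` field for `call3` runs slightly over the 100-character limit. Let's try another call transcript that has a bit of ambiguity to it to see if the JSON result includes those ambiguities:"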
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<thinking>\n",
      "Main Issue: The customer is experiencing issues with their Acme SmartLock not consistently locking automatically or manually through the app.\n",
      "\n",
      "Resolution: The agent attempted to troubleshoot by asking for the specific SmartLock model and suggesting a reset, but the customer had to leave before completing the troubleshooting process.\n",
      "\n",
      "Follow-Up Required: Yes, the customer needs to call back to complete a full diagnostic and troubleshooting session with the technical team.\n",
      "\n",
      "Ambiguities:\n",
      "- The customer was unsure if the issue was related to their phone or not, suggesting a potential connectivity problem.\n",
      "- The customer mentioned having issues with another Acme product (SmartTherm), but it's unclear if those issues are related to the SmartLock problem.\n",
      "- The customer's contact number was not clearly provided, which could make follow-up more difficult.\n",
      "</thinking>\n",
      "\n",
      "<json>\n",
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"SmartLock not consistently locking automatically or manually through the app.\",\n",
      "    \"resolution\": \"Attempted troubleshooting but customer had to leave before completing the process.\",\n",
      "    \"followUpRequired\": true,\n",
      "    \"followUpDetails\": \"Customer needs to call back for full diagnostic and troubleshooting session with technical team.\"\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": [\n",
      "    \"Potential connectivity issue with customer's phone\",\n",
      "    \"Unclear if issues with other Acme product are related\",\n",
      "    \"Customer's contact number not clearly provided\"\n",
      "  ]\n",
      "}\n",
      "</json>\n"
     ]
    }
   ],
   "source": [
    "ambiguous_call = \"\"\"\n",
    "Agent: Thank you for calling Acme Smart Home Support. This is Alex. How may I assist you today?\n",
    "Customer: Hi Alex, I'm having an issue with my SmartLock. It's not working properly.\n",
    "Agent: I'm sorry to hear that. Can you tell me more about what's happening with your SmartLock?\n",
    "Customer: Well, sometimes it doesn't lock when I leave the house. I think it might be related to my phone, but I'm not sure.\n",
    "Agent: I see. When you say it doesn't lock, do you mean it doesn't respond to the auto-lock feature, or are you trying to lock it manually through the app?\n",
    "Customer: Uh, both, I think. Sometimes one works, sometimes the other. It's inconsistent.\n",
    "Agent: Okay. And you mentioned it might be related to your phone. Have you noticed any pattern, like it works better when you're closer to the door?\n",
    "Customer: Maybe? I haven't really paid attention to that.\n",
    "Agent: Alright. Let's try to troubleshoot this. First, can you tell me what model of SmartLock you have?\n",
    "Customer: I'm not sure. I bought it about six months ago, if that helps.\n",
    "Agent: That's okay. Can you see a model number on the lock itself?\n",
    "Customer: I'd have to go check. Can we just assume it's the latest model?\n",
    "Agent: Well, knowing the exact model would help us troubleshoot more effectively. But let's continue with what we know. Have you tried resetting the lock recently?\n",
    "Customer: I think so. Or maybe that was my SmartTherm. I've been having issues with that too.\n",
    "Agent: I see. It sounds like we might need to do a full diagnostic on your SmartLock. Would you be comfortable if I walked you through that process now?\n",
    "Customer: Actually, I have to run to an appointment. Can I call back later?\n",
    "Agent: Of course. Before you go, is there a good contact number where our technical team can reach you for a more in-depth troubleshooting session?\n",
    "Customer: Sure, you can reach me at 555... oh wait, that's my old number. Let me check my new one... You know what, I'll just call back when I have more time.\n",
    "Agent: I understand. We're here 24/7 when you're ready to troubleshoot. Is there anything else I can help with before you go?\n",
    "Customer: No, that's it. Thanks.\n",
    "Agent: You're welcome. Thank you for choosing Acme Smart Home. Have a great day!\n",
    "\"\"\"\n",
    "\n",
    "summarize_call_with_improved_prompt(ambiguous_call)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Great! Everything seems to be working as intended"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "--- \n",
    "\n",
    "## Edge cases\n",
    "\n",
    "So far, all of the call transcripts we've tried have been relatively straightforward customer service calls. In the real world, we would expect to also encounter transcripts that perhaps we don't want to summarize, including: \n",
    "\n",
    "- Calls with connection issues\n",
    "- Calls with language barriers\n",
    "- Calls with garbled transcripts\n",
    "- Calls with irrational or upset customers\n",
    "\n",
    "Remember, our goal is to summarize these calls to help gauge the effectiveness of the customer service we offer.  If we include these edge-case calls in the summaries, we'll likely get skewed results.\n",
    "\n",
    "Let's see what happens with some of these edge cases with our current prompt.  Below we've defined some new call transcripts:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "wrong_number_call = \"\"\"\n",
    "Agent: Acme Smart Home Support, Lisa speaking. How can I help you?\n",
    "Customer: Is this tech support?\n",
    "Agent: Yes, this is technical support for Acme Smart Home devices. What can I help you with?\n",
    "Customer: Sorry, wrong number.\n",
    "Agent: No problem. Have a nice day.\n",
    "\"\"\"\n",
    "\n",
    "incomplete_call = \"\"\"\n",
    "Agent: Acme Smart Home Support, this is Sarah. How can I assist you today?\n",
    "Customer: The thing isn't working.\n",
    "Agent: I'm sorry to hear that. Could you please specify which device you're having trouble with?\n",
    "Customer: You know, the usual one. Gotta go, bye.\n",
    "Agent: Wait, I need more infor... [call disconnected]\n",
    "\"\"\"\n",
    "\n",
    "garbled_call = \"\"\"\n",
    "Agent: Thank you for calling Acme Smart Home Support. This is Alex. How may I assist you today?\n",
    "Customer: [garbled voice]\n",
    "Agent: Hello? Are you there?\n",
    "\"\"\"\n",
    "\n",
    "language_barrier_call = \"\"\"\n",
    "Agent: Acme Smart Home Support, Sarah speaking. How can I help you today?\n",
    "Customer: [Speaking in Spanish]\n",
    "Agent: I apologize, but I don't speak Spanish. Do you speak English?\n",
    "Customer: [Continues Spanish]\n",
    "Agent: One moment please, I'll try to get a translator on the line...\n",
    "\"\"\"\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's run these edge-case transcripts through our prompt and see what sort of results we get:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<thinking>\n",
      "Issue: The customer appears to have dialed the wrong number for technical support.\n",
      "Resolution: Since it was a wrong number, there was no issue to resolve. The agent politely concluded the call.\n",
      "Follow-up: No follow-up is required since it was a misdialed call.\n",
      "Ambiguities: There are no apparent ambiguities in this brief conversation.\n",
      "</thinking>\n",
      "\n",
      "<json>\n",
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"The customer dialed the wrong number for technical support\",\n",
      "    \"resolution\": \"The agent concluded the call politely since it was a misdialed number\",\n",
      "    \"followUpRequired\": false,\n",
      "    \"followUpDetails\": null\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": []\n",
      "}\n",
      "</json>\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_improved_prompt(wrong_number_call)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<thinking>\n",
      "In this transcript, the main issue is unclear as the customer does not provide sufficient details about the device they are having trouble with. The agent attempts to clarify the issue, but the customer disconnects the call before providing more information.\n",
      "\n",
      "There is no resolution since the issue is not clearly identified. However, follow-up is required to gather more details from the customer about the specific device and the problem they are experiencing.\n",
      "\n",
      "The key ambiguity is the lack of clarity about the device and the nature of the problem. The customer's vague statements (\"the thing isn't working\" and \"the usual one\") do not provide enough information for the agent to diagnose or resolve the issue.\n",
      "</thinking>\n",
      "\n",
      "<json>\n",
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"Customer reported an unspecified device was not working but did not provide further details.\",\n",
      "    \"resolution\": \"No resolution was possible due to lack of information from the customer.\",\n",
      "    \"followUpRequired\": true,\n",
      "    \"followUpDetails\": \"Agent needs to contact the customer again to gather details about the specific device and issue.\"\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": [\"The device the customer was referring to\", \"The nature of the problem with the device\"]\n",
      "}\n",
      "</json>\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_improved_prompt(incomplete_call)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<thinking>\n",
      "From the transcript, it appears the customer called Acme Smart Home Support, but their voice was garbled, and the agent could not understand them. With this limited information, it is unclear what the main issue or reason for the call was. There was no resolution provided, and it is ambiguous whether follow-up is required since the issue itself is unknown. The key ambiguity is the lack of clear communication from the customer, preventing the agent from understanding the problem.\n",
      "</thinking>\n",
      "\n",
      "<json>\n",
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"Unclear due to garbled voice from the customer\",\n",
      "    \"resolution\": \"No resolution provided since the issue could not be understood\",\n",
      "    \"followUpRequired\": true,\n",
      "    \"followUpDetails\": \"Agent should try to reconnect with the customer for clearer communication\"\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": [\"The customer's voice was garbled, preventing understanding of the issue\"]\n",
      "}\n",
      "</json>\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_improved_prompt(garbled_call)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<thinking>\n",
      "Main issue: The customer called and spoke in Spanish, but the agent could not understand Spanish.\n",
      "Resolution: The agent tried to get a translator on the line to resolve the language barrier.\n",
      "Follow-up required: Yes, the agent needs to connect with a Spanish translator to assist the customer.\n",
      "Ambiguities: It is unclear why the customer called, as the reason for their call is not stated in the transcript.\n",
      "</thinking>\n",
      "\n",
      "<json>\n",
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"Customer spoke in a language the agent did not understand (Spanish).\",\n",
      "    \"resolution\": \"Agent attempted to get a translator to resolve the language barrier.\",\n",
      "    \"followUpRequired\": true,\n",
      "    \"followUpDetails\": \"Agent needs to connect the customer with a Spanish translator.\"\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": [\"Reason for the customer's call is not stated in the transcript.\"]\n",
      "}\n",
      "</json>\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_improved_prompt(language_barrier_call)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Unfortunately, we're getting full summaries for these edge-case transcripts.  Here are some problematic parts of the responses: \n",
    "\n",
    ">  \"customerIssue\": \"Customer spoke in a language the agent did not understand (Spanish).\"\n",
    "\n",
    "> \"customerIssue\": \"Unclear due to garbled voice from the customer\"\n",
    "\n",
    "> \"customerIssue\": \"The customer dialed the wrong number for technical support\" \n",
    "\n",
    "Remember that our goal is to summarize our customer service calls to get some insight into how effective our customer service team is.  These edge-case transcripts are resulting in complete summaries that will cause problems when analyzing all the summaries.  We'll need to decide on a strategy for handling these calls."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "--- \n",
    "\n",
    "## Further prompt improvements\n",
    "\n",
    "As we previously saw, our prompt is currently generating full summaries for edge-case transcripts.  We want to change this behavior.  We have a couple of options for how we handle these edge-cases:\n",
    "\n",
    "- Flag them in some way to indicate they are not summarizable, allowing for later human-review.\n",
    "- Categorize them separately (e.g., \"technical difficulty,\"  \"language barrier,\"  etc.).\n",
    "\n",
    "For simplicity's sake, we'll opt to flag these edge-case calls by asking the model to output JSON that looks like this: \n",
    "\n",
    "```json\n",
    "{\n",
    "  \"status\": \"INSUFFICIENT_DATA\"\n",
    "}\n",
    "```\n",
    "\n",
    "In order to make this work, we'll need to update our prompt in the following ways:\n",
    "- Add instructions explaining the desired \"INSUFFICIENT_DATA\" output\n",
    "- Add examples to show summarizable and non-summarizable transcripts along with their corresponding JSON outputs.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Updating our instructions\n",
    "\n",
    "Let's write a new part of the instructions portion of the prompt to explain when the model should output our \"INSUFFICIENT_DATA\" JSON."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Just the new content.  We'll look at the entire prompt in a moment\n",
    "new_instructions_addition = \"\"\"\n",
    "Insufficient data criteria:\n",
    "   If either of these conditions are met:\n",
    "   a) The transcript has fewer than 5 total exchanges, or\n",
    "   b) The customer's issue is unclear\n",
    "   c) The call is garbled, incomplete, or is hindered by a language barrier\n",
    "   Then return ONLY the following JSON:\n",
    "   {\n",
    "     \"status\": \"INSUFFICIENT_DATA\"\n",
    "   }\n",
    "\"\"\""
   ]
  },
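  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Criterion (a) can also be checked in code before a transcript is sent to the model. The sketch below is one possible pre-filter, not part of the prompt itself. `count_exchanges` is a hypothetical helper, and it assumes that every turn starts with `Agent:` or `Customer:` and that one \"exchange\" means an agent turn paired with a customer turn:\n",
    "\n",
    "```python\n",
    "def count_exchanges(transcript):\n",
    "    \"\"\"Count agent/customer turn pairs (our assumed definition of an exchange).\"\"\"\n",
    "    lines = [line.strip() for line in transcript.strip().splitlines()]\n",
    "    agent_turns = sum(line.startswith(\"Agent:\") for line in lines)\n",
    "    customer_turns = sum(line.startswith(\"Customer:\") for line in lines)\n",
    "    # A complete exchange needs one turn from each speaker\n",
    "    return min(agent_turns, customer_turns)\n",
    "\n",
    "sample_call = \"\"\"\n",
    "Agent: Acme Smart Home Support, Lisa speaking. How can I help you?\n",
    "Customer: Is this tech support?\n",
    "Agent: Yes, this is technical support for Acme Smart Home devices. What can I help you with?\n",
    "Customer: Sorry, wrong number.\n",
    "Agent: No problem. Have a nice day.\n",
    "\"\"\"\n",
    "\n",
    "print(count_exchanges(sample_call))  # prints 2\n",
    "```\n",
    "\n",
    "A check like this could flag very short calls without an API call, but the model still has to judge criteria (b) and (c)."
   ]
  },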
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Adding examples\n",
    "\n",
    "As we discussed previously in this course, it's almost always a good idea to add examples to a prompt.  In this specific use case, examples will help Claude generally understand the types of summaries we want for both summarizable and non-summarizable call transcripts.\n",
    "\n",
    "Here's a set of examples we could include in our prompt:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "examples_for_prompt = \"\"\"\n",
    "<examples>\n",
    "1. Complete interaction:\n",
    "<transcript>\n",
    "Agent: Thank you for calling Acme Smart Home Support. This is Alex. How may I assist you today?\n",
    "Customer: Hi Alex, my Acme SmartTherm isn't maintaining the temperature I set. It's set to 72 but the house is much warmer.\n",
    "Agent: I'm sorry to hear that. Let's troubleshoot. Is your SmartTherm connected to Wi-Fi?\n",
    "Customer: Yes, the Wi-Fi symbol is showing on the display.\n",
    "Agent: Great. Let's recalibrate your SmartTherm. Press and hold the menu button for 5 seconds.\n",
    "Customer: Okay, done. A new menu came up.\n",
    "Agent: Perfect. Navigate to \"Calibration\" and press select. Adjust the temperature to match your room thermometer.\n",
    "Customer: Alright, I've set it to 79 degrees to match.\n",
    "Agent: Great. Press select to confirm. It will recalibrate, which may take a few minutes. Check back in an hour to see if it's fixed.\n",
    "Customer: Okay, I'll do that. Thank you for your help, Alex.\n",
    "Agent: You're welcome! Is there anything else I can assist you with today?\n",
    "Customer: No, that's all. Thanks again.\n",
    "Agent: Thank you for choosing Acme Smart Home. Have a great day!\n",
    "</transcript>\n",
    "\n",
    "<thinking>\n",
    "Main issue: SmartTherm not maintaining set temperature\n",
    "Resolution: Guided customer through recalibration process\n",
    "Follow-up: Not required, but customer should check effectiveness after an hour\n",
    "Ambiguities: None identified\n",
    "</thinking>\n",
    "\n",
    "<json>\n",
    "{\n",
    "  \"summary\": {\n",
    "    \"customerIssue\": \"SmartTherm not maintaining set temperature, showing higher than set 72 degrees\",\n",
    "    \"resolution\": \"Guided customer through SmartTherm recalibration process\",\n",
    "    \"followUpRequired\": false,\n",
    "    \"followUpDetails\": null\n",
    "  },\n",
    "  \"status\": \"COMPLETE\",\n",
    "  \"ambiguities\": []\n",
    "}\n",
    "</json>\n",
    "\n",
    "2. Interaction requiring follow-up:\n",
    "<transcript>\n",
    "Agent: Acme Smart Home Support, this is Jamie. How can I help you?\n",
    "Customer: Hi, I just installed my new Acme SmartCam, but I can't get it to connect to my Wi-Fi.\n",
    "Agent: I'd be happy to help. Are you using the Acme Smart Home app?\n",
    "Customer: Yes, I have the app on my phone.\n",
    "Agent: Great. Make sure your phone is connected to the 2.4GHz Wi-Fi network, not the 5GHz one.\n",
    "Customer: Oh, I'm on the 5GHz network. Should I switch?\n",
    "Agent: Yes, please switch to the 2.4GHz network. The SmartCam only works with 2.4GHz.\n",
    "Customer: Okay, done. Now what?\n",
    "Agent: Open the app, select 'Add Device', choose 'SmartCam', and follow the on-screen instructions.\n",
    "Customer: It's asking for a password now.\n",
    "Agent: Enter your Wi-Fi password and it should connect.\n",
    "Customer: It's still not working. I keep getting an error message.\n",
    "Agent: I see. In that case, I'd like to escalate this to our technical team. They'll contact you within 24 hours.\n",
    "Customer: Okay, that sounds good. Thank you for trying to help.\n",
    "Agent: You're welcome. Is there anything else you need assistance with?\n",
    "Customer: No, that's all for now. Thanks again.\n",
    "Agent: Thank you for choosing Acme Smart Home. Have a great day!\n",
    "</transcript>\n",
    "\n",
    "<thinking>\n",
    "Main issue: Customer unable to connect new SmartCam to Wi-Fi\n",
    "Resolution: Initial troubleshooting unsuccessful, issue escalated to technical team\n",
    "Follow-up: Required, technical team to contact customer within 24 hours\n",
    "Ambiguities: Specific error message customer is receiving not mentioned\n",
    "</thinking>\n",
    "\n",
    "<json>\n",
    "{\n",
    "  \"summary\": {\n",
    "    \"customerIssue\": \"Unable to connect new SmartCam to Wi-Fi\",\n",
    "    \"resolution\": \"Initial troubleshooting unsuccessful, issue escalated to technical team\",\n",
    "    \"followUpRequired\": true,\n",
    "    \"followUpDetails\": \"Technical team to contact customer within 24 hours for further assistance\"\n",
    "  },\n",
    "  \"status\": \"COMPLETE\",\n",
    "  \"ambiguities\": [\"Specific error message customer is receiving not mentioned\"]\n",
    "}\n",
    "</json>\n",
    "\n",
    "3. Insufficient data:\n",
    "<transcript>\n",
    "Agent: Acme Smart Home Support, this is Sam. How may I assist you?\n",
    "Customer: Hi, my smart lock isn't working.\n",
    "Agent: I'm sorry to hear that. Can you tell me more about the issue?\n",
    "Customer: It just doesn't work. I don't know what else to say.\n",
    "Agent: Okay, when did you first notice the problem? And what model of Acme smart lock do you have?\n",
    "Customer: I don't remember. Listen, I have to go. I'll call back later.\n",
    "Agent: Alright, we're here 24/7 if you need further assistance. Have a good day.\n",
    "</transcript>\n",
    "\n",
    "<thinking>\n",
    "This transcript has fewer than 5 exchanges and the customer's issue is unclear. The customer doesn't provide specific details about the problem with the smart lock or respond to the agent's questions. This interaction doesn't provide sufficient information for a complete summary.\n",
    "</thinking>\n",
    "\n",
    "<json>\n",
    "{\n",
    "  \"status\": \"INSUFFICIENT_DATA\"\n",
    "}\n",
    "</json>\n",
    "</examples>\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the examples cover three different situations: \n",
    "* A complete interaction that does not require follow up\n",
    "* A complete interaction that does require follow up and contains ambiguities\n",
    "* A non-summarizable interaction that contains insufficient data\n",
    "\n",
    "When providing examples to Claude, it's important to cover a variety of input/output pairs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "--- \n",
    "\n",
    "## Our final prompt\n",
    "\n",
    "Let's combine our initial prompt with the additions we made in the previous section:\n",
    "* the instructions on handling calls with insufficient data\n",
    "* the set of example inputs and outputs\n",
    "\n",
    "This is the new complete prompt:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "system = \"\"\"\n",
    "You are an expert customer service analyst, skilled at extracting key information from call transcripts and summarizing them in a structured format.\n",
    "Your task is to analyze customer service call transcripts and generate concise, accurate summaries while maintaining a professional tone.\n",
    "\"\"\"\n",
    "\n",
    "prompt = \"\"\"\n",
    "Analyze the following customer service call transcript and generate a JSON summary of the interaction:\n",
    "\n",
    "<transcript>\n",
    "[INSERT CALL TRANSCRIPT HERE]\n",
    "</transcript>\n",
    "\n",
    "Instructions:\n",
    "<instructions>\n",
    "1. Read the transcript carefully.\n",
    "2. Analyze the transcript, focusing on the main issue, resolution, and any follow-up required.\n",
    "3. Generate a JSON object summarizing the key aspects of the interaction according to the specified structure.\n",
    "\n",
    "Important guidelines:\n",
    "- Confidentiality: Omit all specific customer data like names, phone numbers, and email addresses.\n",
    "- Character limit: Restrict each text field to a maximum of 100 characters.\n",
    "- Maintain a professional tone in your summary.\n",
    "\n",
    "Output format:\n",
    "Generate a JSON object with the following structure:\n",
    "<json>\n",
    "{\n",
    "  \"summary\": {\n",
    "    \"customerIssue\": \"Brief description of the main problem or reason for the call\",\n",
    "    \"resolution\": \"How the issue was addressed or resolved, if applicable\",\n",
    "    \"followUpRequired\": true/false,\n",
    "    \"followUpDetails\": \"Description of any necessary follow-up actions, or null if none required\"\n",
    "  },\n",
    "  \"status\": \"COMPLETE\",\n",
    "  \"ambiguities\": [\"List of any unclear or vague points in the conversation, or an empty array if none\"]\n",
    "}\n",
    "</json>\n",
    "\n",
    "Insufficient data criteria:\n",
    "   If any of these conditions are met:\n",
    "   a) The transcript has fewer than 5 total exchanges\n",
    "   b) The customer's issue is unclear\n",
    "   c) The call is garbled, incomplete, or is hindered by a language barrier\n",
    "   Then return ONLY the following JSON:\n",
    "   {\n",
    "     \"status\": \"INSUFFICIENT_DATA\"\n",
    "   }\n",
    "\n",
    "Examples: \n",
    "<examples>\n",
    "1. Complete interaction:\n",
    "<transcript>\n",
    "Agent: Thank you for calling Acme Smart Home Support. This is Alex. How may I assist you today?\n",
    "Customer: Hi Alex, my Acme SmartTherm isn't maintaining the temperature I set. It's set to 72 but the house is much warmer.\n",
    "Agent: I'm sorry to hear that. Let's troubleshoot. Is your SmartTherm connected to Wi-Fi?\n",
    "Customer: Yes, the Wi-Fi symbol is showing on the display.\n",
    "Agent: Great. Let's recalibrate your SmartTherm. Press and hold the menu button for 5 seconds.\n",
    "Customer: Okay, done. A new menu came up.\n",
    "Agent: Perfect. Navigate to \"Calibration\" and press select. Adjust the temperature to match your room thermometer.\n",
    "Customer: Alright, I've set it to 79 degrees to match.\n",
    "Agent: Great. Press select to confirm. It will recalibrate, which may take a few minutes. Check back in an hour to see if it's fixed.\n",
    "Customer: Okay, I'll do that. Thank you for your help, Alex.\n",
    "Agent: You're welcome! Is there anything else I can assist you with today?\n",
    "Customer: No, that's all. Thanks again.\n",
    "Agent: Thank you for choosing Acme Smart Home. Have a great day!\n",
    "</transcript>\n",
    "\n",
    "<thinking>\n",
    "Main issue: SmartTherm not maintaining set temperature\n",
    "Resolution: Guided customer through recalibration process\n",
    "Follow-up: Not required, but customer should check effectiveness after an hour\n",
    "Ambiguities: None identified\n",
    "</thinking>\n",
    "\n",
    "<json>\n",
    "{\n",
    "  \"summary\": {\n",
    "    \"customerIssue\": \"SmartTherm not maintaining set temperature, showing higher than set 72 degrees\",\n",
    "    \"resolution\": \"Guided customer through SmartTherm recalibration process\",\n",
    "    \"followUpRequired\": false,\n",
    "    \"followUpDetails\": null\n",
    "  },\n",
    "  \"status\": \"COMPLETE\",\n",
    "  \"ambiguities\": []\n",
    "}\n",
    "</json>\n",
    "\n",
    "2. Interaction requiring follow-up:\n",
    "<transcript>\n",
    "Agent: Acme Smart Home Support, this is Jamie. How can I help you?\n",
    "Customer: Hi, I just installed my new Acme SmartCam, but I can't get it to connect to my Wi-Fi.\n",
    "Agent: I'd be happy to help. Are you using the Acme Smart Home app?\n",
    "Customer: Yes, I have the app on my phone.\n",
    "Agent: Great. Make sure your phone is connected to the 2.4GHz Wi-Fi network, not the 5GHz one.\n",
    "Customer: Oh, I'm on the 5GHz network. Should I switch?\n",
    "Agent: Yes, please switch to the 2.4GHz network. The SmartCam only works with 2.4GHz.\n",
    "Customer: Okay, done. Now what?\n",
    "Agent: Open the app, select 'Add Device', choose 'SmartCam', and follow the on-screen instructions.\n",
    "Customer: It's asking for a password now.\n",
    "Agent: Enter your Wi-Fi password and it should connect.\n",
    "Customer: It's still not working. I keep getting an error message.\n",
    "Agent: I see. In that case, I'd like to escalate this to our technical team. They'll contact you within 24 hours.\n",
    "Customer: Okay, that sounds good. Thank you for trying to help.\n",
    "Agent: You're welcome. Is there anything else you need assistance with?\n",
    "Customer: No, that's all for now. Thanks again.\n",
    "Agent: Thank you for choosing Acme Smart Home. Have a great day!\n",
    "</transcript>\n",
    "\n",
    "<thinking>\n",
    "Main issue: Customer unable to connect new SmartCam to Wi-Fi\n",
    "Resolution: Initial troubleshooting unsuccessful, issue escalated to technical team\n",
    "Follow-up: Required, technical team to contact customer within 24 hours\n",
    "Ambiguities: Specific error message customer is receiving not mentioned\n",
    "</thinking>\n",
    "\n",
    "<json>\n",
    "{\n",
    "  \"summary\": {\n",
    "    \"customerIssue\": \"Unable to connect new SmartCam to Wi-Fi\",\n",
    "    \"resolution\": \"Initial troubleshooting unsuccessful, issue escalated to technical team\",\n",
    "    \"followUpRequired\": true,\n",
    "    \"followUpDetails\": \"Technical team to contact customer within 24 hours for further assistance\"\n",
    "  },\n",
    "  \"status\": \"COMPLETE\",\n",
    "  \"ambiguities\": [\"Specific error message customer is receiving not mentioned\"]\n",
    "}\n",
    "</json>\n",
    "\n",
    "3. Insufficient data:\n",
    "<transcript>\n",
    "Agent: Acme Smart Home Support, this is Sam. How may I assist you?\n",
    "Customer: Hi, my smart lock isn't working.\n",
    "Agent: I'm sorry to hear that. Can you tell me more about the issue?\n",
    "Customer: It just doesn't work. I don't know what else to say.\n",
    "Agent: Okay, when did you first notice the problem? And what model of Acme smart lock do you have?\n",
    "Customer: I don't remember. Listen, I have to go. I'll call back later.\n",
    "Agent: Alright, we're here 24/7 if you need further assistance. Have a good day.\n",
    "</transcript>\n",
    "\n",
    "<thinking>\n",
    "This transcript has fewer than 5 exchanges and the customer's issue is unclear. The customer doesn't provide specific details about the problem with the smart lock or respond to the agent's questions. This interaction doesn't provide sufficient information for a complete summary.\n",
    "</thinking>\n",
    "\n",
    "<json>\n",
    "{\n",
    "  \"status\": \"INSUFFICIENT_DATA\"\n",
    "}\n",
    "</json>\n",
    "</examples>\n",
    "</instructions>\n",
    "\n",
    "Before generating the JSON, please analyze the transcript in <thinking> tags. \n",
    "Include your identification of the main issue, resolution, follow-up requirements, and any ambiguities. \n",
    "Then, provide your JSON output in <json> tags.\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The above prompt is quite long, but here is the general structure:\n",
    "- The system prompt sets the context, role, and tone for the model.\n",
    "- The main prompt includes the following:\n",
    "    - the call transcript\n",
    "    - a set of instructions containing:\n",
    "        - general instructions\n",
    "        - guidelines \n",
    "        - output format requirements\n",
    "        - details on handling edge-case calls\n",
    "        - examples\n",
    "    - details on the XML tags to use in the output\n",
    "\n",
    "Here's a summary to help visualize the flow of the prompt: \n",
    "\n",
    "```txt\n",
    "Analyze the following customer service call transcript and generate a JSON summary of the interaction:\n",
    "\n",
    "<transcript>\n",
    "[INSERT CALL TRANSCRIPT HERE]\n",
    "</transcript>\n",
    "\n",
    "<instructions>\n",
    "- General instructions and guidelines\n",
    "- Output JSON format description\n",
    "- Insufficient data (edge-case) criteria\n",
    "<examples>\n",
    "varied example inputs and outputs\n",
    "</examples>\n",
    "</instructions>\n",
    "\n",
    "Before generating the JSON, please analyze the transcript in <thinking> tags. \n",
    "Include your identification of the main issue, resolution, follow-up requirements, and any ambiguities. \n",
    "Then, provide your JSON output in <json> tags.\n",
    "\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's test the final prompt with a new function.  Note that this function extracts the JSON summary content inside the `<json>` tags:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "def summarize_call_with_final_prompt(transcript):\n",
    "    final_prompt = prompt.replace(\"[INSERT CALL TRANSCRIPT HERE]\", transcript)\n",
    "    # Make the API call\n",
    "    response = client.messages.create(\n",
    "        model=\"claude-3-sonnet@20240229\",\n",
    "        system=system,\n",
    "        max_tokens=4096,\n",
    "        messages=[\n",
    "            {\"role\": \"user\", \"content\": final_prompt}\n",
    "        ]\n",
    "    )\n",
    "    \n",
    "    # Extract content between <json> tags\n",
    "    json_content = re.search(r'<json>(.*?)</json>', response.content[0].text, re.DOTALL)\n",
    "    \n",
    "    if json_content:\n",
    "        print(json_content.group(1).strip())\n",
    "    else:\n",
    "        print(\"No JSON content found in the response.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's test it out with a bunch of our existing call variables:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"Unable to turn on smart light bulb\",\n",
      "    \"resolution\": \"Agent guided customer to reset the bulb by cycling power off and on\",\n",
      "    \"followUpRequired\": false,\n",
      "    \"followUpDetails\": null\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": []\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_final_prompt(call1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"Acme SecureHome alarm system going off randomly multiple times at night without apparent cause\",\n",
      "    \"resolution\": \"Initial troubleshooting steps taken, but issue unresolved. Customer transferred to technical team for diagnostics\",\n",
      "    \"followUpRequired\": true,\n",
      "    \"followUpDetails\": \"Technical team to diagnose and resolve issue with alarm system\"\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": []\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_final_prompt(call3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's try our call transcript that should result in a summary with a non-empty `ambiguities` array:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"summary\": {\n",
      "    \"customerIssue\": \"SmartLock not reliably locking automatically or through app, behavior is inconsistent\",\n",
      "    \"resolution\": \"Troubleshooting attempted but incomplete due to lack of model details, customer had to leave\",\n",
      "    \"followUpRequired\": true,\n",
      "    \"followUpDetails\": \"Customer to call back for further troubleshooting of SmartLock issue when available\"\n",
      "  },\n",
      "  \"status\": \"COMPLETE\",\n",
      "  \"ambiguities\": [\n",
      "    \"Unclear if related SmartTherm issue mentioned\",\n",
      "    \"SmartLock model not identified\",\n",
      "    \"Customer's contact number not confirmed\"\n",
      "  ]\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_final_prompt(ambiguous_call)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's try some of our edge case prompts that we do not want summarized:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"status\": \"INSUFFICIENT_DATA\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_final_prompt(garbled_call)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"status\": \"INSUFFICIENT_DATA\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_final_prompt(language_barrier_call)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"status\": \"INSUFFICIENT_DATA\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_final_prompt(incomplete_call)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Great! We're getting the exact outputs we want! Let's try pushing it even further:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"status\": \"INSUFFICIENT_DATA\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_final_prompt(\"blah blah blah\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"status\": \"INSUFFICIENT_DATA\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "summarize_call_with_final_prompt(\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Excellent, the prompt is handling all of our edge cases!\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Wrap up"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this lesson, we walked through the process of developing a complex prompt for summarizing customer service call transcripts. Let's recap the prompting techniques we employed:\n",
    "\n",
    "* System Prompt: We used a system prompt to set the overall context and role for Claude.\n",
    "* Structured Input: We placed the call transcript at the beginning of the prompt using XML tags.\n",
    "* Clear Instructions: We provided detailed guidelines on what to focus on and how to structure the output.\n",
    "* Output Formatting: We specified a JSON structure for the summary, ensuring consistent and easily parseable results.\n",
    "* Handling Edge Cases: We added criteria for identifying calls with insufficient data.\n",
    "* Examples: We included diverse examples to illustrate desired outputs for different scenarios.\n",
    "* Thinking Aloud: We asked Claude to show its analysis in <thinking> tags before providing the final JSON output.\n",
    "\n",
    "\n",
    "By employing these techniques, we created a robust prompt capable of generating structured summaries for a wide range of customer service call transcripts, while appropriately handling edge cases. This approach can be adapted to many other complex prompting scenarios beyond call summarization.\n",
    "\n",
    "\n",
    "**Important Note:** While we've developed a sophisticated prompt that appears to handle our test cases well, it's crucial to understand that this prompt is not yet production-ready. What we've created is a promising starting point, but it requires extensive testing and evaluation before it can be reliably used in a real-world setting. Our current eye-ball test evaluation has been based on a small set of examples. This is not representative of the diverse and often unpredictable nature of real customer service calls. To ensure the prompt's effectiveness and reliability, we need to implement a comprehensive evaluation process that includes quantitative metrics.  Robust, data-driven evaluations are the key to bridging the gap between a promising prototype and a reliable, production-grade solution."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "py311",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}