feat: cache-tracking progress

wip: cache-tracking progress
2026-04-01 06:25:26 +00:00 · 2026-04-01 06:15:13 +00:00 · 2026-04-01 04:40:17 +00:00 · 2026-04-01 04:30:24 +00:00 · 2026-04-01 03:35:25 +00:00 · 2026-04-01 03:22:34 +00:00
21 changed files with 2896 additions and 339 deletions
@@ -0,0 +1,5 @@
 {
  "permissions": {
    "defaultMode": "dontAsk"
  }
 }
@@ -2,3 +2,6 @@ __pycache__/
 archive/
 .omx/
 .clawd-agents/
 # Claude Code local artifacts
 .claude/settings.local.json
 .claude/sessions/
@@ -0,0 +1,21 @@
 # CLAUDE.md
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 ## Detected stack
 - Languages: Rust.
 - Frameworks: none detected from the supported starter markers.
 ## Verification
 - Run Rust verification from `rust/`: `cargo fmt`, `cargo clippy --workspace --all-targets -- -D warnings`, `cargo test --workspace`
 - `src/` and `tests/` are both present; update both surfaces together when behavior changes.
 ## Repository shape
 - `rust/` contains the Rust workspace and active CLI/runtime implementation.
 - `src/` contains source files that should stay consistent with generated guidance and tests.
 - `tests/` contains validation surfaces that should be reviewed alongside code changes.
 ## Working agreement
 - Prefer small, reviewable changes and keep generated bootstrap files aligned with actual repo workflows.
 - Keep shared defaults in `.claude.json`; reserve `.claude/settings.local.json` for machine-local overrides.
 - Do not overwrite existing `CLAUDE.md` content automatically; update it intentionally when repo workflows change.
@@ -0,0 +1 @@
 {"messages":[{"blocks":[{"text":"clear","type":"text"}],"role":"user"},{"blocks":[{"text":"\n\nI've cleared the conversation. How can I help you today?","type":"text"}],"role":"assistant","usage":{"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"input_tokens":4272,"output_tokens":17}}],"version":1}
@@ -0,0 +1 @@
 {"messages":[{"blocks":[{"text":"exit","type":"text"}],"role":"user"},{"blocks":[{"text":"\n\nGoodbye! 👋","type":"text"}],"role":"assistant","usage":{"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"input_tokens":4272,"output_tokens":10}}],"version":1}
@@ -0,0 +1 @@
 {"messages":[],"version":1}
@@ -1,22 +1,27 @@
 [
  {
-    "content": "Phase 0: Structural Cleanup — spawn 4 agents for 0.1-0.4",
+    "content": "Architecture & dependency analysis",
-    "activeForm": "Executing Phase 0: Structural Cleanup via sub-agents",
+    "activeForm": "Complete",
    "status": "completed"
  },
  {
    "content": "Runtime crate deep analysis",
    "activeForm": "Complete",
    "status": "completed"
  },
  {
    "content": "CLI & Tools analysis",
    "activeForm": "Complete",
    "status": "completed"
  },
  {
    "content": "Code quality verification",
    "activeForm": "Complete",
    "status": "completed"
  },
  {
    "content": "Synthesize findings into unified report",
    "activeForm": "Writing report",
    "status": "in_progress"
  },
  {
    "content": "Phase 1.1-1.2: Status bar with live HUD and token counter",
    "activeForm": "Awaiting Phase 0",
    "status": "pending"
  },
  {
    "content": "Phase 2.4: Remove artificial 8ms stream delay",
    "activeForm": "Awaiting Phase 0",
    "status": "pending"
  },
  {
    "content": "Phase 3.1: Collapsible tool output",
    "activeForm": "Awaiting Phase 0",
    "status": "pending"
  }
 ]
@@ -1545,10 +1545,12 @@ dependencies = [
 name = "tools"
 version = "0.1.0"
 dependencies = [
 "api",
 "reqwest",
 "runtime",
 "serde",
 "serde_json",
 "tokio",
 ]
 [[package]]
@@ -1,230 +1,149 @@
-# Rusty Claude CLI
+# 🦞 Claw Code — Rust Implementation
-`rust/` contains the Rust workspace for the integrated `rusty-claude-cli` deliverable.
+A high-performance Rust rewrite of the Claude Code CLI agent harness. Built for speed, safety, and native tool execution.
 It is intended to be something you can clone, build, and run directly.
-## Workspace layout
+## Quick Start
-```text
+```bash
 # Build
 cd rust/
 cargo build --release
 # Run interactive REPL
 ./target/release/claw
 # One-shot prompt
 ./target/release/claw prompt "explain this codebase"
 # With specific model
 ./target/release/claw --model sonnet prompt "fix the bug in main.rs"
 ```
 ## Configuration
 Set your API credentials:
 ```bash
 export ANTHROPIC_API_KEY="sk-ant-..."
 # Or use a proxy
 export ANTHROPIC_BASE_URL="https://your-proxy.com"
 ```
 Or authenticate via OAuth:
 ```bash
 claw login
 ```
 ## Features
 | Feature | Status |
 |---------|--------|
 | Anthropic API + streaming | ✅ |
 | OAuth login/logout | ✅ |
 | Interactive REPL (rustyline) | ✅ |
 | Tool system (bash, read, write, edit, grep, glob) | ✅ |
 | Web tools (search, fetch) | ✅ |
 | Sub-agent orchestration | ✅ |
 | Todo tracking | ✅ |
 | Notebook editing | ✅ |
 | CLAUDE.md / project memory | ✅ |
 | Config file hierarchy (.claude.json) | ✅ |
 | Permission system | ✅ |
 | MCP server lifecycle | ✅ |
 | Session persistence + resume | ✅ |
 | Extended thinking (thinking blocks) | ✅ |
 | Cost tracking + usage display | ✅ |
 | Git integration | ✅ |
 | Markdown terminal rendering (ANSI) | ✅ |
 | Model aliases (opus/sonnet/haiku) | ✅ |
 | Slash commands (/status, /compact, /clear, etc.) | ✅ |
 | Hooks (PreToolUse/PostToolUse) | 🔧 Config only |
 | Plugin system | 📋 Planned |
 | Skills registry | 📋 Planned |
 ## Model Aliases
 Short names resolve to the following pinned model IDs:
 | Alias | Resolves To |
 |-------|------------|
 | `opus` | `claude-opus-4-6` |
 | `sonnet` | `claude-sonnet-4-6` |
 | `haiku` | `claude-haiku-4-5-20251213` |
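The alias table above can be read as a small lookup. The following is a minimal sketch rather than the crate's actual implementation: `resolve_model_alias` is a hypothetical name, the model IDs are copied verbatim from the table, and passing unknown names through unchanged is an assumption based on `--model` accepting an alias or a full name.

```rust
/// Hypothetical helper: maps a short alias to the model ID listed in the table.
/// Anything that is not a known alias is returned unchanged (assumed behavior).
fn resolve_model_alias(name: &str) -> &str {
    match name {
        "opus" => "claude-opus-4-6",
        "sonnet" => "claude-sonnet-4-6",
        "haiku" => "claude-haiku-4-5-20251213",
        other => other,
    }
}
```

Under these assumptions, `resolve_model_alias("sonnet")` yields `claude-sonnet-4-6`, while a full name such as `claude-sonnet-4-20250514` is left as given.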
 ## CLI Flags
 ```
 claw [OPTIONS] [COMMAND]
 Options:
  --model MODEL                    Set the model (alias or full name)
  --dangerously-skip-permissions   Skip all permission checks
  --permission-mode MODE           Set read-only, workspace-write, or danger-full-access
  --allowedTools TOOLS             Restrict enabled tools
  --output-format FORMAT           Output format (text or json)
  --version, -V                    Print version info
 Commands:
  prompt <text>      One-shot prompt (non-interactive)
  login              Authenticate via OAuth
  logout             Clear stored credentials
  init               Initialize project config
  doctor             Check environment health
  self-update        Update to latest version
 ```
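The three values accepted by `--permission-mode` lend themselves to a small enum with string parsing. This is only a sketch: the type name `PermissionMode`, its variants, and the error message are illustrative assumptions, not the types used in the `runtime` crate.

```rust
use std::str::FromStr;

/// Illustrative enum covering the three modes named in the help text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PermissionMode {
    ReadOnly,
    WorkspaceWrite,
    DangerFullAccess,
}

impl FromStr for PermissionMode {
    type Err = String;

    /// Accepts exactly the spellings shown in the CLI help; anything else is an error.
    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "read-only" => Ok(Self::ReadOnly),
            "workspace-write" => Ok(Self::WorkspaceWrite),
            "danger-full-access" => Ok(Self::DangerFullAccess),
            other => Err(format!("unsupported permission mode: {other}")),
        }
    }
}
```

Parsing up front means an unsupported mode fails at argument-parsing time instead of partway through a session.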
 ## Slash Commands (REPL)
 | Command | Description |
 |---------|-------------|
 | `/help` | Show help |
 | `/status` | Show session status (model, tokens, cost) |
 | `/cost` | Show cost breakdown |
 | `/compact` | Compact conversation history |
 | `/clear` | Clear conversation |
 | `/model [name]` | Show or switch model |
 | `/permissions` | Show or switch permission mode |
 | `/config [section]` | Show config (env, hooks, model) |
 | `/memory` | Show CLAUDE.md contents |
 | `/diff` | Show git diff |
 | `/export [path]` | Export conversation |
 | `/session [id]` | Resume a previous session |
 | `/version` | Show version |
 ## Workspace Layout
 ```
 rust/
-├── Cargo.toml
+├── Cargo.toml              # Workspace root
 ├── Cargo.lock
 ├── README.md
 └── crates/
-    ├── api/               # Anthropic API client + SSE streaming support
+    ├── api/                # Anthropic API client + SSE streaming
-    ├── commands/          # Shared slash-command metadata/help surfaces
+    ├── commands/           # Shared slash-command registry
-    ├── compat-harness/    # Upstream TS manifest extraction harness
+    ├── compat-harness/     # TS manifest extraction harness
-    ├── runtime/           # Session/runtime/config/prompt orchestration
+    ├── runtime/            # Session, config, permissions, MCP, prompts
-    ├── rusty-claude-cli/  # Main CLI binary
+    ├── rusty-claude-cli/   # Main CLI binary (`claw`)
-    └── tools/             # Built-in tool implementations
+    └── tools/              # Built-in tool implementations
 ```
-## Prerequisites
+### Crate Responsibilities
- Rust toolchain installed (`rustup`, stable toolchain)
+- **api** — HTTP client, SSE stream parser, request/response types, auth (API key + OAuth bearer)
- Network access and Anthropic credentials for live prompt/REPL usage
+- **commands** — Slash command definitions and help text generation
 - **compat-harness** — Extracts tool/prompt manifests from upstream TS source
 - **runtime** — `ConversationRuntime` agentic loop, `ConfigLoader` hierarchy, `Session` persistence, permission policy, MCP client, system prompt assembly, usage tracking
 - **rusty-claude-cli** — REPL, one-shot prompt, streaming display, tool call rendering, CLI argument parsing
 - **tools** — Tool specs + execution: Bash, ReadFile, WriteFile, EditFile, GlobSearch, GrepSearch, WebSearch, WebFetch, Agent, TodoWrite, NotebookEdit, Skill, ToolSearch, REPL runtimes
-## Build
+## Stats
-From the repository root:
+- **~20K lines** of Rust
 - **6 crates** in workspace
 - **Binary name:** `claw`
 - **Default model:** `claude-opus-4-6`
 - **Default permissions:** `danger-full-access`
-```bash
+## License
 cd rust
 cargo build --release -p rusty-claude-cli
 ```
-The optimized binary will be written to:
+See repository root.
 ```bash
 ./target/release/rusty-claude-cli
 ```
 ## Test
 Run the verified workspace test suite used for release-readiness:
 ```bash
 cd rust
 cargo test --workspace --exclude compat-harness
 ```
 ## Quick start
 ### Show help
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- --help
 ```
 ### Print version
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- --version
 ```
 ### Login with OAuth
 Configure `settings.json` with an `oauth` block containing `clientId`, `authorizeUrl`, `tokenUrl`, optional `callbackPort`, and optional `scopes`, then run:
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- login
 ```
 This opens the browser, listens on the configured localhost callback, exchanges the auth code for tokens, and stores OAuth credentials in `~/.claude/credentials.json` (or `$CLAUDE_CONFIG_HOME/credentials.json`).
 ### Logout
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- logout
 ```
 This removes only the stored OAuth credentials and preserves unrelated JSON fields in `credentials.json`.
 ### Self-update
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- self-update
 ```
 The command checks the latest GitHub release for `instructkr/clawd-code`, compares it to the current binary version, downloads the matching binary asset plus checksum manifest, verifies SHA-256, replaces the current executable, and prints the release changelog. If no published release or matching asset exists, it exits safely with an explanatory message.
 ## Usage examples
 ### 1) Prompt mode
 Send one prompt, stream the answer, then exit:
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- prompt "Summarize the architecture of this repository"
 ```
 Use a specific model:
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- --model claude-sonnet-4-20250514 prompt "List the key crates in this workspace"
 ```
 Restrict enabled tools in an interactive session:
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- --allowedTools read,glob
 ```
 Bootstrap Claude project files for the current repo:
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- init
 ```
 ### 2) REPL mode
 Start the interactive shell:
 ```bash
 cd rust
 cargo run -p rusty-claude-cli --
 ```
 Inside the REPL, useful commands include:
 ```text
 /help
 /status
 /model claude-sonnet-4-20250514
 /permissions workspace-write
 /cost
 /compact
 /memory
 /config
 /init
 /diff
 /version
 /export notes.txt
 /sessions
 /session list
 /exit
 ```
 ### 3) Resume an existing session
 Inspect or maintain a saved session file without entering the REPL:
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- --resume session-123456 /status /compact /cost
 ```
 You can also inspect memory/config state for a restored session:
 ```bash
 cd rust
 cargo run -p rusty-claude-cli -- --resume ~/.claude/sessions/session-123456.json /memory /config
 ```
 ## Available commands
 ### Top-level CLI commands
 - `prompt <text...>` — run one prompt non-interactively
 - `--resume <session-id-or-path> [/commands...]` — inspect or maintain a saved session stored under `~/.claude/sessions/`
 - `dump-manifests` — print extracted upstream manifest counts
 - `bootstrap-plan` — print the current bootstrap skeleton
 - `system-prompt [--cwd PATH] [--date YYYY-MM-DD]` — render the synthesized system prompt
 - `self-update` — update the installed binary from the latest GitHub release when a matching asset is available
 - `--help` / `-h` — show CLI help
 - `--version` / `-V` — print the CLI version and build info locally (no API call)
 - `--output-format text|json` — choose non-interactive prompt output rendering
 - `--allowedTools <tool[,tool...]>` — restrict enabled tools for interactive sessions and prompt-mode tool use
 ### Interactive slash commands
 - `/help` — show command help
 - `/status` — show current session status
 - `/compact` — compact local session history
 - `/model [model]` — inspect or switch the active model
 - `/permissions [read-only|workspace-write|danger-full-access]` — inspect or switch permissions
 - `/clear [--confirm]` — clear the current local session
 - `/cost` — show token usage totals
 - `/resume <session-id-or-path>` — load a saved session into the REPL
 - `/config [env|hooks|model]` — inspect discovered Claude config
 - `/memory` — inspect loaded instruction memory files
 - `/init` — bootstrap `.claude.json`, `.claude/`, `CLAUDE.md`, and local ignore rules
 - `/diff` — show the current git diff for the workspace
 - `/version` — print version and build metadata locally
 - `/export [file]` — export the current conversation transcript
 - `/sessions` — list recent managed local sessions from `~/.claude/sessions/`
 - `/session [list|switch <session-id>]` — inspect or switch managed local sessions
 - `/exit` — leave the REPL
 ## Environment variables
 ### Anthropic/API
 - `ANTHROPIC_API_KEY` — highest-precedence API credential
 - `ANTHROPIC_AUTH_TOKEN` — bearer-token override used when no API key is set
 - Persisted OAuth credentials in `~/.claude/credentials.json` — used when neither env var is set
 - `ANTHROPIC_BASE_URL` — override the Anthropic API base URL
 - `ANTHROPIC_MODEL` — default model used by selected live integration tests
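The credential ordering listed above (API key first, then bearer token, then saved OAuth credentials) can be sketched as a chain of fallbacks. `Credential` and `pick_credential` are hypothetical names used only for illustration. The tests elsewhere in this diff suggest the real `AuthSource::from_env` can also combine an API key with a bearer token, which this sketch deliberately ignores.

```rust
/// Hypothetical result type; the crate's real `AuthSource` is not reproduced here.
#[derive(Debug, PartialEq, Eq)]
enum Credential {
    ApiKey(String),
    BearerToken(String),
    SavedOAuth(String),
}

/// Returns the first available credential in the documented precedence order.
/// Inputs stand in for `ANTHROPIC_API_KEY`, `ANTHROPIC_AUTH_TOKEN`, and the
/// token persisted in `credentials.json`.
fn pick_credential(
    api_key: Option<&str>,
    auth_token: Option<&str>,
    saved_oauth: Option<&str>,
) -> Option<Credential> {
    api_key
        .map(|key| Credential::ApiKey(key.to_owned()))
        .or_else(|| auth_token.map(|token| Credential::BearerToken(token.to_owned())))
        .or_else(|| saved_oauth.map(|token| Credential::SavedOAuth(token.to_owned())))
}
```

With this ordering, a saved OAuth token is consulted only when neither environment variable is set.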
 ### CLI/runtime
 - `RUSTY_CLAUDE_PERMISSION_MODE` — default REPL permission mode (`read-only`, `workspace-write`, or `danger-full-access`)
 - `CLAUDE_CONFIG_HOME` — override Claude config discovery root
 - `CLAUDE_CODE_REMOTE` — enable remote-session bootstrap handling when supported
 - `CLAUDE_CODE_REMOTE_SESSION_ID` — remote session identifier when using remote mode
 - `CLAUDE_CODE_UPSTREAM` — override the upstream TS source path for compat-harness extraction
 - `CLAWD_WEB_SEARCH_BASE_URL` — override the built-in web search service endpoint used by tooling
 ## Notes
 - `compat-harness` exists to compare the Rust port against the upstream TypeScript codebase and is intentionally excluded from the requested release test run.
 - The CLI currently focuses on a practical integrated workflow: prompt execution, REPL operation, session inspection/resume, config discovery, and tool/runtime plumbing.
@@ -1,4 +1,5 @@
 use std::collections::VecDeque;
 use std::sync::{Arc, Mutex};
 use std::time::{Duration, SystemTime, UNIX_EPOCH};
 use runtime::{
@@ -8,8 +9,9 @@ use runtime::{
 use serde::Deserialize;
 use crate::error::ApiError;
 use crate::prompt_cache::{PromptCache, PromptCacheRecord, PromptCacheStats};
 use crate::sse::SseParser;
-use crate::types::{MessageRequest, MessageResponse, StreamEvent};
+use crate::types::{MessageRequest, MessageResponse, StreamEvent, Usage};
 const DEFAULT_BASE_URL: &str = "https://api.anthropic.com";
 const ANTHROPIC_VERSION: &str = "2023-06-01";
@@ -108,6 +110,8 @@ pub struct AnthropicClient {
    max_retries: u32,
    initial_backoff: Duration,
    max_backoff: Duration,
    prompt_cache: Option<PromptCache>,
    last_prompt_cache_record: Arc<Mutex<Option<PromptCacheRecord>>>,
 }
 impl AnthropicClient {
@@ -120,6 +124,8 @@ impl AnthropicClient {
            max_retries: DEFAULT_MAX_RETRIES,
            initial_backoff: DEFAULT_INITIAL_BACKOFF,
            max_backoff: DEFAULT_MAX_BACKOFF,
            prompt_cache: None,
            last_prompt_cache_record: Arc::new(Mutex::new(None)),
        }
    }
@@ -132,6 +138,8 @@ impl AnthropicClient {
            max_retries: DEFAULT_MAX_RETRIES,
            initial_backoff: DEFAULT_INITIAL_BACKOFF,
            max_backoff: DEFAULT_MAX_BACKOFF,
            prompt_cache: None,
            last_prompt_cache_record: Arc::new(Mutex::new(None)),
        }
    }
@@ -189,6 +197,30 @@ impl AnthropicClient {
        self
    }
    #[must_use]
    pub fn with_prompt_cache(mut self, prompt_cache: PromptCache) -> Self {
        self.prompt_cache = Some(prompt_cache);
        self
    }
    #[must_use]
    pub fn prompt_cache(&self) -> Option<&PromptCache> {
        self.prompt_cache.as_ref()
    }
    #[must_use]
    pub fn prompt_cache_stats(&self) -> Option<PromptCacheStats> {
        self.prompt_cache.as_ref().map(PromptCache::stats)
    }
    #[must_use]
    pub fn take_last_prompt_cache_record(&self) -> Option<PromptCacheRecord> {
        self.last_prompt_cache_record()
            .lock()
            .unwrap_or_else(std::sync::PoisonError::into_inner)
            .take()
    }
    #[must_use]
    pub fn auth_source(&self) -> &AuthSource {
        &self.auth
@@ -198,10 +230,19 @@ impl AnthropicClient {
        &self,
        request: &MessageRequest,
    ) -> Result<MessageResponse, ApiError> {
        self.store_last_prompt_cache_record(None);
        let request = MessageRequest {
            stream: false,
            ..request.clone()
        };
        if let Some(prompt_cache) = &self.prompt_cache {
            if let Some(response) = prompt_cache.lookup_completion(&request) {
                self.store_last_prompt_cache_record(Some(prompt_cache_record_from_stats(
                    prompt_cache.stats(),
                )));
                return Ok(response);
            }
        }
        let response = self.send_with_retry(&request).await?;
        let request_id = request_id_from_headers(response.headers());
        let mut response = response
@@ -211,6 +252,10 @@ impl AnthropicClient {
        if response.request_id.is_none() {
            response.request_id = request_id;
        }
        if let Some(prompt_cache) = &self.prompt_cache {
            let record = prompt_cache.record_response(&request, &response);
            self.store_last_prompt_cache_record(Some(record));
        }
        Ok(response)
    }
@@ -218,6 +263,7 @@ impl AnthropicClient {
        &self,
        request: &MessageRequest,
    ) -> Result<MessageStream, ApiError> {
        self.store_last_prompt_cache_record(None);
        let response = self
            .send_with_retry(&request.clone().with_streaming())
            .await?;
@@ -227,9 +273,30 @@ impl AnthropicClient {
            parser: SseParser::new(),
            pending: VecDeque::new(),
            done: false,
            cache_tracking: self
                .prompt_cache
                .as_ref()
                .map(|prompt_cache| StreamCacheTracking {
                    prompt_cache: prompt_cache.clone(),
                    request: request.clone().with_streaming(),
                    last_usage: None,
                    finalized: false,
                    last_record: self.last_prompt_cache_record.clone(),
                }),
        })
    }
    fn store_last_prompt_cache_record(&self, record: Option<PromptCacheRecord>) {
        *self
            .last_prompt_cache_record()
            .lock()
            .unwrap_or_else(std::sync::PoisonError::into_inner) = record;
    }
    fn last_prompt_cache_record(&self) -> &Arc<Mutex<Option<PromptCacheRecord>>> {
        &self.last_prompt_cache_record
    }
    pub async fn exchange_oauth_code(
        &self,
        config: &OAuthConfig,
@@ -527,6 +594,7 @@ pub struct MessageStream {
    parser: SseParser,
    pending: VecDeque<StreamEvent>,
    done: bool,
    cache_tracking: Option<StreamCacheTracking>,
 }
 impl MessageStream {
@@ -538,6 +606,9 @@ impl MessageStream {
    pub async fn next_event(&mut self) -> Result<Option<StreamEvent>, ApiError> {
        loop {
            if let Some(event) = self.pending.pop_front() {
                if let Some(cache_tracking) = &mut self.cache_tracking {
                    cache_tracking.observe(&event);
                }
                return Ok(Some(event));
            }
@@ -545,8 +616,14 @@ impl MessageStream {
                let remaining = self.parser.finish()?;
                self.pending.extend(remaining);
                if let Some(event) = self.pending.pop_front() {
                    if let Some(cache_tracking) = &mut self.cache_tracking {
                        cache_tracking.observe(&event);
                    }
                    return Ok(Some(event));
                }
                if let Some(cache_tracking) = &mut self.cache_tracking {
                    cache_tracking.finalize();
                }
                return Ok(None);
            }
@@ -562,6 +639,53 @@ impl MessageStream {
    }
 }
 #[derive(Debug, Clone)]
 struct StreamCacheTracking {
    prompt_cache: PromptCache,
    request: MessageRequest,
    last_usage: Option<Usage>,
    finalized: bool,
    last_record: Arc<Mutex<Option<PromptCacheRecord>>>,
 }
 impl StreamCacheTracking {
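    // Keep only the most recent usage snapshot: `MessageDelta` usage replaces
    // the value seen at `MessageStart`; all other events are ignored.
    fn observe(&mut self, event: &StreamEvent) {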
        match event {
            StreamEvent::MessageStart(event) => {
                self.last_usage = Some(event.message.usage.clone());
            }
            StreamEvent::MessageDelta(event) => {
                self.last_usage = Some(event.usage.clone());
            }
            StreamEvent::ContentBlockStart(_)
            | StreamEvent::ContentBlockDelta(_)
            | StreamEvent::ContentBlockStop(_)
            | StreamEvent::MessageStop(_) => {}
        }
    }
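    // Record the final usage exactly once. Repeat calls are no-ops, and a
    // stream that never reported usage records nothing.
    fn finalize(&mut self) {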
        if self.finalized {
            return;
        }
        if let Some(usage) = &self.last_usage {
            let record = self.prompt_cache.record_usage(&self.request, usage);
            *self
                .last_record
                .lock()
                .unwrap_or_else(std::sync::PoisonError::into_inner) = Some(record);
        }
        self.finalized = true;
    }
 }
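 // Builds the record reported for a completion-cache hit: no new response
 // usage was observed, so no cache break is attached.
 fn prompt_cache_record_from_stats(stats: PromptCacheStats) -> PromptCacheRecord {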
    PromptCacheRecord {
        cache_break: None,
        stats,
    }
 }
 async fn expect_success(response: reqwest::Response) -> Result<reqwest::Response, ApiError> {
    let status = response.status();
    if status.is_success() {
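The streaming hunks above thread a `StreamCacheTracking` value through `next_event`: each event is observed, the latest usage is remembered, and the result is recorded once when the stream ends. A stripped-down model of that pattern might look like the following. `Usage`, `Event`, and `UsageTracker` here are simplified stand-ins invented for the sketch, and the `recorded` vector replaces the real call into `PromptCache::record_usage`.

```rust
/// Simplified stand-in for the crate's `Usage` type.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Usage {
    input_tokens: u32,
    output_tokens: u32,
}

/// Simplified stand-in for `StreamEvent`; only two variants carry usage.
enum Event {
    MessageStart(Usage),
    MessageDelta(Usage),
    ContentDelta,
    MessageStop,
}

#[derive(Default)]
struct UsageTracker {
    last_usage: Option<Usage>,
    finalized: bool,
    /// Stands in for the prompt cache that receives the final usage.
    recorded: Vec<Usage>,
}

impl UsageTracker {
    /// Remembers the newest usage snapshot and ignores every other event.
    fn observe(&mut self, event: &Event) {
        match event {
            Event::MessageStart(usage) | Event::MessageDelta(usage) => {
                self.last_usage = Some(usage.clone());
            }
            Event::ContentDelta | Event::MessageStop => {}
        }
    }

    /// Records the last snapshot at most once, even if called repeatedly.
    fn finalize(&mut self) {
        if self.finalized {
            return;
        }
        if let Some(usage) = self.last_usage.take() {
            self.recorded.push(usage);
        }
        self.finalized = true;
    }
}
```

The `finalized` flag matters because the end-of-stream branch can be reached more than once if the caller keeps polling after the stream is exhausted.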
@@ -606,7 +730,7 @@ mod tests {
    use super::{ALT_REQUEST_ID_HEADER, REQUEST_ID_HEADER};
    use std::io::{Read, Write};
    use std::net::TcpListener;
-    use std::sync::{Mutex, OnceLock};
+    use std::sync::atomic::{AtomicU64, Ordering};
    use std::thread;
    use std::time::{Duration, SystemTime, UNIX_EPOCH};
@@ -616,19 +740,15 @@ mod tests {
        now_unix_timestamp, oauth_token_is_expired, resolve_saved_oauth_token,
        resolve_startup_auth_source, AnthropicClient, AuthSource, OAuthTokenSet,
    };
    use crate::test_env_lock;
    use crate::types::{ContentBlockDelta, MessageRequest};
    fn env_lock() -> std::sync::MutexGuard<'static, ()> {
        static LOCK: OnceLock<Mutex<()>> = OnceLock::new();
        LOCK.get_or_init(|| Mutex::new(()))
            .lock()
            .expect("env lock")
    }
    fn temp_config_home() -> std::path::PathBuf {
        static NEXT_ID: AtomicU64 = AtomicU64::new(0);
        std::env::temp_dir().join(format!(
-            "api-oauth-test-{}-{}",
+            "api-oauth-test-{}-{}-{}",
            std::process::id(),
            NEXT_ID.fetch_add(1, Ordering::Relaxed),
            SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .expect("time")
@@ -668,7 +788,7 @@ mod tests {
    #[test]
    fn read_api_key_requires_presence() {
-        let _guard = env_lock();
+        let _guard = test_env_lock();
        std::env::remove_var("ANTHROPIC_AUTH_TOKEN");
        std::env::remove_var("ANTHROPIC_API_KEY");
        std::env::remove_var("CLAUDE_CONFIG_HOME");
@@ -678,7 +798,7 @@ mod tests {
    #[test]
    fn read_api_key_requires_non_empty_value() {
-        let _guard = env_lock();
+        let _guard = test_env_lock();
        std::env::set_var("ANTHROPIC_AUTH_TOKEN", "");
        std::env::remove_var("ANTHROPIC_API_KEY");
        let error = super::read_api_key().expect_err("empty key should error");
@@ -688,7 +808,7 @@ mod tests {
    #[test]
    fn read_api_key_prefers_api_key_env() {
-        let _guard = env_lock();
+        let _guard = test_env_lock();
        std::env::set_var("ANTHROPIC_AUTH_TOKEN", "auth-token");
        std::env::set_var("ANTHROPIC_API_KEY", "legacy-key");
        assert_eq!(
@@ -701,7 +821,7 @@ mod tests {
    #[test]
    fn read_auth_token_reads_auth_token_env() {
-        let _guard = env_lock();
+        let _guard = test_env_lock();
        std::env::set_var("ANTHROPIC_AUTH_TOKEN", "auth-token");
        assert_eq!(super::read_auth_token().as_deref(), Some("auth-token"));
        std::env::remove_var("ANTHROPIC_AUTH_TOKEN");
@@ -721,7 +841,7 @@ mod tests {
    #[test]
    fn auth_source_from_env_combines_api_key_and_bearer_token() {
-        let _guard = env_lock();
+        let _guard = test_env_lock();
        std::env::set_var("ANTHROPIC_AUTH_TOKEN", "auth-token");
        std::env::set_var("ANTHROPIC_API_KEY", "legacy-key");
        let auth = AuthSource::from_env().expect("env auth");
@@ -733,7 +853,7 @@ mod tests {
    #[test]
    fn auth_source_from_saved_oauth_when_env_absent() {
-        let _guard = env_lock();
+        let _guard = test_env_lock();
        let config_home = temp_config_home();
        std::env::set_var("CLAUDE_CONFIG_HOME", &config_home);
        std::env::remove_var("ANTHROPIC_AUTH_TOKEN");
@@ -772,7 +892,7 @@ mod tests {
    #[test]
    fn resolve_saved_oauth_token_refreshes_expired_credentials() {
-        let _guard = env_lock();
+        let _guard = test_env_lock();
        let config_home = temp_config_home();
        std::env::set_var("CLAUDE_CONFIG_HOME", &config_home);
        std::env::remove_var("ANTHROPIC_AUTH_TOKEN");
@@ -804,7 +924,7 @@ mod tests {
    #[test]
    fn resolve_startup_auth_source_uses_saved_oauth_without_loading_config() {
-        let _guard = env_lock();
+        let _guard = test_env_lock();
        let config_home = temp_config_home();
        std::env::set_var("CLAUDE_CONFIG_HOME", &config_home);
        std::env::remove_var("ANTHROPIC_AUTH_TOKEN");
@@ -828,7 +948,7 @@ mod tests {
    #[test]
    fn resolve_startup_auth_source_errors_when_refreshable_token_lacks_config() {
-        let _guard = env_lock();
+        let _guard = test_env_lock();
        let config_home = temp_config_home();
        std::env::set_var("CLAUDE_CONFIG_HOME", &config_home);
        std::env::remove_var("ANTHROPIC_AUTH_TOKEN");
@@ -860,7 +980,7 @@ mod tests {
    #[test]
    fn resolve_saved_oauth_token_preserves_refresh_token_when_refresh_response_omits_it() {
-        let _guard = env_lock();
+        let _guard = test_env_lock();
        let config_home = temp_config_home();
        std::env::set_var("CLAUDE_CONFIG_HOME", &config_home);
        std::env::remove_var("ANTHROPIC_AUTH_TOKEN");
@@ -1,13 +1,18 @@
 mod client;
 mod error;
 mod prompt_cache;
 mod sse;
 mod types;
 pub use client::{
-    oauth_token_is_expired, read_base_url, resolve_saved_oauth_token,
+    oauth_token_is_expired, read_base_url, resolve_saved_oauth_token, resolve_startup_auth_source,
-    resolve_startup_auth_source, AnthropicClient, AuthSource, MessageStream, OAuthTokenSet,
+    AnthropicClient, AuthSource, MessageStream, OAuthTokenSet,
 };
 pub use error::ApiError;
 pub use prompt_cache::{
    CacheBreakEvent, PromptCache, PromptCacheConfig, PromptCachePaths, PromptCacheRecord,
    PromptCacheStats,
 };
 pub use sse::{parse_frame, SseParser};
 pub use types::{
    ContentBlockDelta, ContentBlockDeltaEvent, ContentBlockStartEvent, ContentBlockStopEvent,
@@ -15,3 +20,11 @@ pub use types::{
    MessageResponse, MessageStartEvent, MessageStopEvent, OutputContentBlock, StreamEvent,
    ToolChoice, ToolDefinition, ToolResultContentBlock, Usage,
 };
 #[cfg(test)]
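 // Serializes tests that mutate process-wide environment variables. A poisoned
 // lock is recovered so one panicking test does not fail every later test.
 pub(crate) fn test_env_lock() -> std::sync::MutexGuard<'static, ()> {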
    static LOCK: std::sync::OnceLock<std::sync::Mutex<()>> = std::sync::OnceLock::new();
    LOCK.get_or_init(|| std::sync::Mutex::new(()))
        .lock()
        .unwrap_or_else(std::sync::PoisonError::into_inner)
 }
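The switch from `.expect("env lock")` to `unwrap_or_else(PoisonError::into_inner)` changes what happens after a test panics while holding the guard. The sketch below illustrates the difference using only the standard library. `lock_cell`, `shared_lock`, and `poison_shared_lock` are names made up for this example.

```rust
use std::sync::{Mutex, MutexGuard, OnceLock, PoisonError};
use std::thread;

/// Process-wide mutex, created lazily on first use.
fn lock_cell() -> &'static Mutex<()> {
    static LOCK: OnceLock<Mutex<()>> = OnceLock::new();
    LOCK.get_or_init(|| Mutex::new(()))
}

/// Acquires the lock and recovers the guard even if the mutex is poisoned.
fn shared_lock() -> MutexGuard<'static, ()> {
    lock_cell().lock().unwrap_or_else(PoisonError::into_inner)
}

/// Poisons the mutex by panicking on another thread while the guard is held.
fn poison_shared_lock() {
    let result = thread::spawn(|| {
        let _guard = shared_lock();
        panic!("simulated test failure while holding the lock");
    })
    .join();
    assert!(result.is_err());
}
```

After `poison_shared_lock()` runs, `lock_cell().lock()` returns an `Err`, so a helper built on `.expect(...)` would panic in every later caller, whereas `shared_lock()` still hands back a usable guard.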
@@ -0,0 +1,727 @@
 use std::fs;
 use std::path::{Path, PathBuf};
 use std::sync::{Arc, Mutex};
 use std::time::{Duration, SystemTime, UNIX_EPOCH};
 use serde::{Deserialize, Serialize};
 use crate::types::{MessageRequest, MessageResponse, Usage};
 const DEFAULT_COMPLETION_TTL_SECS: u64 = 30;
 const DEFAULT_PROMPT_TTL_SECS: u64 = 5 * 60;
 const DEFAULT_BREAK_MIN_DROP: u32 = 2_000;
 const MAX_SANITIZED_LENGTH: usize = 80;
 const REQUEST_FINGERPRINT_VERSION: u32 = 1;
 const REQUEST_FINGERPRINT_PREFIX: &str = "v1";
 const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
 const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;
 #[derive(Debug, Clone)]
 pub struct PromptCacheConfig {
    pub session_id: String,
    pub completion_ttl: Duration,
    pub prompt_ttl: Duration,
    pub cache_break_min_drop: u32,
 }
 impl PromptCacheConfig {
    #[must_use]
    pub fn new(session_id: impl Into<String>) -> Self {
        Self {
            session_id: session_id.into(),
            completion_ttl: Duration::from_secs(DEFAULT_COMPLETION_TTL_SECS),
            prompt_ttl: Duration::from_secs(DEFAULT_PROMPT_TTL_SECS),
            cache_break_min_drop: DEFAULT_BREAK_MIN_DROP,
        }
    }
 }
 impl Default for PromptCacheConfig {
    fn default() -> Self {
        Self::new("default")
    }
 }
 #[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
 pub struct PromptCachePaths {
    pub root: PathBuf,
    pub session_dir: PathBuf,
    pub completion_dir: PathBuf,
    pub session_state_path: PathBuf,
    pub stats_path: PathBuf,
 }
 impl PromptCachePaths {
    #[must_use]
    pub fn for_session(session_id: &str) -> Self {
        let root = base_cache_root();
        let session_dir = root.join(sanitize_path_segment(session_id));
        let completion_dir = session_dir.join("completions");
        Self {
            root,
            session_state_path: session_dir.join("session-state.json"),
            stats_path: session_dir.join("stats.json"),
            session_dir,
            completion_dir,
        }
    }
    #[must_use]
    pub fn completion_entry_path(&self, request_hash: &str) -> PathBuf {
        self.completion_dir.join(format!("{request_hash}.json"))
    }
 }
 #[derive(Debug, Clone, Default, PartialEq, Eq, Serialize, Deserialize)]
 pub struct PromptCacheStats {
    pub tracked_requests: u64,
    pub completion_cache_hits: u64,
    pub completion_cache_misses: u64,
    pub completion_cache_writes: u64,
    pub expected_invalidations: u64,
    pub unexpected_cache_breaks: u64,
    pub total_cache_creation_input_tokens: u64,
    pub total_cache_read_input_tokens: u64,
    pub last_cache_creation_input_tokens: Option<u32>,
    pub last_cache_read_input_tokens: Option<u32>,
    pub last_request_hash: Option<String>,
    pub last_completion_cache_key: Option<String>,
    pub last_break_reason: Option<String>,
    pub last_cache_source: Option<String>,
 }
 #[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
 pub struct CacheBreakEvent {
    pub unexpected: bool,
    pub reason: String,
    pub previous_cache_read_input_tokens: u32,
    pub current_cache_read_input_tokens: u32,
    pub token_drop: u32,
 }
 #[derive(Debug, Clone, PartialEq, Eq)]
 pub struct PromptCacheRecord {
    pub cache_break: Option<CacheBreakEvent>,
    pub stats: PromptCacheStats,
 }
 #[derive(Debug, Clone)]
 pub struct PromptCache {
    inner: Arc<Mutex<PromptCacheInner>>,
 }
 impl PromptCache {
    #[must_use]
    pub fn new(session_id: impl Into<String>) -> Self {
        Self::with_config(PromptCacheConfig::new(session_id))
    }
    #[must_use]
    pub fn with_config(config: PromptCacheConfig) -> Self {
        let paths = PromptCachePaths::for_session(&config.session_id);
        let stats = read_json::<PromptCacheStats>(&paths.stats_path).unwrap_or_default();
        let previous = read_json::<TrackedPromptState>(&paths.session_state_path);
        Self {
            inner: Arc::new(Mutex::new(PromptCacheInner {
                config,
                paths,
                stats,
                previous,
            })),
        }
    }
    #[must_use]
    pub fn paths(&self) -> PromptCachePaths {
        self.lock().paths.clone()
    }
    #[must_use]
    pub fn stats(&self) -> PromptCacheStats {
        self.lock().stats.clone()
    }
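    /// Returns the stored response for an identical request, provided the
    /// entry exists, matches the current fingerprint version, and is younger
    /// than `completion_ttl`. Stale or mismatched entries are deleted.
    /// Every call updates and persists the hit/miss statistics.
    #[must_use]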
    pub fn lookup_completion(&self, request: &MessageRequest) -> Option<MessageResponse> {
        let request_hash = request_hash_hex(request);
        let (paths, ttl) = {
            let inner = self.lock();
            (inner.paths.clone(), inner.config.completion_ttl)
        };
        let entry_path = paths.completion_entry_path(&request_hash);
        let entry = read_json::<CompletionCacheEntry>(&entry_path);
        let Some(entry) = entry else {
            let mut inner = self.lock();
            inner.stats.completion_cache_misses += 1;
            inner.stats.last_completion_cache_key = Some(request_hash);
            persist_state(&inner);
            return None;
        };
        if entry.fingerprint_version != current_fingerprint_version() {
            let mut inner = self.lock();
            inner.stats.completion_cache_misses += 1;
            inner.stats.last_completion_cache_key = Some(request_hash);
            let _ = fs::remove_file(entry_path);
            persist_state(&inner);
            return None;
        }
        let expired = now_unix_secs().saturating_sub(entry.cached_at_unix_secs) >= ttl.as_secs();
        let mut inner = self.lock();
        inner.stats.last_completion_cache_key = Some(request_hash.clone());
        if expired {
            inner.stats.completion_cache_misses += 1;
            let _ = fs::remove_file(entry_path);
            persist_state(&inner);
            return None;
        }
        inner.stats.completion_cache_hits += 1;
        apply_usage_to_stats(
            &mut inner.stats,
            &entry.response.usage,
            &request_hash,
            "completion-cache",
        );
        inner.previous = Some(TrackedPromptState::from_usage(
            request,
            &entry.response.usage,
        ));
        persist_state(&inner);
        Some(entry.response)
    }
    /// Records the usage of `response` and writes it to the completion cache.
    #[must_use]
    pub fn record_response(
        &self,
        request: &MessageRequest,
        response: &MessageResponse,
    ) -> PromptCacheRecord {
        self.record_usage_internal(request, &response.usage, Some(response))
    }
    /// Records usage only; no completion cache entry is written.
    #[must_use]
    pub fn record_usage(&self, request: &MessageRequest, usage: &Usage) -> PromptCacheRecord {
        self.record_usage_internal(request, usage, None)
    }
    fn record_usage_internal(
        &self,
        request: &MessageRequest,
        usage: &Usage,
        response: Option<&MessageResponse>,
    ) -> PromptCacheRecord {
        let request_hash = request_hash_hex(request);
        let mut inner = self.lock();
        let previous = inner.previous.clone();
        let current = TrackedPromptState::from_usage(request, usage);
        let cache_break = detect_cache_break(&inner.config, previous.as_ref(), &current);
        inner.stats.tracked_requests += 1;
        apply_usage_to_stats(&mut inner.stats, usage, &request_hash, "api-response");
        if let Some(event) = &cache_break {
            if event.unexpected {
                inner.stats.unexpected_cache_breaks += 1;
            } else {
                inner.stats.expected_invalidations += 1;
            }
            inner.stats.last_break_reason = Some(event.reason.clone());
        }
        inner.previous = Some(current);
        if let Some(response) = response {
            write_completion_entry(&inner.paths, &request_hash, response);
            inner.stats.completion_cache_writes += 1;
        }
        persist_state(&inner);
        PromptCacheRecord {
            cache_break,
            stats: inner.stats.clone(),
        }
    }
    // A poisoned mutex is recovered rather than propagated as a panic.
    fn lock(&self) -> std::sync::MutexGuard<'_, PromptCacheInner> {
        self.inner
            .lock()
            .unwrap_or_else(std::sync::PoisonError::into_inner)
    }
 }
 #[derive(Debug)]
 struct PromptCacheInner {
    config: PromptCacheConfig,
    paths: PromptCachePaths,
    stats: PromptCacheStats,
    previous: Option<TrackedPromptState>,
 }
 #[derive(Debug, Clone, Serialize, Deserialize)]
 struct CompletionCacheEntry {
    cached_at_unix_secs: u64,
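    // Entries lacking this field deserialize as the current version, so the
    // version check in `lookup_completion` does not invalidate them.
    #[serde(default = "current_fingerprint_version")]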
    fingerprint_version: u32,
    response: MessageResponse,
 }
 #[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
 struct TrackedPromptState {
    observed_at_unix_secs: u64,
    #[serde(default = "current_fingerprint_version")]
    fingerprint_version: u32,
    model_hash: u64,
    system_hash: u64,
    tools_hash: u64,
    messages_hash: u64,
    cache_read_input_tokens: u32,
 }
 impl TrackedPromptState {
    fn from_usage(request: &MessageRequest, usage: &Usage) -> Self {
        let hashes = RequestFingerprints::from_request(request);
        Self {
            observed_at_unix_secs: now_unix_secs(),
            fingerprint_version: current_fingerprint_version(),
            model_hash: hashes.model,
            system_hash: hashes.system,
            tools_hash: hashes.tools,
            messages_hash: hashes.messages,
            cache_read_input_tokens: usage.cache_read_input_tokens,
        }
    }
 }
 #[derive(Debug, Clone, Copy)]
 struct RequestFingerprints {
    model: u64,
    system: u64,
    tools: u64,
    messages: u64,
 }
 impl RequestFingerprints {
    fn from_request(request: &MessageRequest) -> Self {
        Self {
            model: hash_serializable(&request.model),
            system: hash_serializable(&request.system),
            tools: hash_serializable(&request.tools),
            messages: hash_serializable(&request.messages),
        }
    }
 }
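 /// Classifies a drop in cache-read tokens between two consecutive requests.
 ///
 /// Returns `None` when there is no previous state. A fingerprint version
 /// change is reported as an expected break regardless of the token drop.
 /// Otherwise returns `None` when the drop is below `cache_break_min_drop`.
 /// A qualifying drop is expected if the model, system prompt, tools, or
 /// messages changed, or if more than `prompt_ttl` elapsed; it is unexpected
 /// only when the fingerprint is unchanged within the TTL.
 fn detect_cache_break(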
    config: &PromptCacheConfig,
    previous: Option<&TrackedPromptState>,
    current: &TrackedPromptState,
 ) -> Option<CacheBreakEvent> {
    let previous = previous?;
    if previous.fingerprint_version != current.fingerprint_version {
        return Some(CacheBreakEvent {
            unexpected: false,
            reason: format!(
                "fingerprint version changed (v{} -> v{})",
                previous.fingerprint_version, current.fingerprint_version
            ),
            previous_cache_read_input_tokens: previous.cache_read_input_tokens,
            current_cache_read_input_tokens: current.cache_read_input_tokens,
            token_drop: previous
                .cache_read_input_tokens
                .saturating_sub(current.cache_read_input_tokens),
        });
    }
    let token_drop = previous
        .cache_read_input_tokens
        .saturating_sub(current.cache_read_input_tokens);
    if token_drop < config.cache_break_min_drop {
        return None;
    }
    let mut reasons = Vec::new();
    if previous.model_hash != current.model_hash {
        reasons.push("model changed");
    }
    if previous.system_hash != current.system_hash {
        reasons.push("system prompt changed");
    }
    if previous.tools_hash != current.tools_hash {
        reasons.push("tool definitions changed");
    }
    if previous.messages_hash != current.messages_hash {
        reasons.push("message payload changed");
    }
    let elapsed = current
        .observed_at_unix_secs
        .saturating_sub(previous.observed_at_unix_secs);
    let (unexpected, reason) = if reasons.is_empty() {
        if elapsed > config.prompt_ttl.as_secs() {
            (
                false,
                format!("possible prompt cache TTL expiry after {elapsed}s"),
            )
        } else {
            (
                true,
                "cache read tokens dropped while prompt fingerprint remained stable".to_string(),
            )
        }
    } else {
        (false, reasons.join(", "))
    };
    Some(CacheBreakEvent {
        unexpected,
        reason,
        previous_cache_read_input_tokens: previous.cache_read_input_tokens,
        current_cache_read_input_tokens: current.cache_read_input_tokens,
        token_drop,
    })
 }
 fn apply_usage_to_stats(
    stats: &mut PromptCacheStats,
    usage: &Usage,
    request_hash: &str,
    source: &str,
 ) {
    stats.total_cache_creation_input_tokens += u64::from(usage.cache_creation_input_tokens);
    stats.total_cache_read_input_tokens += u64::from(usage.cache_read_input_tokens);
    stats.last_cache_creation_input_tokens = Some(usage.cache_creation_input_tokens);
    stats.last_cache_read_input_tokens = Some(usage.cache_read_input_tokens);
    stats.last_request_hash = Some(request_hash.to_string());
    stats.last_cache_source = Some(source.to_string());
 }
 // Best-effort persistence: directory creation and write errors are ignored.
 fn persist_state(inner: &PromptCacheInner) {
    let _ = ensure_cache_dirs(&inner.paths);
    let _ = write_json(&inner.paths.stats_path, &inner.stats);
    if let Some(previous) = &inner.previous {
        let _ = write_json(&inner.paths.session_state_path, previous);
    }
 }
 fn write_completion_entry(
    paths: &PromptCachePaths,
    request_hash: &str,
    response: &MessageResponse,
 ) {
    let _ = ensure_cache_dirs(paths);
    let entry = CompletionCacheEntry {
        cached_at_unix_secs: now_unix_secs(),
        fingerprint_version: current_fingerprint_version(),
        response: response.clone(),
    };
    let _ = write_json(&paths.completion_entry_path(request_hash), &entry);
 }
 fn ensure_cache_dirs(paths: &PromptCachePaths) -> std::io::Result<()> {
    fs::create_dir_all(&paths.completion_dir)
 }
 fn write_json<T: Serialize>(path: &Path, value: &T) -> std::io::Result<()> {
    let json = serde_json::to_vec_pretty(value)
        .map_err(|error| std::io::Error::new(std::io::ErrorKind::InvalidData, error))?;
    fs::write(path, json)
 }
 fn read_json<T: for<'de> Deserialize<'de>>(path: &Path) -> Option<T> {
    let bytes = fs::read(path).ok()?;
    serde_json::from_slice(&bytes).ok()
 }
 fn request_hash_hex(request: &MessageRequest) -> String {
    format!(
        "{REQUEST_FINGERPRINT_PREFIX}-{:016x}",
        hash_serializable(request)
    )
 }
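 // Hashes the JSON encoding of `value`. If serialization fails, the empty
 // byte string is hashed, so all such values share one hash.
 fn hash_serializable<T: Serialize>(value: &T) -> u64 {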
    let json = serde_json::to_vec(value).unwrap_or_default();
    stable_hash_bytes(&json)
 }
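 // Replaces every character that is not ASCII alphanumeric with `-`. The
 // result is pure ASCII, so the byte slice below cannot split a character.
 // Values longer than `MAX_SANITIZED_LENGTH` are truncated and suffixed with
 // a hash of the original value.
 fn sanitize_path_segment(value: &str) -> String {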
    let sanitized: String = value
        .chars()
        .map(|ch| if ch.is_ascii_alphanumeric() { ch } else { '-' })
        .collect();
    if sanitized.len() <= MAX_SANITIZED_LENGTH {
        return sanitized;
    }
    let suffix = format!("-{:x}", hash_string(value));
    format!(
        "{}{}",
        &sanitized[..MAX_SANITIZED_LENGTH.saturating_sub(suffix.len())],
        suffix
    )
 }
 fn hash_string(value: &str) -> u64 {
    stable_hash_bytes(value.as_bytes())
 }
 fn base_cache_root() -> PathBuf {
    if let Some(config_home) = std::env::var_os("CLAUDE_CONFIG_HOME") {
        return PathBuf::from(config_home)
            .join("cache")
            .join("prompt-cache");
    }
    if let Some(home) = std::env::var_os("HOME") {
        return PathBuf::from(home)
            .join(".claude")
            .join("cache")
            .join("prompt-cache");
    }
    std::env::temp_dir().join("claude-prompt-cache")
 }
 fn now_unix_secs() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map_or(0, |duration| duration.as_secs())
 }
 const fn current_fingerprint_version() -> u32 {
    REQUEST_FINGERPRINT_VERSION
 }
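 // FNV-1a-style hash: XOR each byte in, then multiply by the prime.
 // The result depends only on the input bytes, so it is safe to persist.
 fn stable_hash_bytes(bytes: &[u8]) -> u64 {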
    let mut hash = FNV_OFFSET_BASIS;
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
 }
 #[cfg(test)]
 mod tests {
    use std::time::{Duration, SystemTime, UNIX_EPOCH};
    use super::{
        detect_cache_break, read_json, request_hash_hex, sanitize_path_segment, PromptCache,
        PromptCacheConfig, PromptCachePaths, TrackedPromptState, REQUEST_FINGERPRINT_PREFIX,
    };
    use crate::test_env_lock;
    use crate::types::{InputMessage, MessageRequest, MessageResponse, OutputContentBlock, Usage};
    #[test]
    fn path_builder_sanitizes_session_identifier() {
        let paths = PromptCachePaths::for_session("session:/with spaces");
        let session_dir = paths
            .session_dir
            .file_name()
            .and_then(|value| value.to_str())
            .expect("session dir name");
        assert_eq!(session_dir, "session--with-spaces");
        assert!(paths.completion_dir.ends_with("completions"));
        assert!(paths.stats_path.ends_with("stats.json"));
        assert!(paths.session_state_path.ends_with("session-state.json"));
    }
    #[test]
    fn request_fingerprint_drives_unexpected_break_detection() {
        let request = sample_request("same");
        let previous = TrackedPromptState::from_usage(
            &request,
            &Usage {
                input_tokens: 0,
                cache_creation_input_tokens: 0,
                cache_read_input_tokens: 6_000,
                output_tokens: 0,
            },
        );
        let current = TrackedPromptState::from_usage(
            &request,
            &Usage {
                input_tokens: 0,
                cache_creation_input_tokens: 0,
                cache_read_input_tokens: 1_000,
                output_tokens: 0,
            },
        );
        let event = detect_cache_break(&PromptCacheConfig::default(), Some(&previous), &current)
            .expect("break should be detected");
        assert!(event.unexpected);
        assert!(event.reason.contains("stable"));
    }
    #[test]
    fn changed_prompt_marks_break_as_expected() {
        let previous_request = sample_request("first");
        let current_request = sample_request("second");
        let previous = TrackedPromptState::from_usage(
            &previous_request,
            &Usage {
                input_tokens: 0,
                cache_creation_input_tokens: 0,
                cache_read_input_tokens: 6_000,
                output_tokens: 0,
            },
        );
        let current = TrackedPromptState::from_usage(
            &current_request,
            &Usage {
                input_tokens: 0,
                cache_creation_input_tokens: 0,
                cache_read_input_tokens: 1_000,
                output_tokens: 0,
            },
        );
        let event = detect_cache_break(&PromptCacheConfig::default(), Some(&previous), &current)
            .expect("break should be detected");
        assert!(!event.unexpected);
        assert!(event.reason.contains("message payload changed"));
    }
    #[test]
    fn completion_cache_round_trip_persists_recent_response() {
        let _guard = test_env_lock();
        let temp_root = std::env::temp_dir().join(format!(
            "prompt-cache-test-{}-{}",
            std::process::id(),
            SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .expect("time")
                .as_nanos()
        ));
        std::env::set_var("CLAUDE_CONFIG_HOME", &temp_root);
        let cache = PromptCache::new("unit-test-session");
        let request = sample_request("cache me");
        let response = sample_response(42, 12, "cached");
        assert!(cache.lookup_completion(&request).is_none());
        let record = cache.record_response(&request, &response);
        assert!(record.cache_break.is_none());
        let cached = cache
            .lookup_completion(&request)
            .expect("cached response should load");
        assert_eq!(cached.content, response.content);
        let stats = cache.stats();
        assert_eq!(stats.completion_cache_hits, 1);
        assert_eq!(stats.completion_cache_misses, 1);
        assert_eq!(stats.completion_cache_writes, 1);
        let persisted = read_json::<super::PromptCacheStats>(&cache.paths().stats_path)
            .expect("stats should persist");
        assert_eq!(persisted.completion_cache_hits, 1);
        std::fs::remove_dir_all(temp_root).expect("cleanup temp root");
        std::env::remove_var("CLAUDE_CONFIG_HOME");
    }
    #[test]
    fn distinct_requests_do_not_collide_in_completion_cache() {
        let _guard = test_env_lock();
        let temp_root = std::env::temp_dir().join(format!(
            "prompt-cache-distinct-{}-{}",
            std::process::id(),
            SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .expect("time")
                .as_nanos()
        ));
        std::env::set_var("CLAUDE_CONFIG_HOME", &temp_root);
        let cache = PromptCache::new("distinct-request-session");
        let first_request = sample_request("first");
        let second_request = sample_request("second");
        let response = sample_response(42, 12, "cached");
        let _ = cache.record_response(&first_request, &response);
        assert!(cache.lookup_completion(&second_request).is_none());
        std::fs::remove_dir_all(temp_root).expect("cleanup temp root");
        std::env::remove_var("CLAUDE_CONFIG_HOME");
    }
    #[test]
    fn expired_completion_entries_are_not_reused() {
        let _guard = test_env_lock();
        let temp_root = std::env::temp_dir().join(format!(
            "prompt-cache-expired-{}-{}",
            std::process::id(),
            SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .expect("time")
                .as_nanos()
        ));
        std::env::set_var("CLAUDE_CONFIG_HOME", &temp_root);
        let cache = PromptCache::with_config(PromptCacheConfig {
            session_id: "expired-session".to_string(),
            completion_ttl: Duration::ZERO,
            ..PromptCacheConfig::default()
        });
        let request = sample_request("expire me");
        let response = sample_response(7, 3, "stale");
        let _ = cache.record_response(&request, &response);
        assert!(cache.lookup_completion(&request).is_none());
        let stats = cache.stats();
        assert_eq!(stats.completion_cache_hits, 0);
        assert_eq!(stats.completion_cache_misses, 1);
        std::fs::remove_dir_all(temp_root).expect("cleanup temp root");
        std::env::remove_var("CLAUDE_CONFIG_HOME");
    }
    #[test]
    fn sanitize_path_caps_long_values() {
        let long_value = "x".repeat(200);
        let sanitized = sanitize_path_segment(&long_value);
        assert!(sanitized.len() <= 80);
    }
    #[test]
    fn request_hashes_are_versioned_and_stable() {
        let request = sample_request("stable");
        let first = request_hash_hex(&request);
        let second = request_hash_hex(&request);
        assert_eq!(first, second);
        assert!(first.starts_with(REQUEST_FINGERPRINT_PREFIX));
    }
    fn sample_request(text: &str) -> MessageRequest {
        MessageRequest {
            model: "claude-3-7-sonnet-latest".to_string(),
            max_tokens: 64,
            messages: vec![InputMessage::user_text(text)],
            system: Some("system".to_string()),
            tools: None,
            tool_choice: None,
            stream: false,
        }
    }
    fn sample_response(
        cache_read_input_tokens: u32,
        output_tokens: u32,
        text: &str,
    ) -> MessageResponse {
        MessageResponse {
            id: "msg_test".to_string(),
            kind: "message".to_string(),
            role: "assistant".to_string(),
            content: vec![OutputContentBlock::Text {
                text: text.to_string(),
            }],
            model: "claude-3-7-sonnet-latest".to_string(),
            stop_reason: Some("end_turn".to_string()),
            stop_sequence: None,
            usage: Usage {
                input_tokens: 10,
                cache_creation_input_tokens: 5,
                cache_read_input_tokens,
                output_tokens,
            },
            request_id: Some("req_test".to_string()),
        }
    }
 }
@@ -1,17 +1,25 @@
 use std::collections::HashMap;
 use std::sync::Arc;
 use std::sync::{Mutex as StdMutex, OnceLock};
 use std::time::Duration;
 use api::{
    AnthropicClient, ApiError, ContentBlockDelta, ContentBlockDeltaEvent, ContentBlockStartEvent,
    InputContentBlock, InputMessage, MessageDeltaEvent, MessageRequest, OutputContentBlock,
-    StreamEvent, ToolChoice, ToolDefinition,
+    PromptCache, StreamEvent, ToolChoice, ToolDefinition,
 };
 use serde_json::json;
 use tokio::io::{AsyncReadExt, AsyncWriteExt};
 use tokio::net::TcpListener;
 use tokio::sync::Mutex;
 fn env_lock() -> std::sync::MutexGuard<'static, ()> {
    static LOCK: OnceLock<StdMutex<()>> = OnceLock::new();
    LOCK.get_or_init(|| StdMutex::new(()))
        .lock()
        .unwrap_or_else(std::sync::PoisonError::into_inner)
 }
 #[tokio::test]
 async fn send_message_posts_json_and_parses_response() {
    let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));
@@ -45,6 +53,8 @@ async fn send_message_posts_json_and_parses_response() {
    assert_eq!(response.id, "msg_test");
    assert_eq!(response.total_tokens(), 16);
    assert_eq!(response.request_id.as_deref(), Some("req_body_123"));
    assert_eq!(response.usage.cache_creation_input_tokens, 0);
    assert_eq!(response.usage.cache_read_input_tokens, 0);
    assert_eq!(
        response.content,
        vec![OutputContentBlock::Text {
@@ -76,11 +86,55 @@ async fn send_message_posts_json_and_parses_response() {
 }
 #[tokio::test]
 async fn send_message_parses_prompt_cache_token_usage_from_response() {
    let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));
    let body = concat!(
        "{",
        "\"id\":\"msg_cache_tokens\",",
        "\"type\":\"message\",",
        "\"role\":\"assistant\",",
        "\"content\":[{\"type\":\"text\",\"text\":\"Cache tokens\"}],",
        "\"model\":\"claude-3-7-sonnet-latest\",",
        "\"stop_reason\":\"end_turn\",",
        "\"stop_sequence\":null,",
        "\"usage\":{\"input_tokens\":12,\"cache_creation_input_tokens\":321,\"cache_read_input_tokens\":654,\"output_tokens\":4}",
        "}"
    );
    let server = spawn_server(
        state,
        vec![http_response("200 OK", "application/json", body)],
    )
    .await;
    let client = AnthropicClient::new("test-key").with_base_url(server.base_url());
    let response = client
        .send_message(&sample_request(false))
        .await
        .expect("request should succeed");
    assert_eq!(response.usage.input_tokens, 12);
    assert_eq!(response.usage.cache_creation_input_tokens, 321);
    assert_eq!(response.usage.cache_read_input_tokens, 654);
    assert_eq!(response.usage.output_tokens, 4);
 }
 #[tokio::test]
 #[allow(clippy::await_holding_lock)]
 async fn stream_message_parses_sse_events_with_tool_use() {
    let _guard = env_lock();
    let temp_root = std::env::temp_dir().join(format!(
        "api-stream-cache-{}-{}",
        std::process::id(),
        std::time::SystemTime::now()
            .duration_since(std::time::UNIX_EPOCH)
            .expect("time")
            .as_nanos()
    ));
    std::env::set_var("CLAUDE_CONFIG_HOME", &temp_root);
    let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));
    let sse = concat!(
        "event: message_start\n",
-        "data: {\"type\":\"message_start\",\"message\":{\"id\":\"msg_stream\",\"type\":\"message\",\"role\":\"assistant\",\"content\":[],\"model\":\"claude-3-7-sonnet-latest\",\"stop_reason\":null,\"stop_sequence\":null,\"usage\":{\"input_tokens\":8,\"output_tokens\":0}}}\n\n",
+        "data: {\"type\":\"message_start\",\"message\":{\"id\":\"msg_stream\",\"type\":\"message\",\"role\":\"assistant\",\"content\":[],\"model\":\"claude-3-7-sonnet-latest\",\"stop_reason\":null,\"stop_sequence\":null,\"usage\":{\"input_tokens\":8,\"cache_creation_input_tokens\":13,\"cache_read_input_tokens\":21,\"output_tokens\":0}}}\n\n",
        "event: content_block_start\n",
        "data: {\"type\":\"content_block_start\",\"index\":0,\"content_block\":{\"type\":\"tool_use\",\"id\":\"toolu_123\",\"name\":\"get_weather\",\"input\":{}}}\n\n",
        "event: content_block_delta\n",
@@ -88,7 +142,7 @@ async fn stream_message_parses_sse_events_with_tool_use() {
        "event: content_block_stop\n",
        "data: {\"type\":\"content_block_stop\",\"index\":0}\n\n",
        "event: message_delta\n",
-        "data: {\"type\":\"message_delta\",\"delta\":{\"stop_reason\":\"tool_use\",\"stop_sequence\":null},\"usage\":{\"input_tokens\":8,\"output_tokens\":1}}\n\n",
+        "data: {\"type\":\"message_delta\",\"delta\":{\"stop_reason\":\"tool_use\",\"stop_sequence\":null},\"usage\":{\"input_tokens\":8,\"cache_creation_input_tokens\":34,\"cache_read_input_tokens\":55,\"output_tokens\":1}}\n\n",
        "event: message_stop\n",
        "data: {\"type\":\"message_stop\"}\n\n",
        "data: [DONE]\n\n"
@@ -106,7 +160,8 @@ async fn stream_message_parses_sse_events_with_tool_use() {
    let client = AnthropicClient::new("test-key")
        .with_auth_token(Some("proxy-token".to_string()))
-        .with_base_url(server.base_url());
+        .with_base_url(server.base_url())
        .with_prompt_cache(PromptCache::new("stream-session"));
    let mut stream = client
        .stream_message(&sample_request(false))
        .await
@@ -160,6 +215,20 @@ async fn stream_message_parses_sse_events_with_tool_use() {
    let captured = state.lock().await;
    let request = captured.first().expect("server should capture request");
    assert!(request.body.contains("\"stream\":true"));
    let cache_stats = client
        .prompt_cache_stats()
        .expect("prompt cache stats should exist");
    assert_eq!(cache_stats.tracked_requests, 1);
    assert_eq!(cache_stats.last_cache_creation_input_tokens, Some(34));
    assert_eq!(cache_stats.last_cache_read_input_tokens, Some(55));
    assert_eq!(
        cache_stats.last_cache_source.as_deref(),
        Some("api-response")
    );
    std::fs::remove_dir_all(temp_root).expect("cleanup temp root");
    std::env::remove_var("CLAUDE_CONFIG_HOME");
 }
 #[tokio::test]
@@ -243,6 +312,121 @@ async fn surfaces_retry_exhaustion_for_persistent_retryable_errors() {
    }
 }
 #[tokio::test]
 #[allow(clippy::await_holding_lock)]
 async fn send_message_reuses_recent_completion_cache_entries() {
    let _guard = env_lock();
    let temp_root = std::env::temp_dir().join(format!(
        "api-prompt-cache-{}-{}",
        std::process::id(),
        std::time::SystemTime::now()
            .duration_since(std::time::UNIX_EPOCH)
            .expect("time")
            .as_nanos()
    ));
    std::env::set_var("CLAUDE_CONFIG_HOME", &temp_root);
    let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));
    let server = spawn_server(
        state.clone(),
        vec![http_response(
            "200 OK",
            "application/json",
            "{\"id\":\"msg_cached\",\"type\":\"message\",\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"Cached once\"}],\"model\":\"claude-3-7-sonnet-latest\",\"stop_reason\":\"end_turn\",\"stop_sequence\":null,\"usage\":{\"input_tokens\":3,\"cache_creation_input_tokens\":5,\"cache_read_input_tokens\":4000,\"output_tokens\":2}}",
        )],
    )
    .await;
    let client = AnthropicClient::new("test-key")
        .with_base_url(server.base_url())
        .with_prompt_cache(PromptCache::new("integration-session"));
    let first = client
        .send_message(&sample_request(false))
        .await
        .expect("first request should succeed");
    let second = client
        .send_message(&sample_request(false))
        .await
        .expect("second request should reuse cache");
    assert_eq!(first.content, second.content);
    assert_eq!(state.lock().await.len(), 1);
    let cache_stats = client
        .prompt_cache_stats()
        .expect("prompt cache stats should exist");
    assert_eq!(cache_stats.completion_cache_hits, 1);
    assert_eq!(cache_stats.completion_cache_misses, 1);
    assert_eq!(cache_stats.completion_cache_writes, 1);
    std::fs::remove_dir_all(temp_root).expect("cleanup temp root");
    std::env::remove_var("CLAUDE_CONFIG_HOME");
 }
 #[tokio::test]
 #[allow(clippy::await_holding_lock)]
 async fn send_message_tracks_unexpected_prompt_cache_breaks() {
    let _guard = env_lock();
    let temp_root = std::env::temp_dir().join(format!(
        "api-prompt-break-{}-{}",
        std::process::id(),
        std::time::SystemTime::now()
            .duration_since(std::time::UNIX_EPOCH)
            .expect("time")
            .as_nanos()
    ));
    std::env::set_var("CLAUDE_CONFIG_HOME", &temp_root);
    let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));
    let server = spawn_server(
        state,
        vec![
            http_response(
                "200 OK",
                "application/json",
                "{\"id\":\"msg_one\",\"type\":\"message\",\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"One\"}],\"model\":\"claude-3-7-sonnet-latest\",\"stop_reason\":\"end_turn\",\"stop_sequence\":null,\"usage\":{\"input_tokens\":3,\"cache_creation_input_tokens\":5,\"cache_read_input_tokens\":6000,\"output_tokens\":2}}",
            ),
            http_response(
                "200 OK",
                "application/json",
                "{\"id\":\"msg_two\",\"type\":\"message\",\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"Two\"}],\"model\":\"claude-3-7-sonnet-latest\",\"stop_reason\":\"end_turn\",\"stop_sequence\":null,\"usage\":{\"input_tokens\":3,\"cache_creation_input_tokens\":0,\"cache_read_input_tokens\":1000,\"output_tokens\":2}}",
            ),
        ],
    )
    .await;
    let request = sample_request(false);
    let client = AnthropicClient::new("test-key")
        .with_base_url(server.base_url())
        .with_prompt_cache(PromptCache::with_config(api::PromptCacheConfig {
            session_id: "break-session".to_string(),
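            // A zero TTL expires every completion entry immediately, so the
            // second request reaches the server instead of the local cache.
            completion_ttl: Duration::ZERO,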
            ..api::PromptCacheConfig::default()
        }));
    client
        .send_message(&request)
        .await
        .expect("first response should succeed");
    client
        .send_message(&request)
        .await
        .expect("second response should succeed");
    let cache_stats = client
        .prompt_cache_stats()
        .expect("prompt cache stats should exist");
    assert_eq!(cache_stats.unexpected_cache_breaks, 1);
    assert_eq!(
        cache_stats.last_break_reason.as_deref(),
        Some("cache read tokens dropped while prompt fingerprint remained stable")
    );
    std::fs::remove_dir_all(temp_root).expect("cleanup temp root");
    std::env::remove_var("CLAUDE_CONFIG_HOME");
 }
 #[tokio::test]
 #[ignore = "requires ANTHROPIC_API_KEY and network access"]
 async fn live_stream_smoke_test() {
@@ -37,6 +37,7 @@ pub struct RuntimeConfig {
 #[derive(Debug, Clone, PartialEq, Eq, Default)]
 pub struct RuntimeFeatureConfig {
    hooks: RuntimeHookConfig,
    mcp: McpConfigCollection,
    oauth: Option<OAuthConfig>,
    model: Option<String>,
@@ -44,6 +45,12 @@ pub struct RuntimeFeatureConfig {
    sandbox: SandboxConfig,
 }
 #[derive(Debug, Clone, PartialEq, Eq, Default)]
 pub struct RuntimeHookConfig {
    pre_tool_use: Vec<String>,
    post_tool_use: Vec<String>,
 }
 #[derive(Debug, Clone, PartialEq, Eq, Default)]
 pub struct McpConfigCollection {
    servers: BTreeMap<String, ScopedMcpServerConfig>,
@@ -221,6 +228,7 @@ impl ConfigLoader {
        let merged_value = JsonValue::Object(merged.clone());
        let feature_config = RuntimeFeatureConfig {
            hooks: parse_optional_hooks_config(&merged_value)?,
            mcp: McpConfigCollection {
                servers: mcp_servers,
            },
@@ -278,6 +286,11 @@ impl RuntimeConfig {
        &self.feature_config.mcp
    }
    #[must_use]
    pub fn hooks(&self) -> &RuntimeHookConfig {
        &self.feature_config.hooks
    }
    #[must_use]
    pub fn oauth(&self) -> Option<&OAuthConfig> {
        self.feature_config.oauth.as_ref()
@@ -300,6 +313,17 @@ impl RuntimeConfig {
 }
 impl RuntimeFeatureConfig {
    #[must_use]
    pub fn with_hooks(mut self, hooks: RuntimeHookConfig) -> Self {
        self.hooks = hooks;
        self
    }
    #[must_use]
    pub fn hooks(&self) -> &RuntimeHookConfig {
        &self.hooks
    }
    #[must_use]
    pub fn mcp(&self) -> &McpConfigCollection {
        &self.mcp
@@ -326,6 +350,26 @@ impl RuntimeFeatureConfig {
    }
 }
 impl RuntimeHookConfig {
    #[must_use]
    pub fn new(pre_tool_use: Vec<String>, post_tool_use: Vec<String>) -> Self {
        Self {
            pre_tool_use,
            post_tool_use,
        }
    }
    #[must_use]
    pub fn pre_tool_use(&self) -> &[String] {
        &self.pre_tool_use
    }
    #[must_use]
    pub fn post_tool_use(&self) -> &[String] {
        &self.post_tool_use
    }
 }
 impl McpConfigCollection {
    #[must_use]
    pub fn servers(&self) -> &BTreeMap<String, ScopedMcpServerConfig> {
@@ -424,6 +468,22 @@ fn parse_optional_model(root: &JsonValue) -> Option<String> {
        .map(ToOwned::to_owned)
 }
 fn parse_optional_hooks_config(root: &JsonValue) -> Result<RuntimeHookConfig, ConfigError> {
    let Some(object) = root.as_object() else {
        return Ok(RuntimeHookConfig::default());
    };
    let Some(hooks_value) = object.get("hooks") else {
        return Ok(RuntimeHookConfig::default());
    };
    let hooks = expect_object(hooks_value, "merged settings.hooks")?;
    Ok(RuntimeHookConfig {
        pre_tool_use: optional_string_array(hooks, "PreToolUse", "merged settings.hooks")?
            .unwrap_or_default(),
        post_tool_use: optional_string_array(hooks, "PostToolUse", "merged settings.hooks")?
            .unwrap_or_default(),
    })
 }
 fn parse_optional_permission_mode(
    root: &JsonValue,
 ) -> Result<Option<ResolvedPermissionMode>, ConfigError> {
@@ -836,6 +896,8 @@ mod tests {
            .and_then(JsonValue::as_object)
            .expect("hooks object")
            .contains_key("PostToolUse"));
        assert_eq!(loaded.hooks().pre_tool_use(), &["base".to_string()]);
        assert_eq!(loaded.hooks().post_tool_use(), &["project".to_string()]);
        assert!(loaded.mcp().get("home").is_some());
        assert!(loaded.mcp().get("project").is_some());
@@ -4,6 +4,8 @@ use std::fmt::{Display, Formatter};
 use crate::compact::{
    compact_session, estimate_session_tokens, CompactionConfig, CompactionResult,
 };
 use crate::config::RuntimeFeatureConfig;
 use crate::hooks::{HookRunResult, HookRunner};
 use crate::permissions::{PermissionOutcome, PermissionPolicy, PermissionPrompter};
 use crate::session::{ContentBlock, ConversationMessage, Session};
 use crate::usage::{TokenUsage, UsageTracker};
@@ -23,9 +25,19 @@ pub enum AssistantEvent {
        input: String,
    },
    Usage(TokenUsage),
    PromptCache(PromptCacheEvent),
    MessageStop,
 }
 #[derive(Debug, Clone, PartialEq, Eq)]
 pub struct PromptCacheEvent {
    pub unexpected: bool,
    pub reason: String,
    pub previous_cache_read_input_tokens: u32,
    pub current_cache_read_input_tokens: u32,
    pub token_drop: u32,
 }
 pub trait ApiClient {
    fn stream(&mut self, request: ApiRequest) -> Result<Vec<AssistantEvent>, RuntimeError>;
 }
@@ -82,6 +94,7 @@ impl std::error::Error for RuntimeError {}
 pub struct TurnSummary {
    pub assistant_messages: Vec<ConversationMessage>,
    pub tool_results: Vec<ConversationMessage>,
    pub prompt_cache_events: Vec<PromptCacheEvent>,
    pub iterations: usize,
    pub usage: TokenUsage,
 }
@@ -94,6 +107,7 @@ pub struct ConversationRuntime<C, T> {
    system_prompt: Vec<String>,
    max_iterations: usize,
    usage_tracker: UsageTracker,
    hook_runner: HookRunner,
 }
 impl<C, T> ConversationRuntime<C, T>
@@ -108,6 +122,25 @@ where
        tool_executor: T,
        permission_policy: PermissionPolicy,
        system_prompt: Vec<String>,
    ) -> Self {
        Self::new_with_features(
            session,
            api_client,
            tool_executor,
            permission_policy,
            system_prompt,
            &RuntimeFeatureConfig::default(),
        )
    }
    #[must_use]
    pub fn new_with_features(
        session: Session,
        api_client: C,
        tool_executor: T,
        permission_policy: PermissionPolicy,
        system_prompt: Vec<String>,
        feature_config: &RuntimeFeatureConfig,
    ) -> Self {
        let usage_tracker = UsageTracker::from_session(&session);
        Self {
@@ -118,6 +151,7 @@ where
            system_prompt,
            max_iterations: usize::MAX,
            usage_tracker,
            hook_runner: HookRunner::from_feature_config(feature_config),
        }
    }
@@ -138,6 +172,7 @@ where
        let mut assistant_messages = Vec::new();
        let mut tool_results = Vec::new();
        let mut prompt_cache_events = Vec::new();
        let mut iterations = 0;
        loop {
@@ -153,10 +188,12 @@ where
                messages: self.session.messages.clone(),
            };
            let events = self.api_client.stream(request)?;
-            let (assistant_message, usage) = build_assistant_message(events)?;
+            let (assistant_message, usage, turn_prompt_cache_events) =
                build_assistant_message(events)?;
            if let Some(usage) = usage {
                self.usage_tracker.record(usage);
            }
            prompt_cache_events.extend(turn_prompt_cache_events);
            let pending_tool_uses = assistant_message
                .blocks
                .iter()
@@ -185,19 +222,41 @@ where
                let result_message = match permission_outcome {
                    PermissionOutcome::Allow => {
-                        match self.tool_executor.execute(&tool_name, &input) {
+                        let pre_hook_result = self.hook_runner.run_pre_tool_use(&tool_name, &input);
-                            Ok(output) => ConversationMessage::tool_result(
+                        if pre_hook_result.is_denied() {
                            let deny_message = format!("PreToolUse hook denied tool `{tool_name}`");
                            ConversationMessage::tool_result(
                                tool_use_id,
                                tool_name,
                                format_hook_message(&pre_hook_result, &deny_message),
                                true,
                            )
                        } else {
                            let (mut output, mut is_error) =
                                match self.tool_executor.execute(&tool_name, &input) {
                                    Ok(output) => (output, false),
                                    Err(error) => (error.to_string(), true),
                                };
                            output = merge_hook_feedback(pre_hook_result.messages(), output, false);
                            let post_hook_result = self
                                .hook_runner
                                .run_post_tool_use(&tool_name, &input, &output, is_error);
                            if post_hook_result.is_denied() {
                                is_error = true;
                            }
                            output = merge_hook_feedback(
                                post_hook_result.messages(),
                                output,
                                post_hook_result.is_denied(),
                            );
                            ConversationMessage::tool_result(
                                tool_use_id,
                                tool_name,
                                output,
-                                false,
+                                is_error,
-                            ),
+                            )
-                            Err(error) => ConversationMessage::tool_result(
-                                tool_use_id,
-                                tool_name,
-                                error.to_string(),
-                                true,
-                            ),
                        }
                    }
                    PermissionOutcome::Deny { reason } => {
@@ -212,6 +271,7 @@ where
        Ok(TurnSummary {
            assistant_messages,
            tool_results,
            prompt_cache_events,
            iterations,
            usage: self.usage_tracker.cumulative_usage(),
        })
@@ -245,9 +305,17 @@ where
 fn build_assistant_message(
    events: Vec<AssistantEvent>,
-) -> Result<(ConversationMessage, Option<TokenUsage>), RuntimeError> {
+) -> Result<
    (
        ConversationMessage,
        Option<TokenUsage>,
        Vec<PromptCacheEvent>,
    ),
    RuntimeError,
 > {
    let mut text = String::new();
    let mut blocks = Vec::new();
    let mut prompt_cache_events = Vec::new();
    let mut finished = false;
    let mut usage = None;
@@ -259,6 +327,7 @@ fn build_assistant_message(
                blocks.push(ContentBlock::ToolUse { id, name, input });
            }
            AssistantEvent::Usage(value) => usage = Some(value),
            AssistantEvent::PromptCache(event) => prompt_cache_events.push(event),
            AssistantEvent::MessageStop => {
                finished = true;
            }
@@ -279,6 +348,7 @@ fn build_assistant_message(
    Ok((
        ConversationMessage::assistant_with_usage(blocks, usage),
        usage,
        prompt_cache_events,
    ))
 }
@@ -290,6 +360,32 @@ fn flush_text_block(text: &mut String, blocks: &mut Vec<ContentBlock>) {
    }
 }
 fn format_hook_message(result: &HookRunResult, fallback: &str) -> String {
    if result.messages().is_empty() {
        fallback.to_string()
    } else {
        result.messages().join("\n")
    }
 }
 fn merge_hook_feedback(messages: &[String], output: String, denied: bool) -> String {
    if messages.is_empty() {
        return output;
    }
    let mut sections = Vec::new();
    if !output.trim().is_empty() {
        sections.push(output);
    }
    let label = if denied {
        "Hook feedback (denied)"
    } else {
        "Hook feedback"
    };
    sections.push(format!("{label}:\n{}", messages.join("\n")));
    sections.join("\n\n")
 }
 type ToolHandler = Box<dyn FnMut(&str) -> Result<String, ToolError>>;
 #[derive(Default)]
@@ -325,10 +421,11 @@ impl ToolExecutor for StaticToolExecutor {
 #[cfg(test)]
 mod tests {
    use super::{
-        ApiClient, ApiRequest, AssistantEvent, ConversationRuntime, RuntimeError,
+        ApiClient, ApiRequest, AssistantEvent, ConversationRuntime, PromptCacheEvent, RuntimeError,
        StaticToolExecutor,
    };
    use crate::compact::CompactionConfig;
    use crate::config::{RuntimeFeatureConfig, RuntimeHookConfig};
    use crate::permissions::{
        PermissionMode, PermissionPolicy, PermissionPromptDecision, PermissionPrompter,
        PermissionRequest,
@@ -381,6 +478,15 @@ mod tests {
                            cache_creation_input_tokens: 1,
                            cache_read_input_tokens: 3,
                        }),
                        AssistantEvent::PromptCache(PromptCacheEvent {
                            unexpected: true,
                            reason:
                                "cache read tokens dropped while prompt fingerprint remained stable"
                                    .to_string(),
                            previous_cache_read_input_tokens: 6_000,
                            current_cache_read_input_tokens: 1_000,
                            token_drop: 5_000,
                        }),
                        AssistantEvent::MessageStop,
                    ])
                }
@@ -434,8 +540,10 @@ mod tests {
        assert_eq!(summary.iterations, 2);
        assert_eq!(summary.assistant_messages.len(), 2);
        assert_eq!(summary.tool_results.len(), 1);
        assert_eq!(summary.prompt_cache_events.len(), 1);
        assert_eq!(runtime.session().messages.len(), 4);
        assert_eq!(summary.usage.output_tokens, 10);
        assert!(summary.prompt_cache_events[0].unexpected);
        assert!(matches!(
            runtime.session().messages[1].blocks[1],
            ContentBlock::ToolUse { .. }
@@ -503,6 +611,141 @@ mod tests {
        ));
    }
    #[test]
    fn denies_tool_use_when_pre_tool_hook_blocks() {
        struct SingleCallApiClient;
        impl ApiClient for SingleCallApiClient {
            fn stream(&mut self, request: ApiRequest) -> Result<Vec<AssistantEvent>, RuntimeError> {
                if request
                    .messages
                    .iter()
                    .any(|message| message.role == MessageRole::Tool)
                {
                    return Ok(vec![
                        AssistantEvent::TextDelta("blocked".to_string()),
                        AssistantEvent::MessageStop,
                    ]);
                }
                Ok(vec![
                    AssistantEvent::ToolUse {
                        id: "tool-1".to_string(),
                        name: "blocked".to_string(),
                        input: r#"{"path":"secret.txt"}"#.to_string(),
                    },
                    AssistantEvent::MessageStop,
                ])
            }
        }
        let mut runtime = ConversationRuntime::new_with_features(
            Session::new(),
            SingleCallApiClient,
            StaticToolExecutor::new().register("blocked", |_input| {
                panic!("tool should not execute when hook denies")
            }),
            PermissionPolicy::new(PermissionMode::DangerFullAccess),
            vec!["system".to_string()],
            &RuntimeFeatureConfig::default().with_hooks(RuntimeHookConfig::new(
                vec![shell_snippet("printf 'blocked by hook'; exit 2")],
                Vec::new(),
            )),
        );
        let summary = runtime
            .run_turn("use the tool", None)
            .expect("conversation should continue after hook denial");
        assert_eq!(summary.tool_results.len(), 1);
        let ContentBlock::ToolResult {
            is_error, output, ..
        } = &summary.tool_results[0].blocks[0]
        else {
            panic!("expected tool result block");
        };
        assert!(
            *is_error,
            "hook denial should produce an error result: {output}"
        );
        assert!(
            output.contains("denied tool") || output.contains("blocked by hook"),
            "unexpected hook denial output: {output:?}"
        );
    }
    #[test]
    fn appends_post_tool_hook_feedback_to_tool_result() {
        struct TwoCallApiClient {
            calls: usize,
        }
        impl ApiClient for TwoCallApiClient {
            fn stream(&mut self, request: ApiRequest) -> Result<Vec<AssistantEvent>, RuntimeError> {
                self.calls += 1;
                match self.calls {
                    1 => Ok(vec![
                        AssistantEvent::ToolUse {
                            id: "tool-1".to_string(),
                            name: "add".to_string(),
                            input: r#"{"lhs":2,"rhs":2}"#.to_string(),
                        },
                        AssistantEvent::MessageStop,
                    ]),
                    2 => {
                        assert!(request
                            .messages
                            .iter()
                            .any(|message| message.role == MessageRole::Tool));
                        Ok(vec![
                            AssistantEvent::TextDelta("done".to_string()),
                            AssistantEvent::MessageStop,
                        ])
                    }
                    _ => Err(RuntimeError::new("unexpected extra API call")),
                }
            }
        }
        let mut runtime = ConversationRuntime::new_with_features(
            Session::new(),
            TwoCallApiClient { calls: 0 },
            StaticToolExecutor::new().register("add", |_input| Ok("4".to_string())),
            PermissionPolicy::new(PermissionMode::DangerFullAccess),
            vec!["system".to_string()],
            &RuntimeFeatureConfig::default().with_hooks(RuntimeHookConfig::new(
                vec![shell_snippet("printf 'pre hook ran'")],
                vec![shell_snippet("printf 'post hook ran'")],
            )),
        );
        let summary = runtime
            .run_turn("use add", None)
            .expect("tool loop succeeds");
        assert_eq!(summary.tool_results.len(), 1);
        let ContentBlock::ToolResult {
            is_error, output, ..
        } = &summary.tool_results[0].blocks[0]
        else {
            panic!("expected tool result block");
        };
        assert!(
            !*is_error,
            "post hook should preserve non-error result: {output:?}"
        );
        assert!(
            output.contains('4'),
            "tool output missing value: {output:?}"
        );
        assert!(
            output.contains("pre hook ran"),
            "tool output missing pre hook feedback: {output:?}"
        );
        assert!(
            output.contains("post hook ran"),
            "tool output missing post hook feedback: {output:?}"
        );
    }
    #[test]
    fn reconstructs_usage_tracker_from_restored_session() {
        struct SimpleApi;
@@ -581,4 +824,14 @@ mod tests {
            MessageRole::System
        );
    }
    #[cfg(windows)]
    fn shell_snippet(script: &str) -> String {
        script.replace('\'', "\"")
    }
    #[cfg(not(windows))]
    fn shell_snippet(script: &str) -> String {
        script.to_string()
    }
 }
@@ -0,0 +1,347 @@
 use std::ffi::OsStr;
 use std::process::Command;
 use serde_json::json;
 use crate::config::{RuntimeFeatureConfig, RuntimeHookConfig};
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 pub enum HookEvent {
    PreToolUse,
    PostToolUse,
 }
 impl HookEvent {
    fn as_str(self) -> &'static str {
        match self {
            Self::PreToolUse => "PreToolUse",
            Self::PostToolUse => "PostToolUse",
        }
    }
 }
 #[derive(Debug, Clone, PartialEq, Eq)]
 pub struct HookRunResult {
    denied: bool,
    messages: Vec<String>,
 }
 impl HookRunResult {
    #[must_use]
    pub fn allow(messages: Vec<String>) -> Self {
        Self {
            denied: false,
            messages,
        }
    }
    #[must_use]
    pub fn is_denied(&self) -> bool {
        self.denied
    }
    #[must_use]
    pub fn messages(&self) -> &[String] {
        &self.messages
    }
 }
 #[derive(Debug, Clone, PartialEq, Eq, Default)]
 pub struct HookRunner {
    config: RuntimeHookConfig,
 }
 impl HookRunner {
    #[must_use]
    pub fn new(config: RuntimeHookConfig) -> Self {
        Self { config }
    }
    #[must_use]
    pub fn from_feature_config(feature_config: &RuntimeFeatureConfig) -> Self {
        Self::new(feature_config.hooks().clone())
    }
    #[must_use]
    pub fn run_pre_tool_use(&self, tool_name: &str, tool_input: &str) -> HookRunResult {
        Self::run_commands(
            HookEvent::PreToolUse,
            self.config.pre_tool_use(),
            tool_name,
            tool_input,
            None,
            false,
        )
    }
    #[must_use]
    pub fn run_post_tool_use(
        &self,
        tool_name: &str,
        tool_input: &str,
        tool_output: &str,
        is_error: bool,
    ) -> HookRunResult {
        Self::run_commands(
            HookEvent::PostToolUse,
            self.config.post_tool_use(),
            tool_name,
            tool_input,
            Some(tool_output),
            is_error,
        )
    }
    fn run_commands(
        event: HookEvent,
        commands: &[String],
        tool_name: &str,
        tool_input: &str,
        tool_output: Option<&str>,
        is_error: bool,
    ) -> HookRunResult {
        if commands.is_empty() {
            return HookRunResult::allow(Vec::new());
        }
        let payload = json!({
            "hook_event_name": event.as_str(),
            "tool_name": tool_name,
            "tool_input": parse_tool_input(tool_input),
            "tool_input_json": tool_input,
            "tool_output": tool_output,
            "tool_result_is_error": is_error,
        })
        .to_string();
        let mut messages = Vec::new();
        for command in commands {
            match Self::run_command(
                command,
                event,
                tool_name,
                tool_input,
                tool_output,
                is_error,
                &payload,
            ) {
                HookCommandOutcome::Allow { message } => {
                    if let Some(message) = message {
                        messages.push(message);
                    }
                }
                HookCommandOutcome::Deny { message } => {
                    let message = message.unwrap_or_else(|| {
                        format!("{} hook denied tool `{tool_name}`", event.as_str())
                    });
                    messages.push(message);
                    return HookRunResult {
                        denied: true,
                        messages,
                    };
                }
                HookCommandOutcome::Warn { message } => messages.push(message),
            }
        }
        HookRunResult::allow(messages)
    }
    fn run_command(
        command: &str,
        event: HookEvent,
        tool_name: &str,
        tool_input: &str,
        tool_output: Option<&str>,
        is_error: bool,
        payload: &str,
    ) -> HookCommandOutcome {
        let mut child = shell_command(command);
        child.stdin(std::process::Stdio::piped());
        child.stdout(std::process::Stdio::piped());
        child.stderr(std::process::Stdio::piped());
        child.env("HOOK_EVENT", event.as_str());
        child.env("HOOK_TOOL_NAME", tool_name);
        child.env("HOOK_TOOL_INPUT", tool_input);
        child.env("HOOK_TOOL_IS_ERROR", if is_error { "1" } else { "0" });
        if let Some(tool_output) = tool_output {
            child.env("HOOK_TOOL_OUTPUT", tool_output);
        }
        match child.output_with_stdin(payload.as_bytes()) {
            Ok(output) => {
                let stdout = String::from_utf8_lossy(&output.stdout).trim().to_string();
                let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string();
                let message = (!stdout.is_empty()).then_some(stdout);
                // Exit code 0 allows, 2 denies; any other status is a non-blocking warning.
                match output.status.code() {
                    Some(0) => HookCommandOutcome::Allow { message },
                    Some(2) => HookCommandOutcome::Deny { message },
                    Some(code) => HookCommandOutcome::Warn {
                        message: format_hook_warning(
                            command,
                            code,
                            message.as_deref(),
                            stderr.as_str(),
                        ),
                    },
                    None => HookCommandOutcome::Warn {
                        message: format!(
                            "{} hook `{command}` terminated by signal while handling `{tool_name}`",
                            event.as_str()
                        ),
                    },
                }
            }
            Err(error) => HookCommandOutcome::Warn {
                message: format!(
                    "{} hook `{command}` failed to start for `{tool_name}`: {error}",
                    event.as_str()
                ),
            },
        }
    }
 }
 enum HookCommandOutcome {
    Allow { message: Option<String> },
    Deny { message: Option<String> },
    Warn { message: String },
 }
 fn parse_tool_input(tool_input: &str) -> serde_json::Value {
    serde_json::from_str(tool_input).unwrap_or_else(|_| json!({ "raw": tool_input }))
 }
 fn format_hook_warning(command: &str, code: i32, stdout: Option<&str>, stderr: &str) -> String {
    let mut message =
        format!("Hook `{command}` exited with status {code}; allowing tool execution to continue");
    if let Some(stdout) = stdout.filter(|stdout| !stdout.is_empty()) {
        message.push_str(": ");
        message.push_str(stdout);
    } else if !stderr.is_empty() {
        message.push_str(": ");
        message.push_str(stderr);
    }
    message
 }
 fn shell_command(command: &str) -> CommandWithStdin {
    #[cfg(windows)]
    let command_builder = {
        let mut command_builder = Command::new("cmd");
        command_builder.arg("/C").arg(command);
        CommandWithStdin::new(command_builder)
    };
    #[cfg(not(windows))]
    let command_builder = {
        let mut command_builder = Command::new("sh");
        command_builder.arg("-lc").arg(command);
        CommandWithStdin::new(command_builder)
    };
    command_builder
 }
 struct CommandWithStdin {
    command: Command,
 }
 impl CommandWithStdin {
    fn new(command: Command) -> Self {
        Self { command }
    }
    fn stdin(&mut self, cfg: std::process::Stdio) -> &mut Self {
        self.command.stdin(cfg);
        self
    }
    fn stdout(&mut self, cfg: std::process::Stdio) -> &mut Self {
        self.command.stdout(cfg);
        self
    }
    fn stderr(&mut self, cfg: std::process::Stdio) -> &mut Self {
        self.command.stderr(cfg);
        self
    }
    fn env<K, V>(&mut self, key: K, value: V) -> &mut Self
    where
        K: AsRef<OsStr>,
        V: AsRef<OsStr>,
    {
        self.command.env(key, value);
        self
    }
    fn output_with_stdin(&mut self, stdin: &[u8]) -> std::io::Result<std::process::Output> {
        let mut child = self.command.spawn()?;
        if let Some(mut child_stdin) = child.stdin.take() {
            use std::io::Write;
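            // A hook may exit without reading stdin; a closed pipe is not a failure.
            match child_stdin.write_all(stdin) {
                Err(error) if error.kind() != std::io::ErrorKind::BrokenPipe => {
                    return Err(error);
                }
                _ => {}
            }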
        }
        child.wait_with_output()
    }
 }
 #[cfg(test)]
 mod tests {
    use super::{HookRunResult, HookRunner};
    use crate::config::{RuntimeFeatureConfig, RuntimeHookConfig};
    #[test]
    fn allows_exit_code_zero_and_captures_stdout() {
        let runner = HookRunner::new(RuntimeHookConfig::new(
            vec![shell_snippet("printf 'pre ok'")],
            Vec::new(),
        ));
        let result = runner.run_pre_tool_use("Read", r#"{"path":"README.md"}"#);
        assert_eq!(result, HookRunResult::allow(vec!["pre ok".to_string()]));
    }
    #[test]
    fn denies_exit_code_two() {
        let runner = HookRunner::new(RuntimeHookConfig::new(
            vec![shell_snippet("printf 'blocked by hook'; exit 2")],
            Vec::new(),
        ));
        let result = runner.run_pre_tool_use("Bash", r#"{"command":"pwd"}"#);
        assert!(result.is_denied());
        assert_eq!(result.messages(), &["blocked by hook".to_string()]);
    }
    #[test]
    fn warns_for_other_non_zero_statuses() {
        let runner = HookRunner::from_feature_config(&RuntimeFeatureConfig::default().with_hooks(
            RuntimeHookConfig::new(
                vec![shell_snippet("printf 'warning hook'; exit 1")],
                Vec::new(),
            ),
        ));
        let result = runner.run_pre_tool_use("Edit", r#"{"file":"src/lib.rs"}"#);
        assert!(!result.is_denied());
        assert!(result
            .messages()
            .iter()
            .any(|message| message.contains("allowing tool execution to continue")));
    }
    #[cfg(windows)]
    fn shell_snippet(script: &str) -> String {
        script.replace('\'', "\"")
    }
    #[cfg(not(windows))]
    fn shell_snippet(script: &str) -> String {
        script.to_string()
    }
 }
@@ -4,6 +4,7 @@ mod compact;
 mod config;
 mod conversation;
 mod file_ops;
 mod hooks;
 mod json;
 mod mcp;
 mod mcp_client;
@@ -26,18 +27,19 @@ pub use config::{
    ConfigEntry, ConfigError, ConfigLoader, ConfigSource, McpClaudeAiProxyServerConfig,
    McpConfigCollection, McpOAuthConfig, McpRemoteServerConfig, McpSdkServerConfig,
    McpServerConfig, McpStdioServerConfig, McpTransport, McpWebSocketServerConfig, OAuthConfig,
-    ResolvedPermissionMode, RuntimeConfig, RuntimeFeatureConfig, ScopedMcpServerConfig,
+    ResolvedPermissionMode, RuntimeConfig, RuntimeFeatureConfig, RuntimeHookConfig,
-    CLAUDE_CODE_SETTINGS_SCHEMA_NAME,
+    ScopedMcpServerConfig, CLAUDE_CODE_SETTINGS_SCHEMA_NAME,
 };
 pub use conversation::{
-    ApiClient, ApiRequest, AssistantEvent, ConversationRuntime, RuntimeError, StaticToolExecutor,
+    ApiClient, ApiRequest, AssistantEvent, ConversationRuntime, PromptCacheEvent, RuntimeError,
-    ToolError, ToolExecutor, TurnSummary,
+    StaticToolExecutor, ToolError, ToolExecutor, TurnSummary,
 };
 pub use file_ops::{
    edit_file, glob_search, grep_search, read_file, write_file, EditFileOutput, GlobSearchOutput,
    GrepSearchInput, GrepSearchOutput, ReadFileOutput, StructuredPatchHunk, TextFilePayload,
    WriteFileOutput,
 };
 pub use hooks::{HookEvent, HookRunResult, HookRunner};
 pub use mcp::{
    mcp_server_signature, mcp_tool_name, mcp_tool_prefix, normalize_name_for_mcp,
    scoped_mcp_config_hash, unwrap_ccr_proxy_url,
@@ -13,8 +13,9 @@ use std::time::{SystemTime, UNIX_EPOCH};
 use api::{
    resolve_startup_auth_source, AnthropicClient, AuthSource, ContentBlockDelta, InputContentBlock,
-    InputMessage, MessageRequest, MessageResponse, OutputContentBlock,
+    InputMessage, MessageRequest, MessageResponse, OutputContentBlock, PromptCache,
-    StreamEvent as ApiStreamEvent, ToolChoice, ToolDefinition, ToolResultContentBlock,
+    PromptCacheRecord, StreamEvent as ApiStreamEvent, ToolChoice, ToolDefinition,
    ToolResultContentBlock,
 };
 use commands::{
@@ -27,9 +28,9 @@ use runtime::{
    clear_oauth_credentials, generate_pkce_pair, generate_state, load_system_prompt,
    parse_oauth_callback_request_target, save_oauth_credentials, ApiClient, ApiRequest,
    AssistantEvent, CompactionConfig, ConfigLoader, ConfigSource, ContentBlock,
-    ConversationMessage, ConversationRuntime, MessageRole, OAuthAuthorizationRequest,
+    ConversationMessage, ConversationRuntime, MessageRole, OAuthAuthorizationRequest, OAuthConfig,
-    OAuthTokenExchangeRequest, PermissionMode, PermissionPolicy, ProjectContext, RuntimeError,
+    OAuthTokenExchangeRequest, PermissionMode, PermissionPolicy, ProjectContext, PromptCacheEvent,
-    Session, TokenUsage, ToolError, ToolExecutor, UsageTracker,
+    RuntimeError, Session, TokenUsage, ToolError, ToolExecutor, UsageTracker,
 };
 use serde_json::json;
 use tools::{execute_tool, mvp_tool_specs, ToolSpec};
@@ -196,6 +197,25 @@ fn parse_args(args: &[String]) -> Result<CliAction, String> {
                permission_mode = PermissionMode::DangerFullAccess;
                index += 1;
            }
            "-p" => {
                // Claude Code compat: every argument after -p is joined into one one-shot prompt
                let prompt = args[index + 1..].join(" ");
                if prompt.trim().is_empty() {
                    return Err("-p requires a prompt string".to_string());
                }
                return Ok(CliAction::Prompt {
                    prompt,
                    model: resolve_model_alias(&model).to_string(),
                    output_format,
                    allowed_tools: normalize_allowed_tools(&allowed_tool_values)?,
                    permission_mode,
                });
            }
            "--print" => {
                // Claude Code compat: --print selects plain-text output for non-interactive use
                output_format = CliOutputFormat::Text;
                index += 1;
            }
            "--allowedTools" | "--allowed-tools" => {
                let value = args
                    .get(index + 1)
@@ -428,15 +448,26 @@ fn print_bootstrap_plan() {
    }
 }
 fn default_oauth_config() -> OAuthConfig {
    OAuthConfig {
        client_id: String::from("9d1c250a-e61b-44d9-88ed-5944d1962f5e"),
        authorize_url: String::from("https://platform.claude.com/oauth/authorize"),
        token_url: String::from("https://platform.claude.com/v1/oauth/token"),
        callback_port: None,
        manual_redirect_url: None,
        scopes: vec![
            String::from("user:profile"),
            String::from("user:inference"),
            String::from("user:sessions:claude_code"),
        ],
    }
 }
 fn run_login() -> Result<(), Box<dyn std::error::Error>> {
    let cwd = env::current_dir()?;
    let config = ConfigLoader::default_for(&cwd).load()?;
-    let oauth = config.oauth().ok_or_else(|| {
-        io::Error::new(
-            io::ErrorKind::NotFound,
-            "OAuth config is missing. Add settings.oauth.clientId/authorizeUrl/tokenUrl first.",
-        )
-    })?;
+    let default_oauth = default_oauth_config();
+    let oauth = config.oauth().unwrap_or(&default_oauth);
    let callback_port = oauth.callback_port.unwrap_or(DEFAULT_OAUTH_CALLBACK_PORT);
    let redirect_uri = runtime::loopback_redirect_uri(callback_port);
    let pkce = generate_pkce_pair()?;
@@ -965,6 +996,7 @@ impl LiveCli {
        let session = create_managed_session_handle()?;
        let runtime = build_runtime(
            Session::new(),
            session.id.clone(),
            model.clone(),
            system_prompt.clone(),
            enable_tools,
@@ -1020,13 +1052,14 @@ impl LiveCli {
        let mut permission_prompter = CliPermissionPrompter::new(self.permission_mode);
        let result = self.runtime.run_turn(input, Some(&mut permission_prompter));
        match result {
-            Ok(_) => {
+            Ok(summary) => {
                spinner.finish(
                    "✨ Done",
                    TerminalRenderer::new().color_theme(),
                    &mut stdout,
                )?;
                println!();
                print_prompt_cache_events(&summary);
                self.persist_session()?;
                Ok(())
            }
@@ -1056,6 +1089,7 @@ impl LiveCli {
        let session = self.runtime.session().clone();
        let mut runtime = build_runtime(
            session,
            self.session.id.clone(),
            self.model.clone(),
            self.system_prompt.clone(),
            true,
@@ -1075,6 +1109,7 @@ impl LiveCli {
                "iterations": summary.iterations,
                "tool_uses": collect_tool_uses(&summary),
                "tool_results": collect_tool_results(&summary),
                "prompt_cache_events": collect_prompt_cache_events(&summary),
                "usage": {
                    "input_tokens": summary.usage.input_tokens,
                    "output_tokens": summary.usage.output_tokens,
@@ -1202,6 +1237,7 @@ impl LiveCli {
        let message_count = session.messages.len();
        self.runtime = build_runtime(
            session,
            self.session.id.clone(),
            model.clone(),
            self.system_prompt.clone(),
            true,
@@ -1245,6 +1281,7 @@ impl LiveCli {
        self.permission_mode = permission_mode_from_label(normalized);
        self.runtime = build_runtime(
            session,
            self.session.id.clone(),
            self.model.clone(),
            self.system_prompt.clone(),
            true,
@@ -1270,6 +1307,7 @@ impl LiveCli {
        self.session = create_managed_session_handle()?;
        self.runtime = build_runtime(
            Session::new(),
            self.session.id.clone(),
            self.model.clone(),
            self.system_prompt.clone(),
            true,
@@ -1305,6 +1343,7 @@ impl LiveCli {
        let message_count = session.messages.len();
        self.runtime = build_runtime(
            session,
            handle.id.clone(),
            self.model.clone(),
            self.system_prompt.clone(),
            true,
@@ -1377,6 +1416,7 @@ impl LiveCli {
                let message_count = session.messages.len();
                self.runtime = build_runtime(
                    session,
                    handle.id.clone(),
                    self.model.clone(),
                    self.system_prompt.clone(),
                    true,
@@ -1407,6 +1447,7 @@ impl LiveCli {
        let skipped = removed == 0;
        self.runtime = build_runtime(
            result.compacted_session,
            self.session.id.clone(),
            self.model.clone(),
            self.system_prompt.clone(),
            true,
@@ -1873,8 +1914,19 @@ fn build_system_prompt() -> Result<Vec<String>, Box<dyn std::error::Error>> {
    )?)
 }
 fn build_runtime_feature_config(
 ) -> Result<runtime::RuntimeFeatureConfig, Box<dyn std::error::Error>> {
    let cwd = env::current_dir()?;
    Ok(ConfigLoader::default_for(cwd)
        .load()?
        .feature_config()
        .clone())
 }
 #[allow(clippy::too_many_arguments)]
 fn build_runtime(
    session: Session,
    session_id: String,
    model: String,
    system_prompt: Vec<String>,
    enable_tools: bool,
@@ -1883,12 +1935,19 @@ fn build_runtime(
    permission_mode: PermissionMode,
 ) -> Result<ConversationRuntime<AnthropicRuntimeClient, CliToolExecutor>, Box<dyn std::error::Error>>
 {
-    Ok(ConversationRuntime::new(
+    Ok(ConversationRuntime::new_with_features(
        session,
-        AnthropicRuntimeClient::new(model, enable_tools, emit_output, allowed_tools.clone())?,
+        AnthropicRuntimeClient::new(
            model,
            enable_tools,
            emit_output,
            allowed_tools.clone(),
            session_id,
        )?,
        CliToolExecutor::new(allowed_tools, emit_output),
        permission_policy(permission_mode),
        system_prompt,
        &build_runtime_feature_config()?,
    ))
 }
@@ -1953,11 +2012,13 @@ impl AnthropicRuntimeClient {
        enable_tools: bool,
        emit_output: bool,
        allowed_tools: Option<AllowedToolSet>,
        session_id: impl Into<String>,
    ) -> Result<Self, Box<dyn std::error::Error>> {
        Ok(Self {
            runtime: tokio::runtime::Runtime::new()?,
            client: AnthropicClient::from_auth(resolve_cli_auth_source()?)
-                .with_base_url(api::read_base_url()),
+                .with_base_url(api::read_base_url())
                .with_prompt_cache(PromptCache::new(session_id)),
            model,
            enable_tools,
            emit_output,
@@ -2072,8 +2133,8 @@ impl ApiClient for AnthropicRuntimeClient {
                        events.push(AssistantEvent::Usage(TokenUsage {
                            input_tokens: delta.usage.input_tokens,
                            output_tokens: delta.usage.output_tokens,
-                            cache_creation_input_tokens: 0,
+                            cache_creation_input_tokens: delta.usage.cache_creation_input_tokens,
-                            cache_read_input_tokens: 0,
+                            cache_read_input_tokens: delta.usage.cache_read_input_tokens,
                        }));
                    }
                    ApiStreamEvent::MessageStop(_) => {
@@ -2088,6 +2149,8 @@ impl ApiClient for AnthropicRuntimeClient {
                }
            }
            push_prompt_cache_record(&self.client, &mut events);
            if !saw_stop
                && events.iter().any(|event| {
                    matches!(event, AssistantEvent::TextDelta(text) if !text.is_empty())
@@ -2112,7 +2175,9 @@ impl ApiClient for AnthropicRuntimeClient {
                })
                .await
                .map_err(|error| RuntimeError::new(error.to_string()))?;
-            response_to_events(response, out)
+            let mut events = response_to_events(response, out)?;
            push_prompt_cache_record(&self.client, &mut events);
            Ok(events)
        })
    }
 }
@@ -2173,6 +2238,39 @@ fn collect_tool_results(summary: &runtime::TurnSummary) -> Vec<serde_json::Value
        .collect()
 }
 fn collect_prompt_cache_events(summary: &runtime::TurnSummary) -> Vec<serde_json::Value> {
    summary
        .prompt_cache_events
        .iter()
        .map(|event| {
            json!({
                "unexpected": event.unexpected,
                "reason": event.reason,
                "previous_cache_read_input_tokens": event.previous_cache_read_input_tokens,
                "current_cache_read_input_tokens": event.current_cache_read_input_tokens,
                "token_drop": event.token_drop,
            })
        })
        .collect()
 }
 fn print_prompt_cache_events(summary: &runtime::TurnSummary) {
    for event in &summary.prompt_cache_events {
        let label = if event.unexpected {
            "Prompt cache break"
        } else {
            "Prompt cache invalidation"
        };
        println!(
            "{label}: {} (cache read {} -> {}, drop {})",
            event.reason,
            event.previous_cache_read_input_tokens,
            event.current_cache_read_input_tokens,
            event.token_drop,
        );
    }
 }
 fn slash_command_completion_candidates() -> Vec<String> {
    slash_command_specs()
        .iter()
@@ -2319,18 +2417,20 @@ fn first_visible_line(text: &str) -> &str {
 }
 fn format_bash_result(icon: &str, parsed: &serde_json::Value) -> String {
    use std::fmt::Write as _;
    let mut lines = vec![format!("{icon} \x1b[38;5;245mbash\x1b[0m")];
    if let Some(task_id) = parsed
        .get("backgroundTaskId")
        .and_then(|value| value.as_str())
    {
-        lines[0].push_str(&format!(" backgrounded ({task_id})"));
+        let _ = write!(lines[0], " backgrounded ({task_id})");
    } else if let Some(status) = parsed
        .get("returnCodeInterpretation")
        .and_then(|value| value.as_str())
        .filter(|status| !status.is_empty())
    {
-        lines[0].push_str(&format!(" {status}"));
+        let _ = write!(lines[0], " {status}");
    }
    if let Some(stdout) = parsed.get("stdout").and_then(|value| value.as_str()) {
@@ -2352,15 +2452,15 @@ fn format_read_result(icon: &str, parsed: &serde_json::Value) -> String {
    let path = extract_tool_path(file);
    let start_line = file
        .get("startLine")
-        .and_then(|value| value.as_u64())
+        .and_then(serde_json::Value::as_u64)
        .unwrap_or(1);
    let num_lines = file
        .get("numLines")
-        .and_then(|value| value.as_u64())
+        .and_then(serde_json::Value::as_u64)
        .unwrap_or(0);
    let total_lines = file
        .get("totalLines")
-        .and_then(|value| value.as_u64())
+        .and_then(serde_json::Value::as_u64)
        .unwrap_or(num_lines);
    let content = file
        .get("content")
@@ -2386,8 +2486,7 @@ fn format_write_result(icon: &str, parsed: &serde_json::Value) -> String {
    let line_count = parsed
        .get("content")
        .and_then(|value| value.as_str())
-        .map(|content| content.lines().count())
-        .unwrap_or(0);
+        .map_or(0, |content| content.lines().count());
    format!(
        "{icon} \x1b[1;32m✏️ {} {path}\x1b[0m \x1b[2m({line_count} lines)\x1b[0m",
        if kind == "create" { "Wrote" } else { "Updated" },
@@ -2418,7 +2517,7 @@ fn format_edit_result(icon: &str, parsed: &serde_json::Value) -> String {
    let path = extract_tool_path(parsed);
    let suffix = if parsed
        .get("replaceAll")
-        .and_then(|value| value.as_bool())
+        .and_then(serde_json::Value::as_bool)
        .unwrap_or(false)
    {
        " (replace all)"
@@ -2446,7 +2545,7 @@ fn format_edit_result(icon: &str, parsed: &serde_json::Value) -> String {
 fn format_glob_result(icon: &str, parsed: &serde_json::Value) -> String {
    let num_files = parsed
        .get("numFiles")
-        .and_then(|value| value.as_u64())
+        .and_then(serde_json::Value::as_u64)
        .unwrap_or(0);
    let filenames = parsed
        .get("filenames")
@@ -2470,11 +2569,11 @@ fn format_glob_result(icon: &str, parsed: &serde_json::Value) -> String {
 fn format_grep_result(icon: &str, parsed: &serde_json::Value) -> String {
    let num_matches = parsed
        .get("numMatches")
-        .and_then(|value| value.as_u64())
+        .and_then(serde_json::Value::as_u64)
        .unwrap_or(0);
    let num_files = parsed
        .get("numFiles")
-        .and_then(|value| value.as_u64())
+        .and_then(serde_json::Value::as_u64)
        .unwrap_or(0);
    let content = parsed
        .get("content")
@@ -2581,6 +2680,26 @@ fn response_to_events(
    Ok(events)
 }
 fn push_prompt_cache_record(client: &AnthropicClient, events: &mut Vec<AssistantEvent>) {
    if let Some(event) = client
        .take_last_prompt_cache_record()
        .and_then(prompt_cache_record_to_runtime_event)
    {
        events.push(AssistantEvent::PromptCache(event));
    }
 }
 fn prompt_cache_record_to_runtime_event(record: PromptCacheRecord) -> Option<PromptCacheEvent> {
    let cache_break = record.cache_break?;
    Some(PromptCacheEvent {
        unexpected: cache_break.unexpected,
        reason: cache_break.reason,
        previous_cache_read_input_tokens: cache_break.previous_cache_read_input_tokens,
        current_cache_read_input_tokens: cache_break.current_cache_read_input_tokens,
        token_drop: cache_break.token_drop,
    })
 }
 struct CliToolExecutor {
    renderer: TerminalRenderer,
    emit_output: bool,
@@ -286,7 +286,7 @@ impl TerminalRenderer {
    ) {
        match event {
            Event::Start(Tag::Heading { level, .. }) => {
-                self.start_heading(state, level as u8, output)
+                Self::start_heading(state, level as u8, output);
            }
            Event::End(TagEnd::Paragraph) => output.push_str("\n\n"),
            Event::Start(Tag::BlockQuote(..)) => self.start_quote(state, output),
@@ -426,7 +426,7 @@ impl TerminalRenderer {
        }
    }
-    fn start_heading(&self, state: &mut RenderState, level: u8, output: &mut String) {
+    fn start_heading(state: &mut RenderState, level: u8, output: &mut String) {
        state.heading_level = Some(level);
        if !output.is_empty() {
            output.push('\n');
@@ -6,10 +6,12 @@ license.workspace = true
 publish.workspace = true
 [dependencies]
 api = { path = "../api" }
 runtime = { path = "../runtime" }
 reqwest = { version = "0.12", default-features = false, features = ["blocking", "rustls-tls"] }
 serde = { version = "1", features = ["derive"] }
 serde_json = "1"
 tokio = { version = "1", features = ["rt-multi-thread"] }
 [lints]
 workspace = true
@@ -3,10 +3,17 @@ use std::path::{Path, PathBuf};
 use std::process::Command;
 use std::time::{Duration, Instant};
 use api::{
    read_base_url, AnthropicClient, ContentBlockDelta, InputContentBlock, InputMessage,
    MessageRequest, MessageResponse, OutputContentBlock, PromptCache, PromptCacheRecord,
    StreamEvent as ApiStreamEvent, ToolChoice, ToolDefinition, ToolResultContentBlock,
 };
 use reqwest::blocking::Client;
 use runtime::{
-    edit_file, execute_bash, glob_search, grep_search, read_file, write_file, BashCommandInput,
+    edit_file, execute_bash, glob_search, grep_search, load_system_prompt, read_file, write_file,
-    GrepSearchInput, PermissionMode,
+    ApiClient, ApiRequest, AssistantEvent, BashCommandInput, ContentBlock, ConversationMessage,
    ConversationRuntime, GrepSearchInput, MessageRole, PermissionMode, PermissionPolicy,
    PromptCacheEvent, RuntimeError, Session, TokenUsage, ToolError, ToolExecutor,
 };
 use serde::{Deserialize, Serialize};
 use serde_json::{json, Value};
@@ -702,7 +709,7 @@ struct SkillOutput {
    prompt: String,
 }
-#[derive(Debug, Serialize, Deserialize)]
+#[derive(Debug, Clone, Serialize, Deserialize)]
 struct AgentOutput {
    #[serde(rename = "agentId")]
    agent_id: String,
@@ -718,6 +725,20 @@ struct AgentOutput {
    manifest_file: String,
    #[serde(rename = "createdAt")]
    created_at: String,
    #[serde(rename = "startedAt", skip_serializing_if = "Option::is_none")]
    started_at: Option<String>,
    #[serde(rename = "completedAt", skip_serializing_if = "Option::is_none")]
    completed_at: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    error: Option<String>,
 }
 #[derive(Debug, Clone)]
 struct AgentJob {
    manifest: AgentOutput,
    prompt: String,
    system_prompt: Vec<String>,
    allowed_tools: BTreeSet<String>,
 }
 #[derive(Debug, Serialize)]
@@ -1315,7 +1336,18 @@ fn resolve_skill_path(skill: &str) -> Result<std::path::PathBuf, String> {
    Err(format!("unknown skill: {requested}"))
 }
 const DEFAULT_AGENT_MODEL: &str = "claude-opus-4-6";
 const DEFAULT_AGENT_SYSTEM_DATE: &str = "2026-03-31";
 const DEFAULT_AGENT_MAX_ITERATIONS: usize = 32;
 fn execute_agent(input: AgentInput) -> Result<AgentOutput, String> {
    execute_agent_with_spawn(input, spawn_agent_job)
 }
 fn execute_agent_with_spawn<F>(input: AgentInput, spawn_fn: F) -> Result<AgentOutput, String>
 where
    F: FnOnce(AgentJob) -> Result<(), String>,
 {
    if input.description.trim().is_empty() {
        return Err(String::from("description must not be empty"));
    }
@@ -1329,6 +1361,7 @@ fn execute_agent(input: AgentInput) -> Result<AgentOutput, String> {
    let output_file = output_dir.join(format!("{agent_id}.md"));
    let manifest_file = output_dir.join(format!("{agent_id}.json"));
    let normalized_subagent_type = normalize_subagent_type(input.subagent_type.as_deref());
    let model = resolve_agent_model(input.model.as_deref());
    let agent_name = input
        .name
        .as_deref()
@@ -1336,6 +1369,8 @@ fn execute_agent(input: AgentInput) -> Result<AgentOutput, String> {
        .filter(|name| !name.is_empty())
        .unwrap_or_else(|| slugify_agent_name(&input.description));
    let created_at = iso8601_now();
    let system_prompt = build_agent_system_prompt(&normalized_subagent_type)?;
    let allowed_tools = allowed_tools_for_subagent(&normalized_subagent_type);
    let output_contents = format!(
        "# Agent Task
@@ -1359,21 +1394,545 @@ fn execute_agent(input: AgentInput) -> Result<AgentOutput, String> {
        name: agent_name,
        description: input.description,
        subagent_type: Some(normalized_subagent_type),
-        model: input.model,
+        model: Some(model),
-        status: String::from("queued"),
+        status: String::from("running"),
        output_file: output_file.display().to_string(),
        manifest_file: manifest_file.display().to_string(),
-        created_at,
+        created_at: created_at.clone(),
        started_at: Some(created_at),
        completed_at: None,
        error: None,
    };
-    std::fs::write(
-        &manifest_file,
-        serde_json::to_string_pretty(&manifest).map_err(|error| error.to_string())?,
-    )
-    .map_err(|error| error.to_string())?;
+    write_agent_manifest(&manifest)?;
+
+    let manifest_for_spawn = manifest.clone();
+    let job = AgentJob {
+        manifest: manifest_for_spawn,
        prompt: input.prompt,
        system_prompt,
        allowed_tools,
    };
    if let Err(error) = spawn_fn(job) {
        let error = format!("failed to spawn sub-agent: {error}");
        persist_agent_terminal_state(&manifest, "failed", None, Some(error.clone()))?;
        return Err(error);
    }
    Ok(manifest)
 }
 fn spawn_agent_job(job: AgentJob) -> Result<(), String> {
    let thread_name = format!("clawd-agent-{}", job.manifest.agent_id);
    std::thread::Builder::new()
        .name(thread_name)
        .spawn(move || {
            let result =
                std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| run_agent_job(&job)));
            match result {
                Ok(Ok(())) => {}
                Ok(Err(error)) => {
                    let _ =
                        persist_agent_terminal_state(&job.manifest, "failed", None, Some(error));
                }
                Err(_) => {
                    let _ = persist_agent_terminal_state(
                        &job.manifest,
                        "failed",
                        None,
                        Some(String::from("sub-agent thread panicked")),
                    );
                }
            }
        })
        .map(|_| ())
        .map_err(|error| error.to_string())
 }
 fn run_agent_job(job: &AgentJob) -> Result<(), String> {
    let mut runtime = build_agent_runtime(job)?.with_max_iterations(DEFAULT_AGENT_MAX_ITERATIONS);
    let summary = runtime
        .run_turn(job.prompt.clone(), None)
        .map_err(|error| error.to_string())?;
    let final_text = final_assistant_text(&summary);
    persist_agent_terminal_state(&job.manifest, "completed", Some(final_text.as_str()), None)
 }
 fn build_agent_runtime(
    job: &AgentJob,
 ) -> Result<ConversationRuntime<AnthropicRuntimeClient, SubagentToolExecutor>, String> {
    let model = job
        .manifest
        .model
        .clone()
        .unwrap_or_else(|| DEFAULT_AGENT_MODEL.to_string());
    let allowed_tools = job.allowed_tools.clone();
    let api_client =
        AnthropicRuntimeClient::new(model, allowed_tools.clone(), job.manifest.agent_id.clone())?;
    let tool_executor = SubagentToolExecutor::new(allowed_tools);
    Ok(ConversationRuntime::new(
        Session::new(),
        api_client,
        tool_executor,
        agent_permission_policy(),
        job.system_prompt.clone(),
    ))
 }
 fn build_agent_system_prompt(subagent_type: &str) -> Result<Vec<String>, String> {
    let cwd = std::env::current_dir().map_err(|error| error.to_string())?;
    let mut prompt = load_system_prompt(
        cwd,
        DEFAULT_AGENT_SYSTEM_DATE.to_string(),
        std::env::consts::OS,
        "unknown",
    )
    .map_err(|error| error.to_string())?;
    prompt.push(format!(
        "You are a background sub-agent of type `{subagent_type}`. Work only on the delegated task, use only the tools available to you, do not ask the user questions, and finish with a concise result."
    ));
    Ok(prompt)
 }
 fn resolve_agent_model(model: Option<&str>) -> String {
    model
        .map(str::trim)
        .filter(|model| !model.is_empty())
        .unwrap_or(DEFAULT_AGENT_MODEL)
        .to_string()
 }
 fn allowed_tools_for_subagent(subagent_type: &str) -> BTreeSet<String> {
    let tools = match subagent_type {
        "Explore" => vec![
            "read_file",
            "glob_search",
            "grep_search",
            "WebFetch",
            "WebSearch",
            "ToolSearch",
            "Skill",
            "StructuredOutput",
        ],
        "Plan" => vec![
            "read_file",
            "glob_search",
            "grep_search",
            "WebFetch",
            "WebSearch",
            "ToolSearch",
            "Skill",
            "TodoWrite",
            "StructuredOutput",
            "SendUserMessage",
        ],
        "Verification" => vec![
            "bash",
            "read_file",
            "glob_search",
            "grep_search",
            "WebFetch",
            "WebSearch",
            "ToolSearch",
            "TodoWrite",
            "StructuredOutput",
            "SendUserMessage",
            "PowerShell",
        ],
        "claude-code-guide" => vec![
            "read_file",
            "glob_search",
            "grep_search",
            "WebFetch",
            "WebSearch",
            "ToolSearch",
            "Skill",
            "StructuredOutput",
            "SendUserMessage",
        ],
        "statusline-setup" => vec![
            "bash",
            "read_file",
            "write_file",
            "edit_file",
            "glob_search",
            "grep_search",
            "ToolSearch",
        ],
        _ => vec![
            "bash",
            "read_file",
            "write_file",
            "edit_file",
            "glob_search",
            "grep_search",
            "WebFetch",
            "WebSearch",
            "TodoWrite",
            "Skill",
            "ToolSearch",
            "NotebookEdit",
            "Sleep",
            "SendUserMessage",
            "Config",
            "StructuredOutput",
            "REPL",
            "PowerShell",
        ],
    };
    tools.into_iter().map(str::to_string).collect()
 }
 fn agent_permission_policy() -> PermissionPolicy {
    mvp_tool_specs().into_iter().fold(
        PermissionPolicy::new(PermissionMode::DangerFullAccess),
        |policy, spec| policy.with_tool_requirement(spec.name, spec.required_permission),
    )
 }
 fn write_agent_manifest(manifest: &AgentOutput) -> Result<(), String> {
    std::fs::write(
        &manifest.manifest_file,
        serde_json::to_string_pretty(manifest).map_err(|error| error.to_string())?,
    )
    .map_err(|error| error.to_string())
 }
 fn persist_agent_terminal_state(
    manifest: &AgentOutput,
    status: &str,
    result: Option<&str>,
    error: Option<String>,
 ) -> Result<(), String> {
    append_agent_output(
        &manifest.output_file,
        &format_agent_terminal_output(status, result, error.as_deref()),
    )?;
    let mut next_manifest = manifest.clone();
    next_manifest.status = status.to_string();
    next_manifest.completed_at = Some(iso8601_now());
    next_manifest.error = error;
    write_agent_manifest(&next_manifest)
 }
 fn append_agent_output(path: &str, suffix: &str) -> Result<(), String> {
    use std::io::Write as _;
    let mut file = std::fs::OpenOptions::new()
        .append(true)
        .open(path)
        .map_err(|error| error.to_string())?;
    file.write_all(suffix.as_bytes())
        .map_err(|error| error.to_string())
 }
 fn format_agent_terminal_output(status: &str, result: Option<&str>, error: Option<&str>) -> String {
    let mut sections = vec![format!("\n## Result\n\n- status: {status}\n")];
    if let Some(result) = result.filter(|value| !value.trim().is_empty()) {
        sections.push(format!("\n### Final response\n\n{}\n", result.trim()));
    }
    if let Some(error) = error.filter(|value| !value.trim().is_empty()) {
        sections.push(format!("\n### Error\n\n{}\n", error.trim()));
    }
    sections.join("")
 }
 struct AnthropicRuntimeClient {
    runtime: tokio::runtime::Runtime,
    client: AnthropicClient,
    model: String,
    allowed_tools: BTreeSet<String>,
 }
 impl AnthropicRuntimeClient {
    fn new(
        model: String,
        allowed_tools: BTreeSet<String>,
        session_id: impl Into<String>,
    ) -> Result<Self, String> {
        let client = AnthropicClient::from_env()
            .map_err(|error| error.to_string())?
            .with_base_url(read_base_url())
            .with_prompt_cache(PromptCache::new(session_id));
        Ok(Self {
            runtime: tokio::runtime::Runtime::new().map_err(|error| error.to_string())?,
            client,
            model,
            allowed_tools,
        })
    }
 }
 impl ApiClient for AnthropicRuntimeClient {
    #[allow(clippy::too_many_lines)]
    fn stream(&mut self, request: ApiRequest) -> Result<Vec<AssistantEvent>, RuntimeError> {
        let tools = tool_specs_for_allowed_tools(Some(&self.allowed_tools))
            .into_iter()
            .map(|spec| ToolDefinition {
                name: spec.name.to_string(),
                description: Some(spec.description.to_string()),
                input_schema: spec.input_schema,
            })
            .collect::<Vec<_>>();
        let message_request = MessageRequest {
            model: self.model.clone(),
            max_tokens: 32_000,
            messages: convert_messages(&request.messages),
            system: (!request.system_prompt.is_empty()).then(|| request.system_prompt.join("\n\n")),
            tools: (!tools.is_empty()).then_some(tools),
            tool_choice: (!self.allowed_tools.is_empty()).then_some(ToolChoice::Auto),
            stream: true,
        };
        self.runtime.block_on(async {
            let mut stream = self
                .client
                .stream_message(&message_request)
                .await
                .map_err(|error| RuntimeError::new(error.to_string()))?;
            let mut events = Vec::new();
            let mut pending_tool: Option<(String, String, String)> = None;
            let mut saw_stop = false;
            while let Some(event) = stream
                .next_event()
                .await
                .map_err(|error| RuntimeError::new(error.to_string()))?
            {
                match event {
                    ApiStreamEvent::MessageStart(start) => {
                        for block in start.message.content {
                            push_output_block(block, &mut events, &mut pending_tool, true);
                        }
                    }
                    ApiStreamEvent::ContentBlockStart(start) => {
                        push_output_block(
                            start.content_block,
                            &mut events,
                            &mut pending_tool,
                            true,
                        );
                    }
                    ApiStreamEvent::ContentBlockDelta(delta) => match delta.delta {
                        ContentBlockDelta::TextDelta { text } => {
                            if !text.is_empty() {
                                events.push(AssistantEvent::TextDelta(text));
                            }
                        }
                        ContentBlockDelta::InputJsonDelta { partial_json } => {
                            if let Some((_, _, input)) = &mut pending_tool {
                                input.push_str(&partial_json);
                            }
                        }
                    },
                    ApiStreamEvent::ContentBlockStop(_) => {
                        if let Some((id, name, input)) = pending_tool.take() {
                            events.push(AssistantEvent::ToolUse { id, name, input });
                        }
                    }
                    ApiStreamEvent::MessageDelta(delta) => {
                        events.push(AssistantEvent::Usage(TokenUsage {
                            input_tokens: delta.usage.input_tokens,
                            output_tokens: delta.usage.output_tokens,
                            cache_creation_input_tokens: delta.usage.cache_creation_input_tokens,
                            cache_read_input_tokens: delta.usage.cache_read_input_tokens,
                        }));
                    }
                    ApiStreamEvent::MessageStop(_) => {
                        saw_stop = true;
                        events.push(AssistantEvent::MessageStop);
                    }
                }
            }
            push_prompt_cache_record(&self.client, &mut events);
            if !saw_stop
                && events.iter().any(|event| {
                    matches!(event, AssistantEvent::TextDelta(text) if !text.is_empty())
                        || matches!(event, AssistantEvent::ToolUse { .. })
                })
            {
                events.push(AssistantEvent::MessageStop);
            }
            if events
                .iter()
                .any(|event| matches!(event, AssistantEvent::MessageStop))
            {
                return Ok(events);
            }
            let response = self
                .client
                .send_message(&MessageRequest {
                    stream: false,
                    ..message_request.clone()
                })
                .await
                .map_err(|error| RuntimeError::new(error.to_string()))?;
            let mut events = response_to_events(response);
            push_prompt_cache_record(&self.client, &mut events);
            Ok(events)
        })
    }
 }
 struct SubagentToolExecutor {
    allowed_tools: BTreeSet<String>,
 }
 impl SubagentToolExecutor {
    fn new(allowed_tools: BTreeSet<String>) -> Self {
        Self { allowed_tools }
    }
 }
 impl ToolExecutor for SubagentToolExecutor {
    fn execute(&mut self, tool_name: &str, input: &str) -> Result<String, ToolError> {
        if !self.allowed_tools.contains(tool_name) {
            return Err(ToolError::new(format!(
                "tool `{tool_name}` is not enabled for this sub-agent"
            )));
        }
        let value = serde_json::from_str(input)
            .map_err(|error| ToolError::new(format!("invalid tool input JSON: {error}")))?;
        execute_tool(tool_name, &value).map_err(ToolError::new)
    }
 }
 fn tool_specs_for_allowed_tools(allowed_tools: Option<&BTreeSet<String>>) -> Vec<ToolSpec> {
    mvp_tool_specs()
        .into_iter()
        .filter(|spec| allowed_tools.is_none_or(|allowed| allowed.contains(spec.name)))
        .collect()
 }
 fn convert_messages(messages: &[ConversationMessage]) -> Vec<InputMessage> {
    messages
        .iter()
        .filter_map(|message| {
            let role = match message.role {
                MessageRole::System | MessageRole::User | MessageRole::Tool => "user",
                MessageRole::Assistant => "assistant",
            };
            let content = message
                .blocks
                .iter()
                .map(|block| match block {
                    ContentBlock::Text { text } => InputContentBlock::Text { text: text.clone() },
                    ContentBlock::ToolUse { id, name, input } => InputContentBlock::ToolUse {
                        id: id.clone(),
                        name: name.clone(),
                        input: serde_json::from_str(input)
                            .unwrap_or_else(|_| serde_json::json!({ "raw": input })),
                    },
                    ContentBlock::ToolResult {
                        tool_use_id,
                        output,
                        is_error,
                        ..
                    } => InputContentBlock::ToolResult {
                        tool_use_id: tool_use_id.clone(),
                        content: vec![ToolResultContentBlock::Text {
                            text: output.clone(),
                        }],
                        is_error: *is_error,
                    },
                })
                .collect::<Vec<_>>();
            (!content.is_empty()).then(|| InputMessage {
                role: role.to_string(),
                content,
            })
        })
        .collect()
 }
 fn push_output_block(
    block: OutputContentBlock,
    events: &mut Vec<AssistantEvent>,
    pending_tool: &mut Option<(String, String, String)>,
    streaming_tool_input: bool,
 ) {
    match block {
        OutputContentBlock::Text { text } => {
            if !text.is_empty() {
                events.push(AssistantEvent::TextDelta(text));
            }
        }
        OutputContentBlock::ToolUse { id, name, input } => {
            let initial_input = if streaming_tool_input
                && input.is_object()
                && input.as_object().is_some_and(serde_json::Map::is_empty)
            {
                String::new()
            } else {
                input.to_string()
            };
            *pending_tool = Some((id, name, initial_input));
        }
    }
 }
 fn response_to_events(response: MessageResponse) -> Vec<AssistantEvent> {
    let mut events = Vec::new();
    let mut pending_tool = None;
    for block in response.content {
        push_output_block(block, &mut events, &mut pending_tool, false);
        if let Some((id, name, input)) = pending_tool.take() {
            events.push(AssistantEvent::ToolUse { id, name, input });
        }
    }
    events.push(AssistantEvent::Usage(TokenUsage {
        input_tokens: response.usage.input_tokens,
        output_tokens: response.usage.output_tokens,
        cache_creation_input_tokens: response.usage.cache_creation_input_tokens,
        cache_read_input_tokens: response.usage.cache_read_input_tokens,
    }));
    events.push(AssistantEvent::MessageStop);
    events
 }
 fn push_prompt_cache_record(client: &AnthropicClient, events: &mut Vec<AssistantEvent>) {
    if let Some(event) = client
        .take_last_prompt_cache_record()
        .and_then(prompt_cache_record_to_runtime_event)
    {
        events.push(AssistantEvent::PromptCache(event));
    }
 }
 fn prompt_cache_record_to_runtime_event(record: PromptCacheRecord) -> Option<PromptCacheEvent> {
    let cache_break = record.cache_break?;
    Some(PromptCacheEvent {
        unexpected: cache_break.unexpected,
        reason: cache_break.reason,
        previous_cache_read_input_tokens: cache_break.previous_cache_read_input_tokens,
        current_cache_read_input_tokens: cache_break.current_cache_read_input_tokens,
        token_drop: cache_break.token_drop,
    })
 }
 fn final_assistant_text(summary: &runtime::TurnSummary) -> String {
    summary
        .assistant_messages
        .last()
        .map(|message| {
            message
                .blocks
                .iter()
                .filter_map(|block| match block {
                    ContentBlock::Text { text } => Some(text.as_str()),
                    _ => None,
                })
                .collect::<Vec<_>>()
                .join("")
        })
        .unwrap_or_default()
 }
 #[allow(clippy::needless_pass_by_value)]
 fn execute_tool_search(input: ToolSearchInput) -> ToolSearchOutput {
    let deferred = deferred_tool_specs();
@@ -2207,7 +2766,7 @@ fn execute_shell_command(
            persisted_output_path: None,
            persisted_output_size: None,
            sandbox_status: None,
-});
+        });
    }
    let mut process = std::process::Command::new(shell);
@@ -2276,7 +2835,7 @@ Command exceeded timeout of {timeout_ms} ms",
                    persisted_output_path: None,
                    persisted_output_size: None,
                    sandbox_status: None,
-});
+                });
            }
            std::thread::sleep(Duration::from_millis(10));
        }
@@ -2365,6 +2924,7 @@ fn parse_skill_description(contents: &str) -> Option<String> {
 #[cfg(test)]
 mod tests {
+    use std::collections::BTreeSet;
    use std::fs;
    use std::io::{Read, Write};
    use std::net::{SocketAddr, TcpListener};
@@ -2373,7 +2933,12 @@ mod tests {
    use std::thread;
    use std::time::Duration;
-    use super::{execute_tool, mvp_tool_specs};
+    use super::{
+        agent_permission_policy, allowed_tools_for_subagent, execute_agent_with_spawn,
+        execute_tool, final_assistant_text, mvp_tool_specs, persist_agent_terminal_state,
+        AgentInput, AgentJob, SubagentToolExecutor,
+    };
+    use runtime::{ApiRequest, AssistantEvent, ConversationRuntime, RuntimeError, Session};
    use serde_json::json;
    fn env_lock() -> &'static Mutex<()> {
@@ -2765,32 +3330,48 @@ mod tests {
            .unwrap_or_else(std::sync::PoisonError::into_inner);
        let dir = temp_path("agent-store");
        std::env::set_var("CLAWD_AGENT_STORE", &dir);
        let captured = Arc::new(Mutex::new(None::<AgentJob>));
        let captured_for_spawn = Arc::clone(&captured);
-        let result = execute_tool(
+        let manifest = execute_agent_with_spawn(
-            "Agent",
+            AgentInput {
-            &json!({
+                description: "Audit the branch".to_string(),
-                "description": "Audit the branch",
+                prompt: "Check tests and outstanding work.".to_string(),
-                "prompt": "Check tests and outstanding work.",
+                subagent_type: Some("Explore".to_string()),
-                "subagent_type": "Explore",
+                name: Some("ship-audit".to_string()),
-                "name": "ship-audit"
+                model: None,
-            }),
+            },
            move |job| {
                *captured_for_spawn
                    .lock()
                    .unwrap_or_else(std::sync::PoisonError::into_inner) = Some(job);
                Ok(())
            },
        )
        .expect("Agent should succeed");
        std::env::remove_var("CLAWD_AGENT_STORE");
-        let output: serde_json::Value = serde_json::from_str(&result).expect("valid json");
+        assert_eq!(manifest.name, "ship-audit");
-        assert_eq!(output["name"], "ship-audit");
+        assert_eq!(manifest.subagent_type.as_deref(), Some("Explore"));
-        assert_eq!(output["subagentType"], "Explore");
+        assert_eq!(manifest.status, "running");
-        assert_eq!(output["status"], "queued");
+        assert!(!manifest.created_at.is_empty());
-        assert!(output["createdAt"].as_str().is_some());
+        assert!(manifest.started_at.is_some());
-        let manifest_file = output["manifestFile"].as_str().expect("manifest file");
+        assert!(manifest.completed_at.is_none());
-        let output_file = output["outputFile"].as_str().expect("output file");
+        let contents = std::fs::read_to_string(&manifest.output_file).expect("agent file exists");
-        let contents = std::fs::read_to_string(output_file).expect("agent file exists");
        let manifest_contents =
-            std::fs::read_to_string(manifest_file).expect("manifest file exists");
+            std::fs::read_to_string(&manifest.manifest_file).expect("manifest file exists");
        assert!(contents.contains("Audit the branch"));
        assert!(contents.contains("Check tests and outstanding work."));
        assert!(manifest_contents.contains("\"subagentType\": \"Explore\""));
        assert!(manifest_contents.contains("\"status\": \"running\""));
        let captured_job = captured
            .lock()
            .unwrap_or_else(std::sync::PoisonError::into_inner)
            .clone()
            .expect("spawn job should be captured");
        assert_eq!(captured_job.prompt, "Check tests and outstanding work.");
        assert!(captured_job.allowed_tools.contains("read_file"));
        assert!(!captured_job.allowed_tools.contains("Agent"));
        let normalized = execute_tool(
            "Agent",
@@ -2819,6 +3400,195 @@ mod tests {
        let _ = std::fs::remove_dir_all(dir);
    }
    #[test]
    fn agent_fake_runner_can_persist_completion_and_failure() {
        let _guard = env_lock()
            .lock()
            .unwrap_or_else(std::sync::PoisonError::into_inner);
        let dir = temp_path("agent-runner");
        std::env::set_var("CLAWD_AGENT_STORE", &dir);
        let completed = execute_agent_with_spawn(
            AgentInput {
                description: "Complete the task".to_string(),
                prompt: "Do the work".to_string(),
                subagent_type: Some("Explore".to_string()),
                name: Some("complete-task".to_string()),
                model: Some("claude-sonnet-4-6".to_string()),
            },
            |job| {
                persist_agent_terminal_state(
                    &job.manifest,
                    "completed",
                    Some("Finished successfully"),
                    None,
                )
            },
        )
        .expect("completed agent should succeed");
        let completed_manifest = std::fs::read_to_string(&completed.manifest_file)
            .expect("completed manifest should exist");
        let completed_output =
            std::fs::read_to_string(&completed.output_file).expect("completed output should exist");
        assert!(completed_manifest.contains("\"status\": \"completed\""));
        assert!(completed_output.contains("Finished successfully"));
        let failed = execute_agent_with_spawn(
            AgentInput {
                description: "Fail the task".to_string(),
                prompt: "Do the failing work".to_string(),
                subagent_type: Some("Verification".to_string()),
                name: Some("fail-task".to_string()),
                model: None,
            },
            |job| {
                persist_agent_terminal_state(
                    &job.manifest,
                    "failed",
                    None,
                    Some(String::from("simulated failure")),
                )
            },
        )
        .expect("failed agent should still spawn");
        let failed_manifest =
            std::fs::read_to_string(&failed.manifest_file).expect("failed manifest should exist");
        let failed_output =
            std::fs::read_to_string(&failed.output_file).expect("failed output should exist");
        assert!(failed_manifest.contains("\"status\": \"failed\""));
        assert!(failed_manifest.contains("simulated failure"));
        assert!(failed_output.contains("simulated failure"));
        let spawn_error = execute_agent_with_spawn(
            AgentInput {
                description: "Spawn error task".to_string(),
                prompt: "Never starts".to_string(),
                subagent_type: None,
                name: Some("spawn-error".to_string()),
                model: None,
            },
            |_| Err(String::from("thread creation failed")),
        )
        .expect_err("spawn errors should surface");
        assert!(spawn_error.contains("failed to spawn sub-agent"));
        let spawn_error_manifest = std::fs::read_dir(&dir)
            .expect("agent dir should exist")
            .filter_map(Result::ok)
            .map(|entry| entry.path())
            .filter(|path| path.extension().and_then(|ext| ext.to_str()) == Some("json"))
            .find_map(|path| {
                let contents = std::fs::read_to_string(&path).ok()?;
                contents
                    .contains("\"name\": \"spawn-error\"")
                    .then_some(contents)
            })
            .expect("failed manifest should still be written");
        assert!(spawn_error_manifest.contains("\"status\": \"failed\""));
        assert!(spawn_error_manifest.contains("thread creation failed"));
        std::env::remove_var("CLAWD_AGENT_STORE");
        let _ = std::fs::remove_dir_all(dir);
    }
    #[test]
    fn agent_tool_subset_mapping_is_expected() {
        let general = allowed_tools_for_subagent("general-purpose");
        assert!(general.contains("bash"));
        assert!(general.contains("write_file"));
        assert!(!general.contains("Agent"));
        let explore = allowed_tools_for_subagent("Explore");
        assert!(explore.contains("read_file"));
        assert!(explore.contains("grep_search"));
        assert!(!explore.contains("bash"));
        let plan = allowed_tools_for_subagent("Plan");
        assert!(plan.contains("TodoWrite"));
        assert!(plan.contains("StructuredOutput"));
        assert!(!plan.contains("Agent"));
        let verification = allowed_tools_for_subagent("Verification");
        assert!(verification.contains("bash"));
        assert!(verification.contains("PowerShell"));
        assert!(!verification.contains("write_file"));
    }
    #[derive(Debug)]
    struct MockSubagentApiClient {
        calls: usize,
        input_path: String,
    }
    impl runtime::ApiClient for MockSubagentApiClient {
        fn stream(&mut self, request: ApiRequest) -> Result<Vec<AssistantEvent>, RuntimeError> {
            self.calls += 1;
            match self.calls {
                1 => {
                    assert_eq!(request.messages.len(), 1);
                    Ok(vec![
                        AssistantEvent::ToolUse {
                            id: "tool-1".to_string(),
                            name: "read_file".to_string(),
                            input: json!({ "path": self.input_path }).to_string(),
                        },
                        AssistantEvent::MessageStop,
                    ])
                }
                2 => {
                    assert!(request.messages.len() >= 3);
                    Ok(vec![
                        AssistantEvent::TextDelta("Scope: completed mock review".to_string()),
                        AssistantEvent::MessageStop,
                    ])
                }
                _ => panic!("unexpected mock stream call"),
            }
        }
    }
    #[test]
    fn subagent_runtime_executes_tool_loop_with_isolated_session() {
        let _guard = env_lock()
            .lock()
            .unwrap_or_else(std::sync::PoisonError::into_inner);
        let path = temp_path("subagent-input.txt");
        std::fs::write(&path, "hello from child").expect("write input file");
        let mut runtime = ConversationRuntime::new(
            Session::new(),
            MockSubagentApiClient {
                calls: 0,
                input_path: path.display().to_string(),
            },
            SubagentToolExecutor::new(BTreeSet::from([String::from("read_file")])),
            agent_permission_policy(),
            vec![String::from("system prompt")],
        );
        let summary = runtime
            .run_turn("Inspect the delegated file", None)
            .expect("subagent loop should succeed");
        assert_eq!(
            final_assistant_text(&summary),
            "Scope: completed mock review"
        );
        assert!(runtime
            .session()
            .messages
            .iter()
            .flat_map(|message| message.blocks.iter())
            .any(|block| matches!(
                block,
                runtime::ContentBlock::ToolResult { output, .. }
                    if output.contains("hello from child")
            )));
        let _ = std::fs::remove_file(path);
    }
    #[test]
    fn agent_rejects_blank_required_fields() {
        let missing_description = execute_tool(
Author	SHA1	Message	Date
Yeachan-Heo	799c92eada	feat: cache-tracking progress	2026-04-01 06:25:26 +00:00
Yeachan-Heo	c9d214c8d1	feat: cache-tracking progress	2026-04-01 06:15:13 +00:00
Yeachan-Heo	26344c578b	wip: cache-tracking progress	2026-04-01 04:40:17 +00:00
Yeachan-Heo	0cf2204d43	wip: cache-tracking progress	2026-04-01 04:30:24 +00:00
Yeachan-Heo	ac6c5d00a8	Enable Claude-compatible tool hooks in the Rust runtime This threads typed hook settings through runtime config, adds a shell-based hook runner, and executes PreToolUse/PostToolUse around each tool call in the conversation loop. The CLI now rebuilds runtimes with settings-derived hook configuration so user-defined Claude hook commands actually run before and after tools. Constraint: Hook behavior needed to match Claude-style settings.json hooks without broad plugin/MCP parity work in this change Rejected: Delay hook loading to the tool executor layer \| would miss denied tool calls and duplicate runtime policy plumbing Confidence: medium Scope-risk: moderate Reversibility: clean Directive: Keep hook execution in the runtime loop so permission decisions and tool results remain wrapped by the same conversation semantics Tested: cargo test; cargo build --release Not-tested: Real user hook scripts outside the test harness; broader plugin/skills parity	2026-04-01 03:35:25 +00:00
Yeachan-Heo	a94ef61b01	feat: -p flag compat, --print flag, OAuth defaults, UI rendering merge	2026-04-01 03:22:34 +00:00
Yeachan-Heo	a9ac7e5bb8	feat: default OAuth config for claude.com, merge UI polish rendering	2026-04-01 03:20:26 +00:00
Yeachan-Heo	0175ee0a90	Merge remote-tracking branch 'origin/rcc/ui-polish' into dev/rust	2026-04-01 03:17:16 +00:00
Yeachan-Heo	1bd0eef368	Merge remote-tracking branch 'origin/rcc/subagent' into dev/rust	2026-04-01 03:12:25 +00:00
Yeachan-Heo	ba220d210e	Enable real Agent tool delegation in the Rust CLI The Rust Agent tool only persisted queued metadata, so delegated work never actually ran. This change wires Agent into a detached background conversation path with isolated runtime, API client, session state, restricted tool subsets, and file-backed lifecycle/result updates. Constraint: Keep the tool entrypoint in the tools crate and avoid copying the upstream TypeScript implementation Rejected: Spawn an external claw process \| less aligned with the requested in-process runtime/client design Rejected: Leave execution in the CLI crate only \| would keep tools::Agent as a metadata-only stub Confidence: medium Scope-risk: moderate Reversibility: clean Directive: Tool subset mappings are curated guardrails; revisit them before enabling recursive Agent access or richer agent definitions Tested: cargo build --release --manifest-path rust/Cargo.toml Tested: cargo test --manifest-path rust/Cargo.toml Not-tested: Live end-to-end background sub-agent run against Anthropic API credentials	2026-04-01 03:10:20 +00:00
Yeachan-Heo	04b1f1e85d	docs: rewrite rust/ README with full feature matrix and usage guide	2026-04-01 02:59:05 +00:00
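The `ba220d210e` commit message calls the tool subset mappings "curated guardrails", and the diff's `SubagentToolExecutor` rejects any tool name outside its `BTreeSet` before it parses the input. The gating pattern can be sketched in isolation as below. This is a minimal illustration, not the crate's code: `AllowListExecutor`, `run_tool`, and the plain `String` error type are hypothetical stand-ins for the real `ToolExecutor`, `execute_tool`, and `ToolError`.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the sub-agent executor: it only holds the allow-list.
struct AllowListExecutor {
    allowed_tools: BTreeSet<String>,
}

impl AllowListExecutor {
    /// Builds the allow-list from a slice of tool names.
    fn new(tools: &[&str]) -> Self {
        Self {
            allowed_tools: tools.iter().map(|tool| tool.to_string()).collect(),
        }
    }

    /// Rejects tools outside the allow-list before doing any other work.
    fn execute(&self, tool_name: &str, input: &str) -> Result<String, String> {
        if !self.allowed_tools.contains(tool_name) {
            return Err(format!(
                "tool `{tool_name}` is not enabled for this sub-agent"
            ));
        }
        run_tool(tool_name, input)
    }
}

/// Hypothetical dispatcher; a real one would parse `input` as JSON and call the tool.
fn run_tool(tool_name: &str, input: &str) -> Result<String, String> {
    Ok(format!("{tool_name} ran with input {input}"))
}

fn main() {
    // An "Explore"-style subset in the spirit of the diff's tests: read-only tools, no shell.
    let executor = AllowListExecutor::new(&["read_file", "grep_search"]);
    assert!(executor.execute("read_file", "{}").is_ok());
    assert!(executor.execute("bash", "{}").is_err());
}
```

Checking the allow-list first means a disallowed call fails with the same message whether or not its input is well-formed, which matches the ordering in the diff's `execute` method.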