Expose structured thinking without polluting normal assistant output

Extended thinking needed to travel end-to-end through the API, runtime, and CLI so the client can request a thinking budget, preserve streamed reasoning blocks, and present them in a collapsed text-first form. The implementation keeps thinking strictly opt-in, adds a session-local toggle, and reuses the existing flag/slash-command/reporting surfaces instead of introducing a new UI layer. Constraint: Existing non-thinking text/tool flows had to remain backward compatible by default Constraint: Terminal UX needed a lightweight collapsed representation rather than an interactive TUI widget Rejected: Heuristic CLI-only parsing of reasoning text | brittle against structured stream payloads Rejected: Expanded raw thinking output by default | too noisy for normal assistant responses Confidence: medium Scope-risk: moderate Reversibility: clean Directive: Keep thinking blocks structurally separate from answer text unless the upstream API contract changes Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test -q Not-tested: Live upstream thinking payloads against the production API contract
2026-04-01 01:08:18 +00:00
parent d6341d54c1
commit c14196c730
9 changed files with 353 additions and 31 deletions
--- a/rust/crates/api/src/lib.rs
+++ b/rust/crates/api/src/lib.rs
@@ -13,5 +13,5 @@ pub use types::{
    ContentBlockDelta, ContentBlockDeltaEvent, ContentBlockStartEvent, ContentBlockStopEvent,
    InputContentBlock, InputMessage, MessageDelta, MessageDeltaEvent, MessageRequest,
    MessageResponse, MessageStartEvent, MessageStopEvent, OutputContentBlock, StreamEvent,
-    ToolChoice, ToolDefinition, ToolResultContentBlock, Usage,
+    ThinkingConfig, ToolChoice, ToolDefinition, ToolResultContentBlock, Usage,
 };