Enable local image prompts without breaking text-only CLI flows

The Rust CLI now recognizes explicit local image references in prompt text,
encodes supported image files as base64, and serializes mixed text/image
content blocks for the API. The request conversion path was kept narrow so
existing runtime/session structures remain stable while prompt mode and user
text conversion gain multimodal support.

Constraint: Must support PNG, JPG/JPEG, GIF, and WebP without adding broad runtime abstractions
Constraint: Existing text-only prompt behavior and API tool flows must keep working unchanged
Rejected: Add only explicit --image CLI flags | does not satisfy auto-detect image refs in prompt text
Rejected: Persist native image blocks in runtime session model | broader refactor than needed for prompt support
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep image parsing scoped to outbound user prompt adaptation unless session persistence truly needs multimodal history
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Live remote multimodal request against Anthropic API
This commit is contained in:
Yeachan-Heo
2026-04-01 00:59:16 +00:00
parent d6341d54c1
commit 5b046836b9
5 changed files with 377 additions and 42 deletions

View File

@@ -11,7 +11,7 @@ pub use error::ApiError;
pub use sse::{parse_frame, SseParser};
pub use types::{
ContentBlockDelta, ContentBlockDeltaEvent, ContentBlockStartEvent, ContentBlockStopEvent,
InputContentBlock, InputMessage, MessageDelta, MessageDeltaEvent, MessageRequest,
ImageSource, InputContentBlock, InputMessage, MessageDelta, MessageDeltaEvent, MessageRequest,
MessageResponse, MessageStartEvent, MessageStopEvent, OutputContentBlock, StreamEvent,
ToolChoice, ToolDefinition, ToolResultContentBlock, Usage,
};